diff --git a/README.md b/README.md
index 110697c..2251e1d 100644
--- a/README.md
+++ b/README.md
@@ -3,11 +3,102 @@ CUDA Path Tracer
 
 **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3**
 
-* (TODO) YOUR NAME HERE
-* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
+* Name: Zhan Xiong Chin
+* Tested on: Windows 7 Professional, Intel(R) Xeon(R) CPU E5-1630 v4 @ 3.70 GHz, GTX 1070 8192MB (SIG Lab)
 
-### (TODO: Your README)
+## Overview
 
-*DO NOT* leave the README to the last minute! It is a crucial part of the
-project, and we will not be able to grade you without a good README.
+![](img/cornell_example.5000samp.png)
 
+5000 iterations, ~1h rendering time
+
+GPU-based path tracer. Renders by casting rays from the camera and bouncing them around the scene until they hit a light. This path-tracer implements the core features:
+
+* Path compaction of out-of-scene rays
+* First bounce caching
+* Compaction of same-material rays
+* Diffuse and perfect specular (mirrored) materials
+
+Furthermore, this path-tracer implements the following additional features:
+
+* Refractive materials
+	* Includes Fresnel effects using Schlick's approximation
+* Imperfect specular materials
+* Stochastic sampling for anti-aliasing
+* Model loading from .OBJ files
+	* Also includes automatic partitioning of model into sub-models
+
+## Build Instructions
+
+[See here](https://github.com/CIS565-Fall-2016/Project0-CUDA-Getting-Started/blob/master/INSTRUCTION.md)
+
+## Diffuse and specular materials
+
+Each time a ray hits an object, it has a 50-50 chance of either doing a diffuse bounce or a specular bounce. A diffuse bounce is chosen from a hemisphere normal to the plane of intersection using a cosine-weighted function. On the other hand, a specular bounce uses the formulas from [*GPU Gems 3*, Chapter 20](http://http.developer.nvidia.com/GPUGems3/gpugems3_ch20.html) to generate a random specular ray. With increasing specular exponent, the potential range of the specular bounce gets narrower, forming a sharper reflective image. 
+
+![](img/specular_and_diffuse.5000samp.png)
+
+From left to right: diffuse, specular exponent 25, specular exponent 100, perfectly reflective
+
+It may be possible to achieve some additional speed-up in convergence of the image by tweaking the bounce probability. In particular, a material with a higher specular exponent may be able to render correctly with a smaller proportion of specular bounces, since the variance in angle of the reflected ray is smaller.
+
+## Refraction
+
+Refractive objects do not make use of diffuse bounces. Instead, the exact reflection or refraction of the ray is calculated. [Schlick's approximation](https://en.wikipedia.org/wiki/Schlick%27s_approximation) is used to calculate Fresnel effects (i.e. partial reflection).
+
+![](img/refractive.5000samp.png)
+
+Refractive indices from left to right: 1.2, 1.6, 2.4.
+
+Compared to a CPU, refraction may slightly impact performance on the GPU, as having a refractive material causes additional branching in the shader kernel. In principle, it might be possible to optimize this by grouping rays of the same material together (see below), though in practice this is unlikely to help.
+
+## Anti-aliasing
+
+Stochastic sampling is used for anti-aliasing. In other words, each pixel is split up into some number of subpixels, and the color is taken as an average of random rays from each subpixel.
+
+Below, the image on the left is anti-aliased (4 sub-pixels per pixel, 5000 iterations), whereas the image on the right is not anti-aliased (20000 iterations). The effect of anti-aliasing can be clearly seen on the borders of the sphere as well as the base of the wall.
+
+![](img/crop_cornell.antialiased.5000samp.png) ![](img/crop_cornell_aliased.20000samp.png)
+
+If the number of iterations is scaled correspondingly (e.g. as above), anti-aliasing can result in slightly higher performance, as fewer kernel launches are needed and more ray intersections can be calculated in parallel. From experimentation, the anti-aliased image renders approximately 5-10% faster than the non-anti-aliased one.
+
+However, this does not work together with first-bounce caching, since the first bounce is no longer deterministic. Thus, the overall impact on performance is likely to end up the same. If non-stochastic anti-aliasing were used instead (e.g. using the same ray for each sub-pixel), we could still preserve the benefits of both. This is also a feature that scales well on the GPU, since increasing the number of rays parallelizes better than on a CPU.
+
+
+## Model loading and rendering
+
+[tinyObjLoader](http://syoyo.github.io/tinyobjloader/) was used for model loading.
+
+For testing, the Utah teapot model was used (~22000 triangles). Due to the large number of triangle-ray intersections required, performance slowed down significantly. Two methods were used to improve performance. 
+
+![](img/teapot.3000samp.png)
+
+Firstly, box-ray intersections are performed by first intersecting each ray with each model's bounding box in parallel, then compacting the pairs which collide. Then, for each box-ray intersection, a kernel loops through all triangles in the model and calculates the nearest intersection. This significantly reduces the number of ray-model intersections that need to be computed.
+
+Secondly, the model loader also automatically partitions a model into an n^3 grid of submodels by bounding box, to reduce the number of collisions needed. A 2-by-2-by-2 grid seems to give the best speed-up for the teapot model used for testing; it best balances the increase in bounding-box intersection time against the speed-up in triangle intersection time.
+
+![](img/graph_2x2x2_bb_cull.png)
+
+![](img/graph_bb_grid.png)
+
+The above graph is a breakdown of time spent in each kernel for 1x1x1, 2x2x2, and 3x3x3 grids, for the scene at the very beginning of the readme. In particular, note that the most time is spent on triangle intersection and bounding box intersection. Increasing the grid side length from 1 to 2 improves performance by decreasing the time spent on triangle intersections, but it would appear that after this point the benefits no longer accrue. On the other hand, the time taken for bounding box intersection increases in a cubic fashion, as expected from the implementation. 
+
+The GPU version is much faster than the CPU version, since ray-triangle or ray-box intersections are embarrassingly parallel. However, one speed-up that would likely improve performance significantly is the construction of an octree over the model or scene (rather than just a single-layered n^3 grid). Alternatively, it could be possible to continue using the single-layered grid, but perform iterative traversal rather than intersecting all n^3 grid cells against a ray. It is possible to calculate both the initial grid cell the ray hits, as well as the next grid cell it would go to, iteratively without significant branching. 
+
+While both of the above speed-ups are much harder to implement on the GPU as compared to the CPU, they would allow the time needed for bounding box intersection to scale linearly rather than cubically, while decreasing triangle intersection time for complex models. Since most of the time is spent in these two kernels, performance would improve significantly if either one is successfully implemented.
+
+## Path compaction
+
+Path compaction achieves a significant speed-up, especially when there are a large number of polygons in a scene. As can be seen from the graph in the previous section, the number of rays drops significantly as the depth increases. This has a major impact on performance for open scenes (i.e. scenes where rays can escape), but would not be seen for closed scenes (e.g. a closed room).
+
+## First-bounce caching
+
+For the non-anti-aliased implementation, the first bounce of the ray can be cached, as it is deterministic. This improves performance significantly in complex scenes.
+
+![](img/graph_firstbounce_caching.png)
+
+For example, the teapot model above was rendered with and without first bounce caching over a number of depths. Caching consistently improved the performance of the rendering by 10%. This is because the initial bounce calculation is the most expensive, having the largest number of rays to intersect. 
+
+## Grouping by material
+
+It is also possible to group rays that intersect the same material to make them contiguous in memory before shading. However, from the kernel breakdown above, the main bottleneck for the renderer is not the shading, but rather the intersection computation. Thus, it is unlikely that this would have a positive impact. Implementing this in the above scene led to a 20% slowdown in rendering, supporting this hypothesis.
\ No newline at end of file
diff --git a/img/cornell.2016-10-04_02-42-08z.5000samp.png b/img/cornell.2016-10-04_02-42-08z.5000samp.png
new file mode 100644
index 0000000..f55c048
Binary files /dev/null and b/img/cornell.2016-10-04_02-42-08z.5000samp.png differ
diff --git a/img/cornell.2016-10-04_02-51-15z.5000samp.png b/img/cornell.2016-10-04_02-51-15z.5000samp.png
new file mode 100644
index 0000000..25645ff
Binary files /dev/null and b/img/cornell.2016-10-04_02-51-15z.5000samp.png differ
diff --git a/img/cornell.antialiased.5000samp.png b/img/cornell.antialiased.5000samp.png
new file mode 100644
index 0000000..c0b1280
Binary files /dev/null and b/img/cornell.antialiased.5000samp.png differ
diff --git a/img/cornell_aliased.20000samp.png b/img/cornell_aliased.20000samp.png
new file mode 100644
index 0000000..063ad59
Binary files /dev/null and b/img/cornell_aliased.20000samp.png differ
diff --git a/img/cornell_example.5000samp.png b/img/cornell_example.5000samp.png
new file mode 100644
index 0000000..cf82cf7
Binary files /dev/null and b/img/cornell_example.5000samp.png differ
diff --git a/img/cornell_teapot.5000samp.png b/img/cornell_teapot.5000samp.png
new file mode 100644
index 0000000..84485ee
Binary files /dev/null and b/img/cornell_teapot.5000samp.png differ
diff --git a/img/crop_cornell.antialiased.5000samp.png b/img/crop_cornell.antialiased.5000samp.png
new file mode 100644
index 0000000..a4ee3f1
Binary files /dev/null and b/img/crop_cornell.antialiased.5000samp.png differ
diff --git a/img/crop_cornell_aliased.20000samp.png b/img/crop_cornell_aliased.20000samp.png
new file mode 100644
index 0000000..cbe8ef2
Binary files /dev/null and b/img/crop_cornell_aliased.20000samp.png differ
diff --git a/img/graph_2x2x2_bb_cull.png b/img/graph_2x2x2_bb_cull.png
new file mode 100644
index 0000000..049fd43
Binary files /dev/null and b/img/graph_2x2x2_bb_cull.png differ
diff --git a/img/graph_bb_grid.png b/img/graph_bb_grid.png
new file mode 100644
index 0000000..335c713
Binary files /dev/null and b/img/graph_bb_grid.png differ
diff --git a/img/graph_firstbounce_caching.png b/img/graph_firstbounce_caching.png
new file mode 100644
index 0000000..87812df
Binary files /dev/null and b/img/graph_firstbounce_caching.png differ
diff --git a/img/graph_raycompaction.png b/img/graph_raycompaction.png
new file mode 100644
index 0000000..f3bada0
Binary files /dev/null and b/img/graph_raycompaction.png differ
diff --git a/img/imperfect_specular_5000samp.png b/img/imperfect_specular_5000samp.png
new file mode 100644
index 0000000..e0e6be0
Binary files /dev/null and b/img/imperfect_specular_5000samp.png differ
diff --git a/img/refractive.5000samp.png b/img/refractive.5000samp.png
new file mode 100644
index 0000000..d70130a
Binary files /dev/null and b/img/refractive.5000samp.png differ
diff --git a/img/specular_and_diffuse.5000samp.png b/img/specular_and_diffuse.5000samp.png
new file mode 100644
index 0000000..15d9d81
Binary files /dev/null and b/img/specular_and_diffuse.5000samp.png differ
diff --git a/img/specular_and_refractive.5000samp.png b/img/specular_and_refractive.5000samp.png
new file mode 100644
index 0000000..d0322cc
Binary files /dev/null and b/img/specular_and_refractive.5000samp.png differ
diff --git a/img/teapot.3000samp.png b/img/teapot.3000samp.png
new file mode 100644
index 0000000..29bd10b
Binary files /dev/null and b/img/teapot.3000samp.png differ
diff --git a/scenes/cornell.txt b/scenes/cornell.txt
index 83ff820..b316f3d 100644
--- a/scenes/cornell.txt
+++ b/scenes/cornell.txt
@@ -6,7 +6,7 @@ SPECRGB     0 0 0
 REFL        0
 REFR        0
 REFRIOR     0
-EMITTANCE   5
+EMITTANCE   20
 
 // Diffuse white
 MATERIAL 1
@@ -41,8 +41,58 @@ EMITTANCE   0
 // Specular white
 MATERIAL 4
 RGB         .98 .98 .98
+SPECEX      50
+SPECRGB     .8 .8 .8
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular red
+MATERIAL 5
+RGB         .85 .35 .35
+SPECEX      10
+SPECRGB     .4 .4 .4
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Glass
+MATERIAL 6
+RGB         0.9 0.9 0.9
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     2
+EMITTANCE   0
+
+// Mirror
+MATERIAL 7
+RGB         0.9 0.9 0.9
 SPECEX      0
-SPECRGB     .98 .98 .98
+SPECRGB     0 0 0
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse blue
+MATERIAL 8
+RGB         .35 .35 .85
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Grey mirror
+MATERIAL 9
+RGB         0.4 0.4 0.4 
+SPECEX      30
+SPECRGB     0.98 0.98 0.98
 REFL        1
 REFR        0
 REFRIOR     0
@@ -108,10 +158,47 @@ TRANS       5 5 0
 ROTAT       0 0 0
 SCALE       .01 10 10
 
-// Sphere
+// Teapot
 OBJECT 6
-sphere
-material 4
-TRANS       -1 4 -1
+mesh
+mesh 0
+material 5
+TRANS       3 0 0
 ROTAT       0 0 0
-SCALE       3 3 3
+SCALE       0.02 0.02 0.02
+
+// Glass Ball
+OBJECT 7
+sphere
+material 6
+TRANS -0.8 1.5 2.75
+ROTAT 0 0 0
+SCALE 2 2 2
+
+// Mirror ball
+OBJECT 8
+sphere
+material 7
+TRANS 3 1 3
+ROTAT 0 0 0
+SCALE 2 2 2
+
+// Diffuse ball
+OBJECT 9
+sphere
+material 8
+TRANS -3 1 4
+ROTAT 0 0 0
+SCALE 2 2 2
+
+// Specular ball
+OBJECT 10
+sphere
+material 9
+TRANS -2 1 1
+ROTAT 0 0 0
+SCALE 2 2 2
+
+// Teapot
+MESH 0
+scenes/teapot.obj
\ No newline at end of file
diff --git a/scenes/cornell2.txt b/scenes/cornell2.txt
new file mode 100644
index 0000000..632a0a6
--- /dev/null
+++ b/scenes/cornell2.txt
@@ -0,0 +1,204 @@
+// Emissive material (light)
+MATERIAL 0
+RGB         1 1 1
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   20
+
+// Diffuse white
+MATERIAL 1
+RGB         .98 .98 .98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse red
+MATERIAL 2
+RGB         .85 .35 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse green
+MATERIAL 3
+RGB         .35 .85 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular white
+MATERIAL 4
+RGB         .98 .98 .98
+SPECEX      50
+SPECRGB     .8 .8 .8
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular red
+MATERIAL 5
+RGB         .85 .35 .35
+SPECEX      10
+SPECRGB     .4 .4 .4
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Glass
+MATERIAL 6
+RGB         0.9 0.9 0.9
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     2
+EMITTANCE   0
+
+// Mirror
+MATERIAL 7
+RGB         0.9 0.9 0.9
+SPECEX      0
+SPECRGB     0 0 0
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse blue
+MATERIAL 8
+RGB         .35 .35 .85
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Grey mirror
+MATERIAL 9
+RGB         0.4 0.4 0.4 
+SPECEX      30
+SPECRGB     0.98 0.98 0.98
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Camera
+CAMERA
+RES         800 800
+FOVY        45
+ITERATIONS  10
+DEPTH       8
+FILE        cornell
+EYE         0.0 5 10.5
+LOOKAT      0 5 0
+UP          0 1 0
+
+
+// Ceiling light
+OBJECT 0
+cube
+material 0
+TRANS       0 10 0
+ROTAT       0 0 0
+SCALE       3 .3 3
+
+// Floor
+OBJECT 1
+cube
+material 1
+TRANS       0 0 0
+ROTAT       0 0 0
+SCALE       10 .01 10
+
+// Ceiling
+OBJECT 2
+cube
+material 1
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .01 10 10
+
+// Back wall
+OBJECT 3
+cube
+material 1
+TRANS       0 5 -5
+ROTAT       0 90 0
+SCALE       .01 10 10
+
+// Left wall
+OBJECT 4
+cube
+material 2
+TRANS       -5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Right wall
+OBJECT 5
+cube
+material 3
+TRANS       5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Teapot
+OBJECT 6
+mesh
+mesh 0
+material 5
+TRANS       3 0 0
+ROTAT       0 0 0
+SCALE       0.02 0.02 0.02
+
+// Glass Ball
+OBJECT 7
+sphere
+material 6
+TRANS -0.8 1.5 2.75
+ROTAT 0 0 0
+SCALE 2 2 2
+
+// Mirror ball
+OBJECT 8
+sphere
+material 7
+TRANS 3 1 3
+ROTAT 0 0 0
+SCALE 2 2 2
+
+// Diffuse ball
+OBJECT 9
+sphere
+material 8
+TRANS -3 1 4
+ROTAT 0 0 0
+SCALE 2 2 2
+
+// Specular ball
+OBJECT 10
+sphere
+material 9
+TRANS -2 1 1
+ROTAT 0 0 0
+SCALE 2 2 2
+
+// Teapot
+MESH 0
+scenes/teapot.obj
\ No newline at end of file
diff --git a/scenes/cornell_ball.txt b/scenes/cornell_ball.txt
new file mode 100644
index 0000000..2483a2d
--- /dev/null
+++ b/scenes/cornell_ball.txt
@@ -0,0 +1,136 @@
+// Emissive material (light)
+MATERIAL 0
+RGB         1 1 1
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   20
+
+// Diffuse white
+MATERIAL 1
+RGB         .98 .98 .98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse red
+MATERIAL 2
+RGB         .85 .35 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse green
+MATERIAL 3
+RGB         .35 .85 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular white
+MATERIAL 4
+RGB         .98 .98 .98
+SPECEX      10
+SPECRGB     .98 .98 .98
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular red
+MATERIAL 5
+RGB         .85 .35 .35
+SPECEX      10
+SPECRGB     .98 .98 .98
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Glass
+MATERIAL 6
+RGB         0.9 0.9 0.9
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     3
+EMITTANCE   0
+// Camera
+CAMERA
+RES         800 800
+FOVY        45
+ITERATIONS  5000
+DEPTH       8
+FILE        cornell
+EYE         0.0 5 10.5
+LOOKAT      0 5 0
+UP          0 1 0
+
+
+// Ceiling light
+OBJECT 0
+cube
+material 0
+TRANS       0 10 0
+ROTAT       0 0 0
+SCALE       3 .3 3
+
+// Floor
+OBJECT 1
+cube
+material 1
+TRANS       0 0 0
+ROTAT       0 0 0
+SCALE       10 .01 10
+
+// Ceiling
+OBJECT 2
+cube
+material 1
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .01 10 10
+
+// Back wall
+OBJECT 3
+cube
+material 1
+TRANS       0 5 -5
+ROTAT       0 90 0
+SCALE       .01 10 10
+
+// Left wall
+OBJECT 4
+cube
+material 2
+TRANS       -5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Right wall
+OBJECT 5
+cube
+material 3
+TRANS       5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Sphere
+OBJECT 6
+sphere
+material 6
+TRANS       2 2 2
+ROTAT       0 0 0
+SCALE       3 3 3
\ No newline at end of file
diff --git a/scenes/diffuse_specular.txt b/scenes/diffuse_specular.txt
new file mode 100644
index 0000000..1451a74
--- /dev/null
+++ b/scenes/diffuse_specular.txt
@@ -0,0 +1,202 @@
+// Emissive material (light)
+MATERIAL 0
+RGB         1 1 1
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   20
+
+// Diffuse white
+MATERIAL 1
+RGB         .98 .98 .98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse red
+MATERIAL 2
+RGB         .85 .35 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse green
+MATERIAL 3
+RGB         .35 .85 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse blue
+MATERIAL 4
+RGB         .35 .35 .85
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular blue
+MATERIAL 5
+RGB         .35 .35 .85
+SPECEX      25
+SPECRGB     .35 .35 .85
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// More specular blue
+MATERIAL 6
+RGB         .35 .35 .85
+SPECEX      100
+SPECRGB     .35 .35 .85
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Mirror
+MATERIAL 7
+RGB         .35 .35 .85
+SPECEX      0
+SPECRGB     0 0 0
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Water
+MATERIAL 8
+RGB         0.98 0.98 0.98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     1.2
+EMITTANCE   0
+
+// Glass
+MATERIAL 9
+RGB         0.98 0.98 0.98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     1.6
+EMITTANCE   0
+
+// Diamond
+MATERIAL 10
+RGB         0.98 0.98 0.98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     2.4
+EMITTANCE   0
+
+
+// Camera
+CAMERA
+RES         800 800
+FOVY        45
+ITERATIONS  5000
+DEPTH       8
+FILE        cornell
+EYE         0.0 5 10.5
+LOOKAT      0 5 0
+UP          0 1 0
+
+
+// Ceiling light
+OBJECT 0
+cube
+material 0
+TRANS       0 10 0
+ROTAT       0 0 0
+SCALE       3 .3 3
+
+// Floor
+OBJECT 1
+cube
+material 1
+TRANS       0 0 0
+ROTAT       0 0 0
+SCALE       10 .01 10
+
+// Ceiling
+OBJECT 2
+cube
+material 1
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .01 10 10
+
+// Back wall
+OBJECT 3
+cube
+material 1
+TRANS       0 5 -5
+ROTAT       0 90 0
+SCALE       .01 10 10
+
+// Left wall
+OBJECT 4
+cube
+material 2
+TRANS       -5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Right wall
+OBJECT 5
+cube
+material 3
+TRANS       5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Sphere
+OBJECT 6
+sphere
+material 4
+TRANS       -3.75 1 0
+ROTAT       0 0 0
+SCALE       2 2 2
+
+// Sphere
+OBJECT 7
+sphere
+material 4
+TRANS       -1.25 1 0
+ROTAT       0 0 0
+SCALE       2 2 2
+
+// Sphere
+OBJECT 8
+sphere
+material 4
+TRANS       1.25 1 0
+ROTAT       0 0 0
+SCALE       2 2 2
+
+// Sphere
+OBJECT 9
+sphere
+material 4
+TRANS       3.75 1 0
+ROTAT       0 0 0
+SCALE       2 2 2
\ No newline at end of file
diff --git a/scenes/refractive.txt b/scenes/refractive.txt
new file mode 100644
index 0000000..72706a6
--- /dev/null
+++ b/scenes/refractive.txt
@@ -0,0 +1,226 @@
+// Emissive material (light)
+MATERIAL 0
+RGB         1 1 1
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   20
+
+// Diffuse white
+MATERIAL 1
+RGB         .98 .98 .98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse red
+MATERIAL 2
+RGB         .85 .35 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse green
+MATERIAL 3
+RGB         .35 .85 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse blue
+MATERIAL 4
+RGB         .35 .35 .85
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular blue
+MATERIAL 5
+RGB         .35 .35 .85
+SPECEX      25
+SPECRGB     .35 .35 .85
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// More specular blue
+MATERIAL 6
+RGB         .35 .35 .85
+SPECEX      100
+SPECRGB     .35 .35 .85
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Mirror
+MATERIAL 7
+RGB         .35 .35 .85
+SPECEX      0
+SPECRGB     0 0 0
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Water
+MATERIAL 8
+RGB         0.98 0.98 0.98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     1.2
+EMITTANCE   0
+
+// Glass
+MATERIAL 9
+RGB         0.98 0.98 0.98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     1.6
+EMITTANCE   0
+
+// Diamond
+MATERIAL 10
+RGB         0.98 0.98 0.98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     2.4
+EMITTANCE   0
+
+
+// Camera
+CAMERA
+RES         800 800
+FOVY        45
+ITERATIONS  5000
+DEPTH       8
+FILE        cornell
+EYE         0.0 5 10.5
+LOOKAT      0 5 0
+UP          0 1 0
+
+
+// Ceiling light
+OBJECT 0
+cube
+material 0
+TRANS       0 10 0
+ROTAT       0 0 0
+SCALE       3 .3 3
+
+// Floor
+OBJECT 1
+cube
+material 1
+TRANS       0 0 0
+ROTAT       0 0 0
+SCALE       10 .01 10
+
+// Ceiling
+OBJECT 2
+cube
+material 1
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .01 10 10
+
+// Back wall
+OBJECT 3
+cube
+material 1
+TRANS       0 5 -5
+ROTAT       0 90 0
+SCALE       .01 10 10
+
+// Left wall
+OBJECT 4
+cube
+material 2
+TRANS       -5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Right wall
+OBJECT 5
+cube
+material 3
+TRANS       5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Sphere
+OBJECT 6
+sphere
+material 1
+TRANS       -3.75 1 0
+ROTAT       0 0 0
+SCALE       2 2 2
+
+// Sphere
+OBJECT 7
+sphere
+material 2
+TRANS       -1.25 1 0
+ROTAT       0 0 0
+SCALE       2 2 2
+
+// Sphere
+OBJECT 8
+sphere
+material 3
+TRANS       1.25 1 0
+ROTAT       0 0 0
+SCALE       2 2 2
+
+// Sphere
+OBJECT 9
+sphere
+material 4
+TRANS       3.75 1 0
+ROTAT       0 0 0
+SCALE       2 2 2
+
+// Sphere
+OBJECT 10
+sphere
+material 8
+TRANS       -2 2 2
+ROTAT       0 0 0
+SCALE       2 2 2
+
+// Sphere
+OBJECT 11
+sphere
+material 9
+TRANS       0 2 2
+ROTAT       0 0 0
+SCALE       2 2 2
+
+// Sphere
+OBJECT 12
+sphere
+material 10
+TRANS       2 2 2
+ROTAT       0 0 0
+SCALE       2 2 2
\ No newline at end of file
diff --git a/scenes/teapot.txt b/scenes/teapot.txt
new file mode 100644
index 0000000..7f7c04b
--- /dev/null
+++ b/scenes/teapot.txt
@@ -0,0 +1,172 @@
+// Emissive material (light)
+MATERIAL 0
+RGB         1 1 1
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   20
+
+// Diffuse white
+MATERIAL 1
+RGB         .98 .98 .98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse red
+MATERIAL 2
+RGB         .85 .35 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse green
+MATERIAL 3
+RGB         .35 .85 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular blue
+MATERIAL 4
+RGB         .35 .35 .85
+SPECEX      50
+SPECRGB     .8 .8 .8
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular red
+MATERIAL 5
+RGB         .85 .35 .35
+SPECEX      10
+SPECRGB     .4 .4 .4
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Glass
+MATERIAL 6
+RGB         0.9 0.9 0.9
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     2
+EMITTANCE   0
+
+// Mirror
+MATERIAL 7
+RGB         0.9 0.9 0.9
+SPECEX      0
+SPECRGB     0 0 0
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse blue
+MATERIAL 8
+RGB         .35 .35 .85
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Grey mirror
+MATERIAL 9
+RGB         0.4 0.4 0.4 
+SPECEX      30
+SPECRGB     0.98 0.98 0.98
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Camera
+CAMERA
+RES         800 800
+FOVY        45
+ITERATIONS  5000
+DEPTH       8
+FILE        cornell
+EYE         0.0 5 10.5
+LOOKAT      0 5 0
+UP          0 1 0
+
+
+// Ceiling light
+OBJECT 0
+cube
+material 0
+TRANS       0 10 0
+ROTAT       0 0 0
+SCALE       3 .3 3
+
+// Floor
+OBJECT 1
+cube
+material 7
+TRANS       0 0 0
+ROTAT       0 0 0
+SCALE       10 .01 10
+
+// Ceiling
+OBJECT 2
+cube
+material 1
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .01 10 10
+
+// Back wall
+OBJECT 3
+cube
+material 1
+TRANS       0 5 -5
+ROTAT       0 90 0
+SCALE       .01 10 10
+
+// Left wall
+OBJECT 4
+cube
+material 2
+TRANS       -5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Right wall
+OBJECT 5
+cube
+material 3
+TRANS       5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Teapot
+OBJECT 6
+mesh
+mesh 0
+material 4
+TRANS       0 0 0
+ROTAT       0 0 0
+SCALE       0.04 0.04 0.04
+
+// Teapot
+MESH 0
+scenes/teapot.obj
\ No newline at end of file
diff --git a/scenes/teapot2.txt b/scenes/teapot2.txt
new file mode 100644
index 0000000..6558542
--- /dev/null
+++ b/scenes/teapot2.txt
@@ -0,0 +1,172 @@
+// Emissive material (light)
+MATERIAL 0
+RGB         1 1 1
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   20
+
+// Diffuse white
+MATERIAL 1
+RGB         .98 .98 .98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse red
+MATERIAL 2
+RGB         .85 .35 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse green
+MATERIAL 3
+RGB         .35 .85 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular blue
+MATERIAL 4
+RGB         .35 .35 .85
+SPECEX      50
+SPECRGB     .8 .8 .8
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular red
+MATERIAL 5
+RGB         .85 .35 .35
+SPECEX      10
+SPECRGB     .4 .4 .4
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Glass
+MATERIAL 6
+RGB         0.9 0.9 0.9
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        1
+REFRIOR     2
+EMITTANCE   0
+
+// Mirror
+MATERIAL 7
+RGB         0.9 0.9 0.9
+SPECEX      0
+SPECRGB     0 0 0
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse blue
+MATERIAL 8
+RGB         .35 .35 .85
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Grey mirror
+MATERIAL 9
+RGB         0.4 0.4 0.4 
+SPECEX      30
+SPECRGB     0.98 0.98 0.98
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Camera
+CAMERA
+RES         800 800
+FOVY        45
+ITERATIONS  10
+DEPTH       32
+FILE        cornell
+EYE         0.0 5 10.5
+LOOKAT      0 5 0
+UP          0 1 0
+
+
+// Ceiling light
+OBJECT 0
+cube
+material 0
+TRANS       0 10 0
+ROTAT       0 0 0
+SCALE       3 .3 3
+
+// Floor
+OBJECT 1
+cube
+material 1
+TRANS       0 0 0
+ROTAT       0 0 0
+SCALE       10 .01 10
+
+// Ceiling
+OBJECT 2
+cube
+material 1
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .01 10 10
+
+// Back wall
+OBJECT 3
+cube
+material 1
+TRANS       0 5 -5
+ROTAT       0 90 0
+SCALE       .01 10 10
+
+// Left wall
+OBJECT 4
+cube
+material 2
+TRANS       -5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Right wall
+OBJECT 5
+cube
+material 3
+TRANS       5 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Teapot
+OBJECT 6
+mesh
+mesh 0
+material 4
+TRANS       0 0 0
+ROTAT       0 0 0
+SCALE       0.04 0.04 0.04
+
+// Teapot
+MESH 0
+scenes/teapot.obj
\ No newline at end of file
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index a1cb3fb..aaf8562 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -19,5 +19,5 @@ set(SOURCE_FILES
 
 cuda_add_library(src
     ${SOURCE_FILES}
-    OPTIONS -arch=sm_20
+    OPTIONS -arch=sm_50
     )
diff --git a/src/interactions.h b/src/interactions.h
index 5ce3628..4b67fd3 100644
--- a/src/interactions.h
+++ b/src/interactions.h
@@ -2,6 +2,8 @@
 
 #include "intersections.h"
 
+#include <glm/gtx/rotate_vector.hpp>
+
 // CHECKITOUT
 /**
  * Computes a cosine-weighted random direction in a hemisphere.
@@ -41,6 +43,25 @@ glm::vec3 calculateRandomDirectionInHemisphere(
         + sin(around) * over * perpendicularDirection2;
 }
 
+__host__ __device__
+glm::vec3 calculateRandomDirectionWithSpecular(float specularExponent, glm::vec3 direction,
+glm::vec3 normal, thrust::default_random_engine &rng) {
+	thrust::uniform_real_distribution<float> u01(0, 1);
+
+	float theta = acos(pow(u01(rng), 1.0f / (1.0f + specularExponent)));
+	float phi = 2 * PI * u01(rng);
+
+	glm::vec3 randDir(cos(phi) * sin(theta), sin(phi) * sin(theta), cos(theta));
+
+	glm::vec3 newUp = glm::normalize(glm::reflect(direction, normal));
+	glm::vec3 oldUp(0, 0, 1);
+
+	glm::vec3 rotationAxis = glm::cross(oldUp, newUp);
+	float rotationAngle = acos(glm::dot(newUp, oldUp));
+
+	return glm::rotate(randDir, rotationAngle, rotationAxis);
+}
+
 /**
  * Scatter a ray with some probabilities according to the material properties.
  * For example, a diffuse surface scatters in a cosine-weighted hemisphere.
@@ -73,7 +94,42 @@ void scatterRay(
         glm::vec3 normal,
         const Material &m,
         thrust::default_random_engine &rng) {
-    // TODO: implement this.
-    // A basic implementation of pure-diffuse shading will just call the
-    // calculateRandomDirectionInHemisphere defined above.
+	Ray & ray = pathSegment.ray;
+  ray.origin = intersect;
+	thrust::uniform_real_distribution<float> u01(0, 1);
+  const float specularWeight = 0.5f;
+  const float specularWeightInverse = (specularWeight > 0.0f) ? 1.0f / specularWeight : 0.0f;
+  const float diffuseWeightInverse = 1.0f / (1 - specularWeight);
+	// Specular highlight
+  if (u01(rng) < specularWeight) {
+		ray.direction = calculateRandomDirectionWithSpecular(m.specular.exponent, ray.direction, normal, rng);
+		pathSegment.color *= m.specular.color * specularWeightInverse;
+  }
+	// Base color: mirror reflection, refraction (with Fresnel), or diffuse bounce
+  else {
+		if (m.hasReflective > 0.0f) {
+			ray.direction = glm::reflect(ray.direction, normal);
+		}
+		else if (m.hasRefractive > 0.0f) {
+			float refractionCoeff = (pathSegment.insideRefractiveObject) ? m.indexOfRefraction : (1.0f / m.indexOfRefraction);
+      // Schlick's approximation of the Fresnel reflectance
+      float R_0_sqrt = (m.indexOfRefraction - 1.0f) / (m.indexOfRefraction + 1.0f);
+      float R_0 = R_0_sqrt * R_0_sqrt;
+      float R = R_0 + (1.0f - R_0) * glm::pow(1.0f - fabs(glm::dot(normal, ray.direction)), 5.0f);
+      if (u01(rng) < R) {
+        ray.direction = glm::normalize(glm::reflect(ray.direction, normal));
+      }
+      else {
+        ray.direction = glm::normalize(glm::refract(ray.direction, normal, refractionCoeff));
+      }
+			if (glm::dot(ray.direction, normal) < 0.0f) {
+				pathSegment.insideRefractiveObject = !pathSegment.insideRefractiveObject;
+			}
+		}
+		else {
+			ray.direction = calculateRandomDirectionInHemisphere(normal, rng);
+			pathSegment.color *= glm::dot(ray.direction, normal);
+		}
+		pathSegment.color *= m.color * diffuseWeightInverse;
+  }
 }
diff --git a/src/intersections.h b/src/intersections.h
index 6f23872..52b5424 100644
--- a/src/intersections.h
+++ b/src/intersections.h
@@ -36,6 +36,33 @@ __host__ __device__ glm::vec3 multiplyMV(glm::mat4 m, glm::vec4 v) {
 }
 
 // CHECKITOUT
+
+__host__ __device__ bool orientedBoxIntersection(glm::vec3 boxMin, glm::vec3 boxMax, Ray r,
+  float &tmin, float &tmax, glm::vec3 & tmin_n, glm::vec3 & tmax_n) {
+  tmin = -1e38f;
+  tmax = 1e38f;
+  for (int xyz = 0; xyz < 3; ++xyz) {
+    float qdxyz = r.direction[xyz];
+    /*if (glm::abs(qdxyz) > 0.00001f)*/ {
+      float t1 = (boxMin[xyz] - r.origin[xyz]) / qdxyz;
+      float t2 = (boxMax[xyz] - r.origin[xyz]) / qdxyz;
+      float ta = glm::min(t1, t2);
+      float tb = glm::max(t1, t2);
+      glm::vec3 n;
+      n[xyz] = t2 < t1 ? +1 : -1;
+      if (ta > 0 && ta > tmin) {
+        tmin = ta;
+        tmin_n = n;
+      }
+      if (tb < tmax) {
+        tmax = tb;
+        tmax_n = n;
+      }
+    }
+  }
+  return tmax >= tmin && tmax > 0;
+}
+
 /**
  * Test intersection between a ray and a transformed cube. Untransformed,
  * the cube ranges from -0.5 to 0.5 in each axis and is centered at the origin.
@@ -55,25 +82,7 @@ __host__ __device__ float boxIntersectionTest(Geom box, Ray r,
     float tmax = 1e38f;
     glm::vec3 tmin_n;
     glm::vec3 tmax_n;
-    for (int xyz = 0; xyz < 3; ++xyz) {
-        float qdxyz = q.direction[xyz];
-        /*if (glm::abs(qdxyz) > 0.00001f)*/ {
-            float t1 = (-0.5f - q.origin[xyz]) / qdxyz;
-            float t2 = (+0.5f - q.origin[xyz]) / qdxyz;
-            float ta = glm::min(t1, t2);
-            float tb = glm::max(t1, t2);
-            glm::vec3 n;
-            n[xyz] = t2 < t1 ? +1 : -1;
-            if (ta > 0 && ta > tmin) {
-                tmin = ta;
-                tmin_n = n;
-            }
-            if (tb < tmax) {
-                tmax = tb;
-                tmax_n = n;
-            }
-        }
-    }
+    orientedBoxIntersection(glm::vec3(-0.5f, -0.5f, -0.5f), glm::vec3(0.5f, 0.5f, 0.5f), q, tmin, tmax, tmin_n, tmax_n);
 
     if (tmax >= tmin && tmax > 0) {
         outside = true;
@@ -142,3 +151,52 @@ __host__ __device__ float sphereIntersectionTest(Geom sphere, Ray r,
 
     return glm::length(r.origin - intersectionPoint);
 }
+
+/**
+* Test intersection between a ray and a mesh
+*
+* @param intersectionPoint  Output parameter for point of intersection.
+* @param normal             Output parameter for surface normal.
+* @param outside            Output param for whether the ray came from outside.
+* @return                   Ray parameter `t` value. -1 if no intersection.
+*/
+__host__ __device__ float meshIntersectionTest(Geom geom, Mesh * meshes, Triangle * triangles, Ray r,
+  glm::vec3 &intersectionPoint, glm::vec3 &normal, bool &outside, int gridIndex = -1, bool flip = false) {
+	Ray rt;
+  rt.origin = multiplyMV(geom.inverseTransform, glm::vec4(r.origin, 1.0f));
+  rt.direction = glm::normalize(multiplyMV(geom.inverseTransform, glm::vec4(r.direction, 0.0f)));
+  Mesh & mesh = meshes[geom.meshid];
+
+  int triangleStart, triangleEnd;
+  if (gridIndex == -1) {
+    triangleStart = mesh.triangleStart;
+    triangleEnd = mesh.triangleEnd;
+  }
+  else {
+    triangleStart = mesh.gridIdx[gridIndex].start;
+    triangleEnd = mesh.gridIdx[gridIndex].end;
+  }
+
+	float t_min = FLT_MAX;
+  for (int i = triangleStart; i < triangleEnd; i++) {
+    glm::vec3 result;
+    bool hasIntersect = glm::intersectRayTriangle(rt.origin, rt.direction, triangles[i].vertices[0], 
+      triangles[i].vertices[flip ? 2 : 1], triangles[i].vertices[flip ? 1 : 2], result);
+
+    glm::vec3 intersect = getPointOnRay(rt, result.z);
+    float t = glm::length(rt.origin - intersect);
+		if (hasIntersect && t > 1e-3 && t_min > t) {
+			t_min = t;
+      intersectionPoint = multiplyMV(geom.transform, glm::vec4(intersect, 1.0f));
+			normal = glm::normalize(multiplyMV(geom.invTranspose, glm::vec4(triangles[i].normal, 0.0f)));
+			outside = glm::dot(r.direction, normal) < 0.0f;
+		}
+	}
+
+	if (t_min == FLT_MAX) {
+		return -1.0f;
+	}
+	else {
+		return glm::length(r.origin - intersectionPoint);
+	}
+}
diff --git a/src/main.cpp b/src/main.cpp
index fe8e85e..0526e8b 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -1,6 +1,7 @@
 #include "main.h"
 #include "preview.h"
 #include <cstring>
+#include <chrono>
 
 static std::string startTimeString;
 
@@ -122,7 +123,10 @@ void runCuda() {
     // Map OpenGL buffer object for writing from CUDA on a single GPU
     // No data is moved (Win & Linux). When mapped to CUDA, OpenGL should not use this buffer
 
+		static std::chrono::time_point<std::chrono::high_resolution_clock> start;
+
     if (iteration == 0) {
+				start = std::chrono::high_resolution_clock::now();
         pathtraceFree();
         pathtraceInit(scene);
     }
@@ -139,10 +143,13 @@ void runCuda() {
         // unmap buffer object
         cudaGLUnmapBufferObject(pbo);
     } else {
-        saveImage();
-        pathtraceFree();
-        cudaDeviceReset();
-        exit(EXIT_SUCCESS);
+			std::chrono::time_point<std::chrono::high_resolution_clock> end = std::chrono::high_resolution_clock::now();
+			double seconds = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count() * 1e-6;
+      printf("Took %lf seconds for %d iterations\n", seconds, renderState->iterations);
+      saveImage();
+      pathtraceFree();
+      cudaDeviceReset();
+      exit(EXIT_SUCCESS);
     }
 }
 
diff --git a/src/pathtrace.cu b/src/pathtrace.cu
index c1ec122..0593500 100644
--- a/src/pathtrace.cu
+++ b/src/pathtrace.cu
@@ -1,6 +1,7 @@
 #include <cstdio>
 #include <cuda.h>
 #include <cmath>
+#include <thrust/device_ptr.h>
 #include <thrust/execution_policy.h>
 #include <thrust/random.h>
 #include <thrust/remove.h>
@@ -15,100 +16,154 @@
 #include "interactions.h"
 
 #define ERRORCHECK 1
+#define GROUPBYMAT 0
+#define CACHEFIRSTBOUNCE 1
+#define CALCMESHSEPARATELY 1
+#define ANTIALIAS_SAMPLE_SIDE 2
+
 
 #define FILENAME (strrchr(__FILE__, '/') ? strrchr(__FILE__, '/') + 1 : __FILE__)
 #define checkCUDAError(msg) checkCUDAErrorFn(msg, FILENAME, __LINE__)
 void checkCUDAErrorFn(const char *msg, const char *file, int line) {
 #if ERRORCHECK
-    cudaDeviceSynchronize();
-    cudaError_t err = cudaGetLastError();
-    if (cudaSuccess == err) {
-        return;
-    }
+  cudaDeviceSynchronize();
+  cudaError_t err = cudaGetLastError();
+  if (cudaSuccess == err) {
+    return;
+  }
 
-    fprintf(stderr, "CUDA error");
-    if (file) {
-        fprintf(stderr, " (%s:%d)", file, line);
-    }
-    fprintf(stderr, ": %s: %s\n", msg, cudaGetErrorString(err));
+  fprintf(stderr, "CUDA error");
+  if (file) {
+    fprintf(stderr, " (%s:%d)", file, line);
+  }
+  fprintf(stderr, ": %s: %s\n", msg, cudaGetErrorString(err));
 #  ifdef _WIN32
-    getchar();
+  getchar();
 #  endif
-    exit(EXIT_FAILURE);
+  exit(EXIT_FAILURE);
 #endif
 }
 
 __host__ __device__
 thrust::default_random_engine makeSeededRandomEngine(int iter, int index, int depth) {
-    int h = utilhash((1 << 31) | (depth << 22) | iter) ^ utilhash(index);
-    return thrust::default_random_engine(h);
+  int h = utilhash((1 << 31) | (depth << 22) | iter) ^ utilhash(index);
+  return thrust::default_random_engine(h);
 }
 
 //Kernel that writes the image to the OpenGL PBO directly.
 __global__ void sendImageToPBO(uchar4* pbo, glm::ivec2 resolution,
-        int iter, glm::vec3* image) {
-    int x = (blockIdx.x * blockDim.x) + threadIdx.x;
-    int y = (blockIdx.y * blockDim.y) + threadIdx.y;
-
-    if (x < resolution.x && y < resolution.y) {
-        int index = x + (y * resolution.x);
-        glm::vec3 pix = image[index];
-
-        glm::ivec3 color;
-        color.x = glm::clamp((int) (pix.x / iter * 255.0), 0, 255);
-        color.y = glm::clamp((int) (pix.y / iter * 255.0), 0, 255);
-        color.z = glm::clamp((int) (pix.z / iter * 255.0), 0, 255);
-
-        // Each thread writes one pixel location in the texture (textel)
-        pbo[index].w = 0;
-        pbo[index].x = color.x;
-        pbo[index].y = color.y;
-        pbo[index].z = color.z;
-    }
+  int iter, glm::vec3* image) {
+  int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+  int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+
+  if (x < resolution.x && y < resolution.y) {
+    int index = x + (y * resolution.x);
+    glm::vec3 pix = image[index];
+
+    glm::ivec3 color;
+    color.x = glm::clamp((int)(pix.x / iter * 255.0), 0, 255);
+    color.y = glm::clamp((int)(pix.y / iter * 255.0), 0, 255);
+    color.z = glm::clamp((int)(pix.z / iter * 255.0), 0, 255);
+
+    // Each thread writes one pixel location in the texture (texel)
+    pbo[index].w = 0;
+    pbo[index].x = color.x;
+    pbo[index].y = color.y;
+    pbo[index].z = color.z;
+  }
 }
 
 static Scene * hst_scene = NULL;
 static glm::vec3 * dev_image = NULL;
+static glm::vec3 * dev_final_image = NULL;
 static Geom * dev_geoms = NULL;
 static Material * dev_materials = NULL;
 static PathSegment * dev_paths = NULL;
-static ShadeableIntersection * dev_intersections = NULL;
 // TODO: static variables for device memory, any extra info you need, etc
-// ...
+static PathSegment * dev_cached_paths = NULL;
+static Geom * dev_meshgeoms = NULL;
+static Mesh * dev_meshes = NULL;
+static Triangle * dev_triangles = NULL;
 
-void pathtraceInit(Scene *scene) {
-    hst_scene = scene;
-    const Camera &cam = hst_scene->state.camera;
-    const int pixelcount = cam.resolution.x * cam.resolution.y;
-
-    cudaMalloc(&dev_image, pixelcount * sizeof(glm::vec3));
-    cudaMemset(dev_image, 0, pixelcount * sizeof(glm::vec3));
+static glm::ivec3 * dev_path_mesh_intersections = NULL;
+static ShadeableIntersection * dev_path_mesh_intersection_dists = NULL;
+static glm::ivec3 * dev_pm_intersection_out = NULL;
+static ShadeableIntersection * dev_pm_intersection_dists_out = NULL;
+int numGeoms = 0;
 
-  	cudaMalloc(&dev_paths, pixelcount * sizeof(PathSegment));
+void pathtraceInit(Scene *scene) {
+  hst_scene = scene;
+  const Camera &cam = hst_scene->state.camera;
+#if ANTIALIAS_SAMPLE_SIDE == 0
+  const int pixelcount = cam.resolution.x * cam.resolution.y;
+#else
+  const int actual_pixelcount = cam.resolution.x * cam.resolution.y;
+  const int pixelcount = actual_pixelcount * ANTIALIAS_SAMPLE_SIDE * ANTIALIAS_SAMPLE_SIDE;
+#endif
 
-  	cudaMalloc(&dev_geoms, scene->geoms.size() * sizeof(Geom));
-  	cudaMemcpy(dev_geoms, scene->geoms.data(), scene->geoms.size() * sizeof(Geom), cudaMemcpyHostToDevice);
+  cudaMalloc(&dev_image, pixelcount * sizeof(glm::vec3));
+  cudaMemset(dev_image, 0, pixelcount * sizeof(glm::vec3));
+#if ANTIALIAS_SAMPLE_SIDE != 0
+  cudaMalloc(&dev_final_image, actual_pixelcount * sizeof(glm::vec3));
+#else
+  dev_final_image = dev_image;
+#endif
+  cudaMemset(dev_image, 0, pixelcount * sizeof(glm::vec3));
+  cudaMalloc(&dev_paths, pixelcount * sizeof(PathSegment));
+
+#if CALCMESHSEPARATELY == 1
+  cudaMalloc(&dev_geoms, scene->geoms.size() * sizeof(Geom));
+  cudaMemcpy(dev_geoms, scene->geoms.data(), scene->geoms.size() * sizeof(Geom), cudaMemcpyHostToDevice);
+  cudaMalloc(&dev_meshgeoms, scene->meshGeoms.size() * sizeof(Geom));
+  cudaMemcpy(dev_meshgeoms, scene->meshGeoms.data(), scene->meshGeoms.size() * sizeof(Geom), cudaMemcpyHostToDevice);
+  cudaMalloc(&dev_path_mesh_intersections, GRID_FULL * pixelcount * scene->meshGeoms.size() * sizeof(glm::ivec3));
+  cudaMalloc(&dev_path_mesh_intersection_dists, GRID_FULL * pixelcount * scene->meshGeoms.size() * sizeof(ShadeableIntersection));
+  cudaMalloc(&dev_pm_intersection_out, pixelcount * sizeof(glm::ivec3));
+  cudaMalloc(&dev_pm_intersection_dists_out, pixelcount * sizeof(ShadeableIntersection));
+  numGeoms = scene->geoms.size();
+#else
+  numGeoms = scene->geoms.size() + scene->meshGeoms.size();
+  cudaMalloc(&dev_geoms, numGeoms * sizeof(Geom));
+  cudaMemcpy(dev_geoms, scene->geoms.data(), scene->geoms.size() * sizeof(Geom), cudaMemcpyHostToDevice);
+  cudaMemcpy(dev_geoms + scene->geoms.size(), scene->meshGeoms.data(), scene->meshGeoms.size() * sizeof(Geom), cudaMemcpyHostToDevice);
+#endif
 
-  	cudaMalloc(&dev_materials, scene->materials.size() * sizeof(Material));
-  	cudaMemcpy(dev_materials, scene->materials.data(), scene->materials.size() * sizeof(Material), cudaMemcpyHostToDevice);
 
-  	cudaMalloc(&dev_intersections, pixelcount * sizeof(ShadeableIntersection));
-  	cudaMemset(dev_intersections, 0, pixelcount * sizeof(ShadeableIntersection));
+  cudaMalloc(&dev_materials, scene->materials.size() * sizeof(Material));
+  cudaMemcpy(dev_materials, scene->materials.data(), scene->materials.size() * sizeof(Material), cudaMemcpyHostToDevice);
 
-    // TODO: initialize any extra device memeory you need
+  // TODO: initialize any extra device memory you need
+  cudaMalloc(&dev_cached_paths, pixelcount * sizeof(PathSegment));
+  cudaMalloc(&dev_meshes, scene->meshes.size() * sizeof(Mesh));
+  cudaMemcpy(dev_meshes, scene->meshes.data(), scene->meshes.size() * sizeof(Mesh), cudaMemcpyHostToDevice);
+  cudaMalloc(&dev_triangles, scene->triangles.size() * sizeof(Triangle));
+  cudaMemcpy(dev_triangles, scene->triangles.data(), scene->triangles.size() * sizeof(Triangle), cudaMemcpyHostToDevice);
 
-    checkCUDAError("pathtraceInit");
+  checkCUDAError("pathtraceInit");
 }
 
 void pathtraceFree() {
-    cudaFree(dev_image);  // no-op if dev_image is null
-  	cudaFree(dev_paths);
-  	cudaFree(dev_geoms);
-  	cudaFree(dev_materials);
-  	cudaFree(dev_intersections);
-    // TODO: clean up any extra device memory you created
-
-    checkCUDAError("pathtraceFree");
+  cudaFree(dev_image);  // no-op if dev_image is null
+  cudaFree(dev_paths);
+  cudaFree(dev_geoms);
+  cudaFree(dev_materials);
+  // TODO: clean up any extra device memory you created
+  cudaFree(dev_cached_paths);
+  cudaFree(dev_meshes);
+  cudaFree(dev_triangles);
+#if CALCMESHSEPARATELY == 1
+  cudaFree(dev_meshgeoms);
+  cudaFree(dev_path_mesh_intersections);
+  cudaFree(dev_path_mesh_intersection_dists);
+  cudaFree(dev_pm_intersection_out);
+  cudaFree(dev_pm_intersection_dists_out);
+#endif
+
+#if ANTIALIAS_SAMPLE_SIDE != 0
+  cudaFree(dev_final_image);
+#endif
+
+  checkCUDAError("pathtraceFree");
 }
 
 /**
@@ -121,273 +176,506 @@ void pathtraceFree() {
 */
 __global__ void generateRayFromCamera(Camera cam, int iter, int traceDepth, PathSegment* pathSegments)
 {
-	int x = (blockIdx.x * blockDim.x) + threadIdx.x;
-	int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+  int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+  int y = (blockIdx.y * blockDim.y) + threadIdx.y;
 
-	if (x < cam.resolution.x && y < cam.resolution.y) {
-		int index = x + (y * cam.resolution.x);
-		PathSegment & segment = pathSegments[index];
+  if (x < cam.resolution.x && y < cam.resolution.y) {
+#if ANTIALIAS_SAMPLE_SIDE == 0
+    int index = x + (y * cam.resolution.x);
+    PathSegment & segment = pathSegments[index];
 
-		segment.ray.origin = cam.position;
+    segment.ray.origin = cam.position;
     segment.color = glm::vec3(1.0f, 1.0f, 1.0f);
 
-		// TODO: implement antialiasing by jittering the ray
-		segment.ray.direction = glm::normalize(cam.view
-			- cam.right * cam.pixelLength.x * ((float)x - (float)cam.resolution.x * 0.5f)
-			- cam.up * cam.pixelLength.y * ((float)y - (float)cam.resolution.y * 0.5f)
-			);
-
-		segment.pixelIndex = index;
-		segment.remainingBounces = traceDepth;
-	}
+    segment.ray.direction = glm::normalize(cam.view
+      - cam.right * cam.pixelLength.x * ((float)x - (float)cam.resolution.x * 0.5f)
+      - cam.up * cam.pixelLength.y * ((float)y - (float)cam.resolution.y * 0.5f)
+      );
+
+    segment.pixelIndex = index;
+    segment.remainingBounces = traceDepth;
+    segment.insideRefractiveObject = false;
+#else
+    float subpixel_side = 1.0f / (float)ANTIALIAS_SAMPLE_SIDE;
+    thrust::default_random_engine rng = makeSeededRandomEngine(iter,  x + y * cam.resolution.x, traceDepth);
+    thrust::uniform_real_distribution<float> u(0, subpixel_side);
+    for (int i = 0; i < ANTIALIAS_SAMPLE_SIDE; i++) {
+      for (int j = 0; j < ANTIALIAS_SAMPLE_SIDE; j++) {
+        int index = (x + (y * cam.resolution.x)) * ANTIALIAS_SAMPLE_SIDE * ANTIALIAS_SAMPLE_SIDE + ANTIALIAS_SAMPLE_SIDE * j + i;
+        PathSegment & segment = pathSegments[index];
+        segment.ray.origin = cam.position;
+        segment.color = glm::vec3(1.0f, 1.0f, 1.0f);
+
+        segment.ray.direction = glm::normalize(cam.view 
+          - cam.right * cam.pixelLength.x * ((float)x - 0.5f + i * subpixel_side + u(rng) - (float)cam.resolution.x * 0.5f)
+          - cam.up * cam.pixelLength.y * ((float)y - 0.5f + j * subpixel_side + u(rng) - (float)cam.resolution.y * 0.5f)
+          );
+
+        segment.pixelIndex = index;
+        segment.remainingBounces = traceDepth;
+        segment.insideRefractiveObject = false;
+      }
+    }
+#endif
+  }
 }
 
-// TODO:
-// computeIntersections handles generating ray intersections ONLY.
-// Generating new rays is handled in your shader(s).
-// Feel free to modify the code below.
-__global__ void computeIntersections(
-	int depth
-	, int num_paths
-	, PathSegment * pathSegments
-	, Geom * geoms
-	, int geoms_size
-	, ShadeableIntersection * intersections
-	)
+__global__ void pathTraceSphereBox(
+  int depth
+  , int num_paths
+  , PathSegment * pathSegments
+  , Geom * geoms
+  , int geoms_size
+  , Mesh * meshes
+  , Triangle * triangles)
 {
-	int path_index = blockIdx.x * blockDim.x + threadIdx.x;
-
-	if (path_index < num_paths)
-	{
-		PathSegment pathSegment = pathSegments[path_index];
-
-		float t;
-		glm::vec3 intersect_point;
-		glm::vec3 normal;
-		float t_min = FLT_MAX;
-		int hit_geom_index = -1;
-		bool outside = true;
-
-		glm::vec3 tmp_intersect;
-		glm::vec3 tmp_normal;
-
-		// naive parse through global geoms
-
-		for (int i = 0; i < geoms_size; i++)
-		{
-			Geom & geom = geoms[i];
-
-			if (geom.type == CUBE)
-			{
-				t = boxIntersectionTest(geom, pathSegment.ray, tmp_intersect, tmp_normal, outside);
-			}
-			else if (geom.type == SPHERE)
-			{
-				t = sphereIntersectionTest(geom, pathSegment.ray, tmp_intersect, tmp_normal, outside);
-			}
-			// TODO: add more intersection tests here... triangle? metaball? CSG?
-
-			// Compute the minimum t from the intersection tests to determine what
-			// scene geometry object was hit first.
-			if (t > 0.0f && t_min > t)
-			{
-				t_min = t;
-				hit_geom_index = i;
-				intersect_point = tmp_intersect;
-				normal = tmp_normal;
-			}
-		}
-
-		if (hit_geom_index == -1)
-		{
-			intersections[path_index].t = -1.0f;
-		}
-		else
-		{
-			//The ray hits something
-			intersections[path_index].t = t_min;
-			intersections[path_index].materialId = geoms[hit_geom_index].materialid;
-			intersections[path_index].surfaceNormal = normal;
-		}
-	}
+  int path_index = blockIdx.x * blockDim.x + threadIdx.x;
+
+  if (path_index >= num_paths || pathSegments[path_index].remainingBounces == 0) {
+    return;
+  }
+
+  if (path_index < num_paths)
+  {
+    PathSegment pathSegment = pathSegments[path_index];
+
+    float t;
+    glm::vec3 intersect_point;
+    glm::vec3 normal;
+    float t_min = FLT_MAX;
+    int hit_geom_index = -1;
+    bool outside = true;
+
+    glm::vec3 tmp_intersect;
+    glm::vec3 tmp_normal;
+
+    // naive parse through global geoms
+
+    for (int i = 0; i < geoms_size; i++)
+    {
+      Geom & geom = geoms[i];
+
+      if (geom.type == CUBE)
+      {
+        t = boxIntersectionTest(geom, pathSegment.ray, tmp_intersect, tmp_normal, outside);
+      }
+      else if (geom.type == SPHERE)
+      {
+        t = sphereIntersectionTest(geom, pathSegment.ray, tmp_intersect, tmp_normal, outside);
+      }
+#if CALCMESHSEPARATELY == 0
+      else if (geom.type == MESH)
+      {
+        t = meshIntersectionTest(geom, meshes, triangles, pathSegment.ray, tmp_intersect, tmp_normal, outside);
+      }
+#endif
+      // TODO: add more intersection tests here... triangle? metaball? CSG?
+      // Compute the minimum t from the intersection tests to determine what
+      // scene geometry object was hit first.
+      if (t > 1e-3 && t_min > t)
+      {
+        t_min = t;
+        hit_geom_index = i;
+        intersect_point = tmp_intersect;
+        normal = tmp_normal;
+      }
+    }
+
+    if (hit_geom_index == -1)
+    {
+      pathSegments[path_index].intersection.t = -1.0f;
+    }
+    else
+    {
+      //The ray hits something
+      pathSegments[path_index].intersection.t = t_min;
+      pathSegments[path_index].intersection.materialId = geoms[hit_geom_index].materialid;
+      pathSegments[path_index].intersection.surfaceNormal = normal;
+    }
+  }
 }
 
-// LOOK: "fake" shader demonstrating what you might do with the info in
-// a ShadeableIntersection, as well as how to use thrust's random number
-// generator. Observe that since the thrust random number generator basically
-// adds "noise" to the iteration, the image should start off noisy and get
-// cleaner as more iterations are computed.
-//
-// Note that this shader does NOT do a BSDF evaluation!
-// Your shaders should handle that - this can allow techniques such as
-// bump mapping.
-__global__ void shadeFakeMaterial (
-  int iter
-  , int num_paths
-	, ShadeableIntersection * shadeableIntersections
-	, PathSegment * pathSegments
-	, Material * materials
-	)
-{
+__global__ void simpleBSDFShader(int iter, int num_paths, PathSegment * pathSegments, Material * materials) {
   int idx = blockIdx.x * blockDim.x + threadIdx.x;
-  if (idx < num_paths)
-  {
-    ShadeableIntersection intersection = shadeableIntersections[idx];
-    if (intersection.t > 0.0f) { // if the intersection exists...
-      // Set up the RNG
-      // LOOK: this is how you use thrust's RNG! Please look at
-      // makeSeededRandomEngine as well.
-      thrust::default_random_engine rng = makeSeededRandomEngine(iter, idx, 0);
-      thrust::uniform_real_distribution<float> u01(0, 1);
+  if (idx >= num_paths) {
+    return;
+  }
+  ShadeableIntersection & intersection = pathSegments[idx].intersection;
+  if (pathSegments[idx].remainingBounces >= 0) {
+    // If intersection exists
+    if (intersection.t > 0.0f) {
 
       Material material = materials[intersection.materialId];
-      glm::vec3 materialColor = material.color;
-
-      // If the material indicates that the object was a light, "light" the ray
+      // Hit a light
       if (material.emittance > 0.0f) {
-        pathSegments[idx].color *= (materialColor * material.emittance);
+        pathSegments[idx].color *= material.emittance * material.color;
+        pathSegments[idx].remainingBounces = -1;
       }
-      // Otherwise, do some pseudo-lighting computation. This is actually more
-      // like what you would expect from shading in a rasterizer like OpenGL.
-      // TODO: replace this! you should be able to start with basically a one-liner
+      // Bouncing off a nonlight
       else {
-        float lightTerm = glm::dot(intersection.surfaceNormal, glm::vec3(0.0f, 1.0f, 0.0f));
-        pathSegments[idx].color *= (materialColor * lightTerm) * 0.3f + ((1.0f - intersection.t * 0.02f) * materialColor) * 0.7f;
-        pathSegments[idx].color *= u01(rng); // apply some noise because why not
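+        // Named lvalue: scatterRay takes the RNG by non-const reference
+        thrust::default_random_engine rng = makeSeededRandomEngine(iter, idx, 0);
+        scatterRay(
+          pathSegments[idx],
+          pathSegments[idx].ray.origin + intersection.t * pathSegments[idx].ray.direction,
+          intersection.surfaceNormal,
+          material, rng);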
+        pathSegments[idx].remainingBounces--;
+        if (pathSegments[idx].remainingBounces == -1) {
+          pathSegments[idx].color = glm::vec3(0.0f);
+        }
       }
-    // If there was no intersection, color the ray black.
-    // Lots of renderers use 4 channel color, RGBA, where A = alpha, often
-    // used for opacity, in which case they can indicate "no opacity".
-    // This can be useful for post-processing and image compositing.
-    } else {
+    }
+    else {
       pathSegments[idx].color = glm::vec3(0.0f);
+      pathSegments[idx].remainingBounces = -1;
     }
   }
 }
 
-// Add the current iteration's output to the overall image
-__global__ void finalGather(int nPaths, glm::vec3 * image, PathSegment * iterationPaths)
-{
-	int index = (blockIdx.x * blockDim.x) + threadIdx.x;
+// Accumulate color only from terminated paths (remainingBounces < 0)
+__global__ void partialGather(int nPaths, glm::vec3 * image, PathSegment * iterationPaths) {
+  int index = (blockIdx.x * blockDim.x) + threadIdx.x;
 
-	if (index < nPaths)
-	{
-		PathSegment iterationPath = iterationPaths[index];
-		image[iterationPath.pixelIndex] += iterationPath.color;
-	}
+  if (index < nPaths)
+  {
+    PathSegment iterationPath = iterationPaths[index];
+    if (iterationPath.remainingBounces < 0) {
+      image[iterationPath.pixelIndex] += iterationPath.color;
+    }
+  }
 }
 
+// Average each pixel's antialiasing subpixel samples into the final image
+__global__ void kernAntialiasGather(int pixelcount, glm::vec3 * final_image, glm::vec3 * image) {
+  int index = (blockIdx.x * blockDim.x) + threadIdx.x;
+
+  if (index < pixelcount)
+  {
+    int subpixel_count = ANTIALIAS_SAMPLE_SIDE * ANTIALIAS_SAMPLE_SIDE;
+    int base_index = index * subpixel_count;
+    glm::vec3 sumColors(0.0f, 0.0f, 0.0f);
+    for (int i = 0; i < subpixel_count; i++) {
+      sumColors += image[base_index + i];
+    }
+    final_image[index] = sumColors / (float)subpixel_count;
+  }
+}
+
+
+__global__ void kernCalculateMeshBoundingBoxIntersections(int nPaths, int nMeshes, PathSegment * iterationPaths,
+  Geom * meshgeoms, Mesh * meshes, glm::ivec3 * intersections) {
+  int pathIndex = (blockIdx.x * blockDim.x) + threadIdx.x;
+  int meshIndex = (blockIdx.y * blockDim.y) + threadIdx.y;
+  int gridIndex = (blockIdx.z * blockDim.z) + threadIdx.z;
+  if (pathIndex < nPaths && meshIndex < nMeshes && gridIndex < GRID_FULL) {
+    int idx = pathIndex * nMeshes * GRID_FULL + meshIndex * GRID_FULL + gridIndex;
+    Geom & meshGeom = meshgeoms[meshIndex];
+    Mesh & mesh = meshes[meshGeom.meshid];
+    PathSegment & path = iterationPaths[pathIndex];
+    Ray r = path.ray;
+
+    glm::vec3 gridSize = (mesh.boxMax - mesh.boxMin) / (float)GRID_SIZE;
+    glm::vec3 gridMin(
+      gridSize.x * (gridIndex % GRID_SIZE),
+      gridSize.y * ((gridIndex / GRID_SIZE) % GRID_SIZE),
+      gridSize.z * (gridIndex / (GRID_SIZE * GRID_SIZE)));
+    gridMin += mesh.boxMin;
+    glm::vec3 gridMax = gridMin + gridSize;
+
+    glm::vec3 ro = multiplyMV(meshGeom.inverseTransform, glm::vec4(r.origin, 1.0f));
+    glm::vec3 rd = glm::normalize(multiplyMV(meshGeom.inverseTransform, glm::vec4(r.direction, 0.0f)));
+    Ray rt;
+    rt.origin = ro;
+    rt.direction = rd;
+
+    float tmin, tmax;
+    glm::vec3 tmin_n, tmax_n;
+    if (orientedBoxIntersection(gridMin, gridMax, rt, tmin, tmax, tmin_n, tmax_n)) {
+      intersections[idx] = glm::ivec3(pathIndex, meshIndex, gridIndex);
+    }
+    else {
+      intersections[idx] = glm::ivec3(-1, -1, -1);
+    }
+  }
+}
+
+__global__ void kernPathTraceMesh(int nIntersections, PathSegment * iterationPaths,
+  Geom * meshgeoms, Mesh * meshes, glm::ivec3 * intersections, Triangle * triangles,
+  ShadeableIntersection * intersection_out) {
+  int index = (blockIdx.x * blockDim.x) + threadIdx.x;
+  if (index < nIntersections) {
+    glm::vec3 tmp_intersect;
+    glm::ivec3 idx = intersections[index];
+    bool outside;
+    intersection_out[index].t = meshIntersectionTest(meshgeoms[idx.y], meshes,
+      triangles, iterationPaths[idx.x].ray, 
+      tmp_intersect, intersection_out[index].surfaceNormal, outside, idx.z, 
+      iterationPaths[idx.x].insideRefractiveObject);
+    intersection_out[index].materialId = meshgeoms[idx.y].materialid;
+  }
+}
+
+__global__ void kernTakeMeshIntersection(int nIntersections, PathSegment * iterationPaths,
+  glm::ivec3 * intersectionKeys, ShadeableIntersection * intersectionValues) {
+  int index = (blockIdx.x * blockDim.x) + threadIdx.x;
+  if (index < nIntersections) {
+    ShadeableIntersection & meshIntersection = intersectionValues[index];
+    if (meshIntersection.t > EPSILON) {
+      int pathIndex = intersectionKeys[index].x;
+      ShadeableIntersection & intersection = iterationPaths[pathIndex].intersection;
+      // Keep the mesh hit if the path has no hit yet or the mesh hit is closer
+      if (intersection.t < EPSILON || intersection.t > meshIntersection.t) {
+        intersection = meshIntersection;
+      }
+    }
+  }
+}
+
+struct HasNoBounces {
+  __host__ __device__ bool operator() (const PathSegment & path) {
+    return path.remainingBounces < 0;
+  }
+};
+
+struct SortByMaterial {
+  __host__ __device__ bool operator() (const PathSegment & p1, const PathSegment & p2) {
+    return p1.intersection.materialId < p2.intersection.materialId;
+  }
+};
+
+struct IsNonintersection {
+  __host__ __device__ bool operator() (const glm::ivec3 v) {
+    return v.x == -1 && v.y == -1 && v.z == -1;
+  }
+};
+
+struct TakeMinIntersection {
+  __host__ __device__ ShadeableIntersection operator() (ShadeableIntersection i1, ShadeableIntersection i2) {
+    if (i1.t < EPSILON) {
+      return i2;
+    }
+    if (i2.t < EPSILON) {
+      return i1;
+    }
+    return (i1.t < i2.t) ? i1 : i2;
+  }
+};
+
+struct ComparePathKey {
+  __host__ __device__ bool operator() (const glm::ivec3 v1, const glm::ivec3 v2) {
+    return v1.x == v2.x;
+  }
+};
+
 /**
  * Wrapper for the __global__ call that sets up the kernel calls and does a ton
  * of memory management
  */
 void pathtrace(uchar4 *pbo, int frame, int iter) {
-    const int traceDepth = hst_scene->state.traceDepth;
-    const Camera &cam = hst_scene->state.camera;
-    const int pixelcount = cam.resolution.x * cam.resolution.y;
-
-	// 2D block for generating ray from camera
-    const dim3 blockSize2d(8, 8);
-    const dim3 blocksPerGrid2d(
-            (cam.resolution.x + blockSize2d.x - 1) / blockSize2d.x,
-            (cam.resolution.y + blockSize2d.y - 1) / blockSize2d.y);
-
-	// 1D block for path tracing
-	const int blockSize1d = 128;
-
-    ///////////////////////////////////////////////////////////////////////////
-
-    // Recap:
-    // * Initialize array of path rays (using rays that come out of the camera)
-    //   * You can pass the Camera object to that kernel.
-    //   * Each path ray must carry at minimum a (ray, color) pair,
-    //   * where color starts as the multiplicative identity, white = (1, 1, 1).
-    //   * This has already been done for you.
-    // * For each depth:
-    //   * Compute an intersection in the scene for each path ray.
-    //     A very naive version of this has been implemented for you, but feel
-    //     free to add more primitives and/or a better algorithm.
-    //     Currently, intersection distance is recorded as a parametric distance,
-    //     t, or a "distance along the ray." t = -1.0 indicates no intersection.
-    //     * Color is attenuated (multiplied) by reflections off of any object
-    //   * TODO: Stream compact away all of the terminated paths.
-    //     You may use either your implementation or `thrust::remove_if` or its
-    //     cousins.
-    //     * Note that you can't really use a 2D kernel launch any more - switch
-    //       to 1D.
-    //   * TODO: Shade the rays that intersected something or didn't bottom out.
-    //     That is, color the ray by performing a color computation according
-    //     to the shader, then generate a new ray to continue the ray path.
-    //     We recommend just updating the ray's PathSegment in place.
-    //     Note that this step may come before or after stream compaction,
-    //     since some shaders you write may also cause a path to terminate.
-    // * Finally, add this iteration's results to the image. This has been done
-    //   for you.
-
-    // TODO: perform one iteration of path tracing
-
-	generateRayFromCamera <<<blocksPerGrid2d, blockSize2d >>>(cam, iter, traceDepth, dev_paths);
-	checkCUDAError("generate camera ray");
-
-	int depth = 0;
-	PathSegment* dev_path_end = dev_paths + pixelcount;
-	int num_paths = dev_path_end - dev_paths;
-
-	// --- PathSegment Tracing Stage ---
-	// Shoot ray into scene, bounce between objects, push shading chunks
+  const int traceDepth = hst_scene->state.traceDepth;
+  const Camera &cam = hst_scene->state.camera;
+#if ANTIALIAS_SAMPLE_SIDE == 0
+  const int pixelcount = cam.resolution.x * cam.resolution.y;
+  const int actual_pixelcount = pixelcount;
+#else
+  const int actual_pixelcount = cam.resolution.x * cam.resolution.y;
+  const int pixelcount = actual_pixelcount * ANTIALIAS_SAMPLE_SIDE * ANTIALIAS_SAMPLE_SIDE;
+#endif
+
+  // 2D block for generating ray from camera
+  const dim3 blockSize2d(8, 8);
+  const dim3 blocksPerGrid2d(
+    (cam.resolution.x + blockSize2d.x - 1) / blockSize2d.x,
+    (cam.resolution.y + blockSize2d.y - 1) / blockSize2d.y);
+
+  // 1D block for path tracing
+  const int blockSize1d = 128;
+
+  ///////////////////////////////////////////////////////////////////////////
+
+  // Recap:
+  // * Initialize array of path rays (using rays that come out of the camera)
+  //   * You can pass the Camera object to that kernel.
+  //   * Each path ray must carry at minimum a (ray, color) pair,
+  //   * where color starts as the multiplicative identity, white = (1, 1, 1).
+  //   * This has already been done for you.
+  // * For each depth:
+  //   * Compute an intersection in the scene for each path ray.
+  //     A very naive version of this has been implemented for you, but feel
+  //     free to add more primitives and/or a better algorithm.
+  //     Currently, intersection distance is recorded as a parametric distance,
+  //     t, or a "distance along the ray." t = -1.0 indicates no intersection.
+  //     * Color is attenuated (multiplied) by reflections off of any object
+  //   * TODO: Stream compact away all of the terminated paths.
+  //     You may use either your implementation or `thrust::remove_if` or its
+  //     cousins.
+  //     * Note that you can't really use a 2D kernel launch any more - switch
+  //       to 1D.
+  //   * TODO: Shade the rays that intersected something or didn't bottom out.
+  //     That is, color the ray by performing a color computation according
+  //     to the shader, then generate a new ray to continue the ray path.
+  //     We recommend just updating the ray's PathSegment in place.
+  //     Note that this step may come before or after stream compaction,
+  //     since some shaders you write may also cause a path to terminate.
+  // * Finally, add this iteration's results to the image. This has been done
+  //   for you.
+
+  // Start from fresh camera rays, or from the cached first-bounce paths when caching is enabled.
+  int depth = 0;
+#if (CACHEFIRSTBOUNCE == 1 && ANTIALIAS_SAMPLE_SIDE == 0)
+  if (iter > 1) {
+    cudaMemcpy(dev_paths, dev_cached_paths, pixelcount * sizeof(PathSegment), cudaMemcpyDeviceToDevice);
+  }
+  else {
+#endif
+    generateRayFromCamera << <blocksPerGrid2d, blockSize2d >> >(cam, iter, traceDepth, dev_paths);
+    checkCUDAError("generate camera ray");
+#if (CACHEFIRSTBOUNCE == 1 && ANTIALIAS_SAMPLE_SIDE == 0)
+  }
+#endif
+
+
+  PathSegment* dev_path_end = dev_paths + pixelcount;
+  int num_paths = dev_path_end - dev_paths;
+  int num_pathsInFlight = num_paths;
+
+  thrust::device_ptr<PathSegment> dev_thrust_paths(dev_paths);
+
+  // --- PathSegment Tracing Stage ---
+  // Shoot ray into scene, bounce between objects, push shading chunks
 
   bool iterationComplete = false;
-	while (!iterationComplete) {
-
-	// clean shading chunks
-	cudaMemset(dev_intersections, 0, pixelcount * sizeof(ShadeableIntersection));
-
-	// tracing
-	dim3 numblocksPathSegmentTracing = (num_paths + blockSize1d - 1) / blockSize1d;
-	computeIntersections <<<numblocksPathSegmentTracing, blockSize1d>>> (
-		depth
-		, num_paths
-		, dev_paths
-		, dev_geoms
-		, hst_scene->geoms.size()
-		, dev_intersections
-		);
-	checkCUDAError("trace one bounce");
-	cudaDeviceSynchronize();
-	depth++;
-
-
-	// TODO:
-	// --- Shading Stage ---
-	// Shade path segments based on intersections and generate new rays by
-  // evaluating the BSDF.
-  // Start off with just a big kernel that handles all the different
-  // materials you have in the scenefile.
-  // TODO: compare between directly shading the path segments and shading
-  // path segments that have been reshuffled to be contiguous in memory.
-
-  shadeFakeMaterial<<<numblocksPathSegmentTracing, blockSize1d>>> (
-    iter,
-    num_paths,
-    dev_intersections,
-    dev_paths,
-    dev_materials
-  );
-  iterationComplete = true; // TODO: should be based off stream compaction results.
-	}
+  while (!iterationComplete) {
+    printf("Depth %d: %d paths\n", depth, num_pathsInFlight);
+    dim3 numblocksPathSegmentTracing = (num_pathsInFlight + blockSize1d - 1) / blockSize1d;
+
+    // tracing
+#if (CACHEFIRSTBOUNCE == 1 && ANTIALIAS_SAMPLE_SIDE == 0)
+    if (iter > 1 || depth > 0) {
+#endif
+
+      // spheres and boxes
+      pathTraceSphereBox << <numblocksPathSegmentTracing, blockSize1d >> > (
+        depth
+        , num_pathsInFlight
+        , dev_paths
+        , dev_geoms
+        , numGeoms
+        , dev_meshes
+        , dev_triangles
+        );
+      checkCUDAError("trace one bounce");
+      cudaDeviceSynchronize();
+#if CALCMESHSEPARATELY == 1
+      // meshes
+      dim3 blockSize3d(8, 8, 8);
+      if (hst_scene->meshGeoms.size() > 0) {
+        dim3 meshBoxTracing(
+          (num_pathsInFlight + blockSize3d.x - 1) / blockSize3d.x,
+          (hst_scene->meshGeoms.size() + blockSize3d.y - 1) / blockSize3d.y,
+          (GRID_FULL + blockSize3d.z - 1) / blockSize3d.z
+          );
+        kernCalculateMeshBoundingBoxIntersections << <meshBoxTracing, blockSize3d >> >(
+          num_pathsInFlight, hst_scene->meshGeoms.size(), dev_paths, dev_meshgeoms, dev_meshes, dev_path_mesh_intersections);
+        checkCUDAError("calculate bounding box intersections");
+        cudaDeviceSynchronize();
+        thrust::device_ptr<glm::ivec3> dev_thrust_path_mesh_intersections = thrust::device_pointer_cast(dev_path_mesh_intersections);
+        int numIntersections = num_pathsInFlight * hst_scene->meshGeoms.size() * GRID_FULL;
+        thrust::device_ptr<glm::ivec3> dev_thrust_path_mesh_intersections_end =
+          thrust::remove_if(dev_thrust_path_mesh_intersections, dev_thrust_path_mesh_intersections + numIntersections, IsNonintersection());
+        int numActualIntersections = dev_thrust_path_mesh_intersections_end - dev_thrust_path_mesh_intersections;
+        printf("Culled from %d to %d intersections\n", numIntersections, numActualIntersections);
+        dim3 meshTracing((numActualIntersections + blockSize1d - 1) / blockSize1d);
+        kernPathTraceMesh << <meshTracing, blockSize1d >> > (
+          numActualIntersections, dev_paths, dev_meshgeoms, dev_meshes, dev_path_mesh_intersections, dev_triangles,
+          dev_path_mesh_intersection_dists);
+        checkCUDAError("calculate ray-mesh intersections");
+        cudaDeviceSynchronize();
+
+        thrust::device_ptr<ShadeableIntersection> dev_thrust_path_mesh_intersection_dists =
+          thrust::device_pointer_cast(dev_path_mesh_intersection_dists);
+        thrust::device_ptr<glm::ivec3> dev_thrust_pm_intersection_out =
+          thrust::device_pointer_cast(dev_pm_intersection_out);
+        thrust::device_ptr<ShadeableIntersection> dev_thrust_pm_intersection_dists_out =
+          thrust::device_pointer_cast(dev_pm_intersection_dists_out);
+
+        thrust::pair<thrust::device_ptr<glm::ivec3>, thrust::device_ptr<ShadeableIntersection>> reductionResult =
+          thrust::reduce_by_key(dev_thrust_path_mesh_intersections,
+          dev_thrust_path_mesh_intersections + numActualIntersections,
+          dev_thrust_path_mesh_intersection_dists,
+          dev_thrust_pm_intersection_out,
+          dev_thrust_pm_intersection_dists_out,
+          ComparePathKey(),
+          TakeMinIntersection());
+        int numPathIntersections = reductionResult.first - dev_thrust_pm_intersection_out;
+        kernTakeMeshIntersection << <dim3((numPathIntersections + blockSize1d - 1) / blockSize1d), blockSize1d >> >(
+          numPathIntersections, dev_paths, dev_pm_intersection_out, dev_pm_intersection_dists_out);
+        checkCUDAError("get new ray-mesh intersections");
+        cudaDeviceSynchronize();
+      }
+#endif
+      
+#if (CACHEFIRSTBOUNCE == 1 && ANTIALIAS_SAMPLE_SIDE == 0)
+    }
+#endif
+
+#if (CACHEFIRSTBOUNCE == 1 && ANTIALIAS_SAMPLE_SIDE == 0)
+    // Cache first bounce
+    if (iter == 1 && depth == 0) {
+      cudaMemcpy(dev_cached_paths, dev_paths, pixelcount * sizeof(PathSegment), cudaMemcpyDeviceToDevice);
+    }
+#endif
+    depth++;
+
+    // TODO:
+    // --- Shading Stage ---
+    // Shade path segments based on intersections and generate new rays by
+    // evaluating the BSDF.
+    // Start off with just a big kernel that handles all the different
+    // materials you have in the scenefile.
+    // TODO: compare between directly shading the path segments and shading
+    // path segments that have been reshuffled to be contiguous in memory.
+#if GROUPBYMAT == 1
+    thrust::sort(dev_thrust_paths, dev_thrust_paths + num_pathsInFlight, SortByMaterial());
+#endif
+
+    simpleBSDFShader << <numblocksPathSegmentTracing, blockSize1d >> > (
+      iter,
+      num_pathsInFlight,
+      dev_paths,
+      dev_materials
+      );
+
+    dim3 numBlocksPixels = (num_pathsInFlight + blockSize1d - 1) / blockSize1d;
+    partialGather << <numBlocksPixels, blockSize1d >> >(num_pathsInFlight, dev_image, dev_paths);
+
+    thrust::device_ptr<PathSegment> dev_thrust_newPathEnd;
+    dev_thrust_newPathEnd = thrust::remove_if(
+      thrust::device,
+      dev_thrust_paths,
+      dev_thrust_paths + num_pathsInFlight,
+      HasNoBounces());
+    num_pathsInFlight = dev_thrust_newPathEnd - dev_thrust_paths;
+
+    iterationComplete = depth >= traceDepth || num_pathsInFlight == 0;
+  }
 
   // Assemble this iteration and apply it to the image
-  dim3 numBlocksPixels = (pixelcount + blockSize1d - 1) / blockSize1d;
-	finalGather<<<numBlocksPixels, blockSize1d>>>(num_paths, dev_image, dev_paths);
+#if ANTIALIAS_SAMPLE_SIDE != 0
+  dim3 numBlocksPixels = (actual_pixelcount + blockSize1d - 1) / blockSize1d;
+  kernAntialiasGather << <numBlocksPixels, blockSize1d >> >(actual_pixelcount, dev_final_image, dev_image);
+#endif
+
 
-    ///////////////////////////////////////////////////////////////////////////
+  ///////////////////////////////////////////////////////////////////////////
 
-    // Send results to OpenGL buffer for rendering
-    sendImageToPBO<<<blocksPerGrid2d, blockSize2d>>>(pbo, cam.resolution, iter, dev_image);
+  // Send results to OpenGL buffer for rendering
+  sendImageToPBO << <blocksPerGrid2d, blockSize2d >> >(pbo, cam.resolution, iter, dev_final_image);
 
-    // Retrieve image from GPU
-    cudaMemcpy(hst_scene->state.image.data(), dev_image,
-            pixelcount * sizeof(glm::vec3), cudaMemcpyDeviceToHost);
+  // Retrieve image from GPU
+  cudaMemcpy(hst_scene->state.image.data(), dev_final_image,
+    actual_pixelcount * sizeof(glm::vec3), cudaMemcpyDeviceToHost);
 
-    checkCUDAError("pathtrace");
+  checkCUDAError("pathtrace");
 }
diff --git a/src/scene.cpp b/src/scene.cpp
index cbae043..198d843 100644
--- a/src/scene.cpp
+++ b/src/scene.cpp
@@ -4,185 +4,293 @@
 #include <glm/gtc/matrix_inverse.hpp>
 #include <glm/gtx/string_cast.hpp>
 
+#define TINYOBJLOADER_IMPLEMENTATION
+#include "tiny_obj_loader.h"
+
+struct SortGeomByType {
+  bool operator()(const Geom & g1, const Geom & g2) const {
+    return g1.type < g2.type;
+  }
+};
+
 Scene::Scene(string filename) {
-    cout << "Reading scene from " << filename << " ..." << endl;
-    cout << " " << endl;
-    char* fname = (char*)filename.c_str();
-    fp_in.open(fname);
-    if (!fp_in.is_open()) {
-        cout << "Error reading from file - aborting!" << endl;
-        throw;
-    }
-    while (fp_in.good()) {
-        string line;
-        utilityCore::safeGetline(fp_in, line);
-        if (!line.empty()) {
-            vector<string> tokens = utilityCore::tokenizeString(line);
-            if (strcmp(tokens[0].c_str(), "MATERIAL") == 0) {
-                loadMaterial(tokens[1]);
-                cout << " " << endl;
-            } else if (strcmp(tokens[0].c_str(), "OBJECT") == 0) {
-                loadGeom(tokens[1]);
-                cout << " " << endl;
-            } else if (strcmp(tokens[0].c_str(), "CAMERA") == 0) {
-                loadCamera();
-                cout << " " << endl;
-            }
+	cout << "Reading scene from " << filename << " ..." << endl;
+	cout << " " << endl;
+	char* fname = (char*)filename.c_str();
+	fp_in.open(fname);
+	if (!fp_in.is_open()) {
+		cout << "Error reading from file - aborting!" << endl;
+		throw;
+	}
+	while (fp_in.good()) {
+		string line;
+		utilityCore::safeGetline(fp_in, line);
+		if (!line.empty()) {
+			vector<string> tokens = utilityCore::tokenizeString(line);
+			if (strcmp(tokens[0].c_str(), "MATERIAL") == 0) {
+				loadMaterial(tokens[1]);
+				cout << " " << endl;
+			}
+			else if (strcmp(tokens[0].c_str(), "OBJECT") == 0) {
+				loadGeom(tokens[1]);
+				cout << " " << endl;
+			}
+			else if (strcmp(tokens[0].c_str(), "CAMERA") == 0) {
+				loadCamera();
+				cout << " " << endl;
+			}
+      else if (strcmp(tokens[0].c_str(), "MESH") == 0) {
+        loadMesh(tokens[1]);
+        cout << " " << endl;
+      }
+		}
+	}
+}
+
+int Scene::loadMesh(string meshid) {
+  int id = atoi(meshid.c_str());
+  if (id != meshes.size()) {
+    cout << "ERROR: MESH ID does not match expected number of meshes" << endl;
+    return -1;
+  }
+  else {
+    cout << "Loading Mesh " << id << "..." << endl;
+    string line;
+    utilityCore::safeGetline(fp_in, line);
+    if (!line.empty() && fp_in.good()) {
+      tinyobj::attrib_t attrib;
+      vector<tinyobj::shape_t> shapes;
+      vector<tinyobj::material_t> materials;
+      string err;
+      cout << "Loading mesh file: " << line << endl;
+      bool ret = tinyobj::LoadObj(&attrib, &shapes, &materials, &err, line.c_str());
+      if (!err.empty()) {
+        cout << "Error: " << err << endl;
+      }
+      if (!ret) {
+        cout << "Could not load!" << endl;
+      }
+      int startCount = triangles.size();
+
+      for (int i = 0; i < shapes.size(); i++) {
+        int index_offset = 0;
+        for (int j = 0; j < shapes[i].mesh.num_face_vertices.size(); j++) {
+          int fv = shapes[i].mesh.num_face_vertices[j];
+          if (fv != 3) {
+            cout << "Non-triangular face!" << endl;
+            index_offset += fv;
+            continue;
+          }
+          glm::vec3 verts[3];
+          for (int k = 0; k < 3; k++) {
+            tinyobj::index_t idx = shapes[i].mesh.indices[index_offset + k];
+            verts[k] = glm::vec3(
+              attrib.vertices[3 * idx.vertex_index + 0],
+              attrib.vertices[3 * idx.vertex_index + 1],
+              attrib.vertices[3 * idx.vertex_index + 2]
+            );
+          }
+          index_offset += fv;
+          triangles.push_back(Triangle(verts[0], verts[1], verts[2]));
         }
+      }
+      meshes.push_back(Mesh(startCount, static_cast<int>(triangles.size()), triangles, triangles));
+      return 1;
+    }
+    else {
+      cout << "Could not read mesh obj file!" << endl;
+      return -1;
     }
+  }
 }
 
 int Scene::loadGeom(string objectid) {
-    int id = atoi(objectid.c_str());
-    if (id != geoms.size()) {
-        cout << "ERROR: OBJECT ID does not match expected number of geoms" << endl;
-        return -1;
-    } else {
-        cout << "Loading Geom " << id << "..." << endl;
-        Geom newGeom;
-        string line;
-
-        //load object type
-        utilityCore::safeGetline(fp_in, line);
-        if (!line.empty() && fp_in.good()) {
-            if (strcmp(line.c_str(), "sphere") == 0) {
-                cout << "Creating new sphere..." << endl;
-                newGeom.type = SPHERE;
-            } else if (strcmp(line.c_str(), "cube") == 0) {
-                cout << "Creating new cube..." << endl;
-                newGeom.type = CUBE;
-            }
-        }
+	int id = atoi(objectid.c_str());
+	if (id != geoms.size() + meshGeoms.size()) {
+		cout << "ERROR: OBJECT ID does not match expected number of geoms" << endl;
+		return -1;
+	}
+	else {
+		cout << "Loading Geom " << id << "..." << endl;
+		Geom newGeom;
+		string line;
 
-        //link material
-        utilityCore::safeGetline(fp_in, line);
-        if (!line.empty() && fp_in.good()) {
-            vector<string> tokens = utilityCore::tokenizeString(line);
-            newGeom.materialid = atoi(tokens[1].c_str());
-            cout << "Connecting Geom " << objectid << " to Material " << newGeom.materialid << "..." << endl;
-        }
+		//load object type
+		utilityCore::safeGetline(fp_in, line);
+		if (!line.empty() && fp_in.good()) {
+			if (strcmp(line.c_str(), "sphere") == 0) {
+				cout << "Creating new sphere..." << endl;
+				newGeom.type = SPHERE;
+			}
+			else if (strcmp(line.c_str(), "cube") == 0) {
+				cout << "Creating new cube..." << endl;
+				newGeom.type = CUBE;
+			}
+			else if (strcmp(line.c_str(), "mesh") == 0) {
+				cout << "Creating new mesh..." << endl;
+				newGeom.type = MESH;
+			}
+		}
 
-        //load transformations
-        utilityCore::safeGetline(fp_in, line);
-        while (!line.empty() && fp_in.good()) {
-            vector<string> tokens = utilityCore::tokenizeString(line);
-
-            //load tranformations
-            if (strcmp(tokens[0].c_str(), "TRANS") == 0) {
-                newGeom.translation = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
-            } else if (strcmp(tokens[0].c_str(), "ROTAT") == 0) {
-                newGeom.rotation = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
-            } else if (strcmp(tokens[0].c_str(), "SCALE") == 0) {
-                newGeom.scale = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
-            }
-
-            utilityCore::safeGetline(fp_in, line);
-        }
+		// load mesh ID
+		if (newGeom.type == MESH) {
+			utilityCore::safeGetline(fp_in, line);
+			if (!line.empty() && fp_in.good()) {
+				vector<string> tokens = utilityCore::tokenizeString(line);
+				newGeom.meshid = atoi(tokens[1].c_str());
+				cout << "Connecting Geom " << objectid << " to Mesh " << newGeom.meshid << "..." << endl;
+			}
+		}
+
+		//link material
+		utilityCore::safeGetline(fp_in, line);
+		if (!line.empty() && fp_in.good()) {
+			vector<string> tokens = utilityCore::tokenizeString(line);
+			newGeom.materialid = atoi(tokens[1].c_str());
+			cout << "Connecting Geom " << objectid << " to Material " << newGeom.materialid << "..." << endl;
+		}
+
+		//load transformations
+		utilityCore::safeGetline(fp_in, line);
+		while (!line.empty() && fp_in.good()) {
+			vector<string> tokens = utilityCore::tokenizeString(line);
 
-        newGeom.transform = utilityCore::buildTransformationMatrix(
-                newGeom.translation, newGeom.rotation, newGeom.scale);
-        newGeom.inverseTransform = glm::inverse(newGeom.transform);
-        newGeom.invTranspose = glm::inverseTranspose(newGeom.transform);
+			//load transformations
+			if (strcmp(tokens[0].c_str(), "TRANS") == 0) {
+				newGeom.translation = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+			}
+			else if (strcmp(tokens[0].c_str(), "ROTAT") == 0) {
+				newGeom.rotation = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+			}
+			else if (strcmp(tokens[0].c_str(), "SCALE") == 0) {
+				newGeom.scale = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+			}
 
-        geoms.push_back(newGeom);
-        return 1;
+			utilityCore::safeGetline(fp_in, line);
+		}
+
+		newGeom.transform = utilityCore::buildTransformationMatrix(
+			newGeom.translation, newGeom.rotation, newGeom.scale);
+		newGeom.inverseTransform = glm::inverse(newGeom.transform);
+		newGeom.invTranspose = glm::inverseTranspose(newGeom.transform);
+
+    if (newGeom.type == MESH) {
+      meshGeoms.push_back(newGeom);
     }
+    else {
+      geoms.push_back(newGeom);
+    }
+		return 1;
+	}
 }
 
 int Scene::loadCamera() {
-    cout << "Loading Camera ..." << endl;
-    RenderState &state = this->state;
-    Camera &camera = state.camera;
-    float fovy;
-
-    //load static properties
-    for (int i = 0; i < 5; i++) {
-        string line;
-        utilityCore::safeGetline(fp_in, line);
-        vector<string> tokens = utilityCore::tokenizeString(line);
-        if (strcmp(tokens[0].c_str(), "RES") == 0) {
-            camera.resolution.x = atoi(tokens[1].c_str());
-            camera.resolution.y = atoi(tokens[2].c_str());
-        } else if (strcmp(tokens[0].c_str(), "FOVY") == 0) {
-            fovy = atof(tokens[1].c_str());
-        } else if (strcmp(tokens[0].c_str(), "ITERATIONS") == 0) {
-            state.iterations = atoi(tokens[1].c_str());
-        } else if (strcmp(tokens[0].c_str(), "DEPTH") == 0) {
-            state.traceDepth = atoi(tokens[1].c_str());
-        } else if (strcmp(tokens[0].c_str(), "FILE") == 0) {
-            state.imageName = tokens[1];
-        }
-    }
+	cout << "Loading Camera ..." << endl;
+	RenderState &state = this->state;
+	Camera &camera = state.camera;
+	float fovy;
 
-    string line;
-    utilityCore::safeGetline(fp_in, line);
-    while (!line.empty() && fp_in.good()) {
-        vector<string> tokens = utilityCore::tokenizeString(line);
-        if (strcmp(tokens[0].c_str(), "EYE") == 0) {
-            camera.position = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
-        } else if (strcmp(tokens[0].c_str(), "LOOKAT") == 0) {
-            camera.lookAt = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
-        } else if (strcmp(tokens[0].c_str(), "UP") == 0) {
-            camera.up = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
-        }
+	//load static properties
+	for (int i = 0; i < 5; i++) {
+		string line;
+		utilityCore::safeGetline(fp_in, line);
+		vector<string> tokens = utilityCore::tokenizeString(line);
+		if (strcmp(tokens[0].c_str(), "RES") == 0) {
+			camera.resolution.x = atoi(tokens[1].c_str());
+			camera.resolution.y = atoi(tokens[2].c_str());
+		}
+		else if (strcmp(tokens[0].c_str(), "FOVY") == 0) {
+			fovy = atof(tokens[1].c_str());
+		}
+		else if (strcmp(tokens[0].c_str(), "ITERATIONS") == 0) {
+			state.iterations = atoi(tokens[1].c_str());
+		}
+		else if (strcmp(tokens[0].c_str(), "DEPTH") == 0) {
+			state.traceDepth = atoi(tokens[1].c_str());
+		}
+		else if (strcmp(tokens[0].c_str(), "FILE") == 0) {
+			state.imageName = tokens[1];
+		}
+	}
 
-        utilityCore::safeGetline(fp_in, line);
-    }
+	string line;
+	utilityCore::safeGetline(fp_in, line);
+	while (!line.empty() && fp_in.good()) {
+		vector<string> tokens = utilityCore::tokenizeString(line);
+		if (strcmp(tokens[0].c_str(), "EYE") == 0) {
+			camera.position = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+		}
+		else if (strcmp(tokens[0].c_str(), "LOOKAT") == 0) {
+			camera.lookAt = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+		}
+		else if (strcmp(tokens[0].c_str(), "UP") == 0) {
+			camera.up = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+		}
+
+		utilityCore::safeGetline(fp_in, line);
+	}
 
-    //calculate fov based on resolution
-    float yscaled = tan(fovy * (PI / 180));
-    float xscaled = (yscaled * camera.resolution.x) / camera.resolution.y;
-    float fovx = (atan(xscaled) * 180) / PI;
-    camera.fov = glm::vec2(fovx, fovy);
+	//calculate fov based on resolution
+	float yscaled = tan(fovy * (PI / 180));
+	float xscaled = (yscaled * camera.resolution.x) / camera.resolution.y;
+	float fovx = (atan(xscaled) * 180) / PI;
+	camera.fov = glm::vec2(fovx, fovy);
 
 	camera.right = glm::normalize(glm::cross(camera.view, camera.up));
 	camera.pixelLength = glm::vec2(2 * xscaled / (float)camera.resolution.x
-							, 2 * yscaled / (float)camera.resolution.y);
+		, 2 * yscaled / (float)camera.resolution.y);
 
-    camera.view = glm::normalize(camera.lookAt - camera.position);
+	camera.view = glm::normalize(camera.lookAt - camera.position);
 
-    //set up render camera stuff
-    int arraylen = camera.resolution.x * camera.resolution.y;
-    state.image.resize(arraylen);
-    std::fill(state.image.begin(), state.image.end(), glm::vec3());
+	//set up render camera stuff
+	int arraylen = camera.resolution.x * camera.resolution.y;
+	state.image.resize(arraylen);
+	std::fill(state.image.begin(), state.image.end(), glm::vec3());
 
-    cout << "Loaded camera!" << endl;
-    return 1;
+	cout << "Loaded camera!" << endl;
+	return 1;
 }
 
 int Scene::loadMaterial(string materialid) {
-    int id = atoi(materialid.c_str());
-    if (id != materials.size()) {
-        cout << "ERROR: MATERIAL ID does not match expected number of materials" << endl;
-        return -1;
-    } else {
-        cout << "Loading Material " << id << "..." << endl;
-        Material newMaterial;
-
-        //load static properties
-        for (int i = 0; i < 7; i++) {
-            string line;
-            utilityCore::safeGetline(fp_in, line);
-            vector<string> tokens = utilityCore::tokenizeString(line);
-            if (strcmp(tokens[0].c_str(), "RGB") == 0) {
-                glm::vec3 color( atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()) );
-                newMaterial.color = color;
-            } else if (strcmp(tokens[0].c_str(), "SPECEX") == 0) {
-                newMaterial.specular.exponent = atof(tokens[1].c_str());
-            } else if (strcmp(tokens[0].c_str(), "SPECRGB") == 0) {
-                glm::vec3 specColor(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
-                newMaterial.specular.color = specColor;
-            } else if (strcmp(tokens[0].c_str(), "REFL") == 0) {
-                newMaterial.hasReflective = atof(tokens[1].c_str());
-            } else if (strcmp(tokens[0].c_str(), "REFR") == 0) {
-                newMaterial.hasRefractive = atof(tokens[1].c_str());
-            } else if (strcmp(tokens[0].c_str(), "REFRIOR") == 0) {
-                newMaterial.indexOfRefraction = atof(tokens[1].c_str());
-            } else if (strcmp(tokens[0].c_str(), "EMITTANCE") == 0) {
-                newMaterial.emittance = atof(tokens[1].c_str());
-            }
-        }
-        materials.push_back(newMaterial);
-        return 1;
-    }
+	int id = atoi(materialid.c_str());
+	if (id != materials.size()) {
+		cout << "ERROR: MATERIAL ID does not match expected number of materials" << endl;
+		return -1;
+	}
+	else {
+		cout << "Loading Material " << id << "..." << endl;
+		Material newMaterial;
+
+		//load static properties
+		for (int i = 0; i < 7; i++) {
+			string line;
+			utilityCore::safeGetline(fp_in, line);
+			vector<string> tokens = utilityCore::tokenizeString(line);
+			if (strcmp(tokens[0].c_str(), "RGB") == 0) {
+				glm::vec3 color(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+				newMaterial.color = color;
+			}
+			else if (strcmp(tokens[0].c_str(), "SPECEX") == 0) {
+				newMaterial.specular.exponent = atof(tokens[1].c_str());
+			}
+			else if (strcmp(tokens[0].c_str(), "SPECRGB") == 0) {
+				glm::vec3 specColor(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+				newMaterial.specular.color = specColor;
+			}
+			else if (strcmp(tokens[0].c_str(), "REFL") == 0) {
+				newMaterial.hasReflective = atof(tokens[1].c_str());
+			}
+			else if (strcmp(tokens[0].c_str(), "REFR") == 0) {
+				newMaterial.hasRefractive = atof(tokens[1].c_str());
+			}
+			else if (strcmp(tokens[0].c_str(), "REFRIOR") == 0) {
+				newMaterial.indexOfRefraction = atof(tokens[1].c_str());
+			}
+			else if (strcmp(tokens[0].c_str(), "EMITTANCE") == 0) {
+				newMaterial.emittance = atof(tokens[1].c_str());
+			}
+		}
+		materials.push_back(newMaterial);
+		return 1;
+	}
 }
diff --git a/src/scene.h b/src/scene.h
index f29a917..452ac41 100644
--- a/src/scene.h
+++ b/src/scene.h
@@ -12,15 +12,19 @@ using namespace std;
 
 class Scene {
 private:
-    ifstream fp_in;
-    int loadMaterial(string materialid);
-    int loadGeom(string objectid);
-    int loadCamera();
+  ifstream fp_in;
+  int loadMaterial(string materialid);
+  int loadGeom(string objectid);
+  int loadMesh(string meshid);
+  int loadCamera();
 public:
-    Scene(string filename);
-    ~Scene();
+  Scene(string filename);
+  ~Scene();
 
-    std::vector<Geom> geoms;
-    std::vector<Material> materials;
-    RenderState state;
+  std::vector<Geom> geoms;
+  std::vector<Geom> meshGeoms;
+  std::vector<Material> materials;
+  std::vector<Mesh> meshes;
+  std::vector<Triangle> triangles;
+  RenderState state;
 };
diff --git a/src/sceneStructs.h b/src/sceneStructs.h
index b38b820..21e9691 100644
--- a/src/sceneStructs.h
+++ b/src/sceneStructs.h
@@ -4,66 +4,142 @@
 #include <vector>
 #include <cuda_runtime.h>
 #include "glm/glm.hpp"
+#include "utilities.h"
 
 #define BACKGROUND_COLOR (glm::vec3(0.0f))
 
+#define GRID_SIZE 3
+#define GRID_FULL (GRID_SIZE * GRID_SIZE * GRID_SIZE)
+
 enum GeomType {
-    SPHERE,
-    CUBE,
+  SPHERE = 0,
+  CUBE = 1,
+  MESH = 2,
 };
 
 struct Ray {
-    glm::vec3 origin;
-    glm::vec3 direction;
+  glm::vec3 origin;
+  glm::vec3 direction;
 };
 
 struct Geom {
-    enum GeomType type;
-    int materialid;
-    glm::vec3 translation;
-    glm::vec3 rotation;
-    glm::vec3 scale;
-    glm::mat4 transform;
-    glm::mat4 inverseTransform;
-    glm::mat4 invTranspose;
+  enum GeomType type;
+  int materialid;
+  glm::vec3 translation;
+  glm::vec3 rotation;
+  glm::vec3 scale;
+  glm::mat4 transform;
+  glm::mat4 inverseTransform;
+  glm::mat4 invTranspose;
+  int meshid;
+};
+
+struct Triangle {
+  glm::vec3 vertices[3];
+  glm::vec3 normal;
+
+  Triangle(glm::vec3 v1, glm::vec3 v2, glm::vec3 v3) {
+    vertices[0] = v1;
+    vertices[1] = v2;
+    vertices[2] = v3;
+    normal = glm::normalize(glm::cross(v2 - v1, v3 - v1));
+  }
+};
+
+struct Mesh {
+  int triangleStart;
+  int triangleEnd;
+
+  glm::vec3 boxMin;
+  glm::vec3 boxMax;
+
+  struct {
+    int start, end;
+  } gridIdx[GRID_FULL];
+
+  Mesh(int start, int end, std::vector<Triangle> & triangles, std::vector<Triangle> & gridTriangles) {
+    triangleStart = start;
+    triangleEnd = end;
+
+    std::vector<Triangle> grid[GRID_SIZE][GRID_SIZE][GRID_SIZE];
+
+    boxMin = glm::vec3(FLT_MAX, FLT_MAX, FLT_MAX);
+    boxMax = glm::vec3(-FLT_MAX, -FLT_MAX, -FLT_MAX); // FLT_MIN is the smallest positive float, not the most negative
+
+    for (int i = triangleStart; i < triangleEnd; i++) {
+      for (int j = 0; j < 3; j++) {
+        boxMin = utilityCore::vecMin(boxMin, triangles[i].vertices[j]);
+        boxMax = utilityCore::vecMax(boxMax, triangles[i].vertices[j]);
+      }
+    }
+
+    glm::vec3 boxDim = boxMax - boxMin;
+
+    for (int i = triangleStart; i < triangleEnd; i++) {
+      glm::vec3 triangleMin = glm::vec3(FLT_MAX, FLT_MAX, FLT_MAX);
+      glm::vec3 triangleMax = glm::vec3(-FLT_MAX, -FLT_MAX, -FLT_MAX); // most negative float, so any vertex raises the max
+      for (int j = 0; j < 3; j++) {
+        triangleMin = utilityCore::vecMin(triangleMin, triangles[i].vertices[j]);
+        triangleMax = utilityCore::vecMax(triangleMax, triangles[i].vertices[j]);
+      }
+      for (int x = 0; x < GRID_SIZE; x++) {
+        for (int y = 0; y < GRID_SIZE; y++) {
+          for (int z = 0; z < GRID_SIZE; z++) {
+
+            glm::vec3 gridMin = boxMin + glm::vec3(x, y, z) * (boxDim / (float)GRID_SIZE);
+            glm::vec3 gridMax = boxMin + glm::vec3(x + 1, y + 1, z + 1) * (boxDim / (float)GRID_SIZE);
+
+            if (utilityCore::aabbIntersect(triangleMin, triangleMax, gridMin, gridMax)) {
+              grid[x][y][z].push_back(triangles[i]);
+            }
+          }
+        }
+      }
+    }
+    for (int z = 0; z < GRID_SIZE; z++) {
+      for (int y = 0; y < GRID_SIZE; y++) {
+        for (int x = 0; x < GRID_SIZE; x++) {
+          int idx = z * GRID_SIZE * GRID_SIZE + y * GRID_SIZE + x;
+          gridIdx[idx].start = gridTriangles.size();
+          for (int i = 0; i < grid[x][y][z].size(); i++) {
+            gridTriangles.push_back(grid[x][y][z][i]);
+          }
+          gridIdx[idx].end = gridTriangles.size();
+        }
+      }
+    }
+  }
 };
 
 struct Material {
+  glm::vec3 color;
+  struct {
+    float exponent;
     glm::vec3 color;
-    struct {
-        float exponent;
-        glm::vec3 color;
-    } specular;
-    float hasReflective;
-    float hasRefractive;
-    float indexOfRefraction;
-    float emittance;
+  } specular;
+  float hasReflective;
+  float hasRefractive;
+  float indexOfRefraction;
+  float emittance;
 };
 
 struct Camera {
-    glm::ivec2 resolution;
-    glm::vec3 position;
-    glm::vec3 lookAt;
-    glm::vec3 view;
-    glm::vec3 up;
-    glm::vec3 right;
-    glm::vec2 fov;
-    glm::vec2 pixelLength;
+  glm::ivec2 resolution;
+  glm::vec3 position;
+  glm::vec3 lookAt;
+  glm::vec3 view;
+  glm::vec3 up;
+  glm::vec3 right;
+  glm::vec2 fov;
+  glm::vec2 pixelLength;
 };
 
 struct RenderState {
-    Camera camera;
-    unsigned int iterations;
-    int traceDepth;
-    std::vector<glm::vec3> image;
-    std::string imageName;
-};
-
-struct PathSegment {
-	Ray ray;
-	glm::vec3 color;
-	int pixelIndex;
-	int remainingBounces;
+  Camera camera;
+  unsigned int iterations;
+  int traceDepth;
+  std::vector<glm::vec3> image;
+  std::string imageName;
 };
 
 // Use with a corresponding PathSegment to do:
@@ -74,3 +150,12 @@ struct ShadeableIntersection {
   glm::vec3 surfaceNormal;
   int materialId;
 };
+
+struct PathSegment {
+  Ray ray;
+  glm::vec3 color;
+  int pixelIndex;
+  int remainingBounces;
+  bool insideRefractiveObject;
+  ShadeableIntersection intersection;
+};
\ No newline at end of file
diff --git a/src/tiny_obj_loader.h b/src/tiny_obj_loader.h
new file mode 100644
index 0000000..0e9c369
--- /dev/null
+++ b/src/tiny_obj_loader.h
@@ -0,0 +1,1653 @@
+/*
+The MIT License (MIT)
+
+Copyright (c) 2012-2016 Syoyo Fujita and many contributors.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
+*/
+
+//
+// version 1.0.0 : Change data structure. Change license from BSD to MIT.
+//
+
+//
+// Use this in *one* .cc
+//   #define TINYOBJLOADER_IMPLEMENTATION
+//   #include "tiny_obj_loader.h"
+//
+
+#ifndef TINY_OBJ_LOADER_H_
+#define TINY_OBJ_LOADER_H_
+
+#include <map>
+#include <string>
+#include <vector>
+
+namespace tinyobj {
+
+	typedef struct {
+		std::string name;
+
+		float ambient[3];
+		float diffuse[3];
+		float specular[3];
+		float transmittance[3];
+		float emission[3];
+		float shininess;
+		float ior;       // index of refraction
+		float dissolve;  // 1 == opaque; 0 == fully transparent
+		// illumination model (see http://www.fileformat.info/format/material/)
+		int illum;
+
+		int dummy;  // Suppress padding warning.
+
+		std::string ambient_texname;             // map_Ka
+		std::string diffuse_texname;             // map_Kd
+		std::string specular_texname;            // map_Ks
+		std::string specular_highlight_texname;  // map_Ns
+		std::string bump_texname;                // map_bump, bump
+		std::string displacement_texname;        // disp
+		std::string alpha_texname;               // map_d
+
+		// PBR extension
+		// http://exocortex.com/blog/extending_wavefront_mtl_to_support_pbr
+		float roughness;                // [0, 1] default 0
+		float metallic;                 // [0, 1] default 0
+		float sheen;                    // [0, 1] default 0
+		float clearcoat_thickness;      // [0, 1] default 0
+		float clearcoat_roughness;      // [0, 1] default 0
+		float anisotropy;               // aniso. [0, 1] default 0
+		float anisotropy_rotation;      // anisor. [0, 1] default 0
+		std::string roughness_texname;  // map_Pr
+		std::string metallic_texname;   // map_Pm
+		std::string sheen_texname;      // map_Ps
+		std::string emissive_texname;   // map_Ke
+		std::string normal_texname;     // norm. For normal mapping.
+
+		std::map<std::string, std::string> unknown_parameter;
+	} material_t;
+
+	typedef struct {
+		std::string name;
+
+		std::vector<int> intValues;
+		std::vector<float> floatValues;
+		std::vector<std::string> stringValues;
+	} tag_t;
+
+	// Index struct to support different indices for vtx/normal/texcoord.
+	// -1 means not used.
+	typedef struct {
+		int vertex_index;
+		int normal_index;
+		int texcoord_index;
+	} index_t;
+
+	typedef struct {
+		std::vector<index_t> indices;
+		std::vector<unsigned char> num_face_vertices;  // The number of vertices per
+		// face. 3 = triangle, 4 = quad,
+		// ... Up to 255.
+		std::vector<int> material_ids;                 // per-face material ID
+		std::vector<tag_t> tags;                       // SubD tag
+	} mesh_t;
+
+	typedef struct {
+		std::string name;
+		mesh_t mesh;
+	} shape_t;
+
+	// Vertex attributes
+	typedef struct {
+		std::vector<float> vertices;   // 'v'
+		std::vector<float> normals;    // 'vn'
+		std::vector<float> texcoords;  // 'vt'
+	} attrib_t;
+
+	typedef struct callback_t_ {
+		// W is optional and set to 1 if there is no `w` item in `v` line
+		void(*vertex_cb)(void *user_data, float x, float y, float z, float w);
+		void(*normal_cb)(void *user_data, float x, float y, float z);
+
+		// y and z are optional and set to 0 if there is no `y` and/or `z` item(s) in
+		// `vt` line.
+		void(*texcoord_cb)(void *user_data, float x, float y, float z);
+
+		// called per 'f' line. num_indices is the number of face indices(e.g. 3 for
+		// triangle, 4 for quad)
+		// 0 will be passed for undefined index in index_t members.
+		void(*index_cb)(void *user_data, index_t *indices, int num_indices);
+		// `name` material name, `material_id` = the array index of material_t[]. -1
+		// if
+		// a material not found in .mtl
+		void(*usemtl_cb)(void *user_data, const char *name, int material_id);
+		// `materials` = parsed material data.
+		void(*mtllib_cb)(void *user_data, const material_t *materials,
+			int num_materials);
+		// There may be multiple group names
+		void(*group_cb)(void *user_data, const char **names, int num_names);
+		void(*object_cb)(void *user_data, const char *name);
+
+		callback_t_()
+			: vertex_cb(NULL),
+			normal_cb(NULL),
+			texcoord_cb(NULL),
+			index_cb(NULL),
+			usemtl_cb(NULL),
+			mtllib_cb(NULL),
+			group_cb(NULL),
+			object_cb(NULL) {}
+	} callback_t;
+
+	class MaterialReader {
+	public:
+		MaterialReader() {}
+		virtual ~MaterialReader();
+
+		virtual bool operator()(const std::string &matId,
+			std::vector<material_t> *materials,
+			std::map<std::string, int> *matMap,
+			std::string *err) = 0;
+	};
+
+	class MaterialFileReader : public MaterialReader {
+	public:
+		explicit MaterialFileReader(const std::string &mtl_basepath)
+			: m_mtlBasePath(mtl_basepath) {}
+		virtual ~MaterialFileReader() {}
+		virtual bool operator()(const std::string &matId,
+			std::vector<material_t> *materials,
+			std::map<std::string, int> *matMap, std::string *err);
+
+	private:
+		std::string m_mtlBasePath;
+	};
+
+	class MaterialStreamReader : public MaterialReader {
+	public:
+		explicit MaterialStreamReader(std::istream &inStream)
+			: m_inStream(inStream) {}
+		virtual ~MaterialStreamReader() {}
+		virtual bool operator()(const std::string &matId,
+			std::vector<material_t> *materials,
+			std::map<std::string, int> *matMap, std::string *err);
+
+	private:
+		std::istream &m_inStream;
+	};
+
+	/// Loads .obj from a file.
+	/// 'attrib', 'shapes' and 'materials' will be filled with parsed shape data
+	/// 'shapes' will be filled with parsed shape data
+	/// Returns true when loading .obj become success.
+	/// Returns warning and error message into `err`
+	/// 'mtl_basepath' is optional, and used for base path for .mtl file.
+	/// 'triangulate' is optional, and used whether triangulate polygon face in .obj
+	/// or not.
+	bool LoadObj(attrib_t *attrib, std::vector<shape_t> *shapes,
+		std::vector<material_t> *materials, std::string *err,
+		const char *filename, const char *mtl_basepath = NULL,
+		bool triangulate = true);
+
+	/// Loads .obj from a file with custom user callback.
+	/// .mtl is loaded as usual and parsed material_t data will be passed to
+	/// `callback.mtllib_cb`.
+	/// Returns true when the .obj/.mtl files are loaded successfully.
+	/// Warning and error messages are written to `err`.
+	/// See `examples/callback_api/` for how to use this function.
+	bool LoadObjWithCallback(std::istream &inStream, const callback_t &callback,
+		void *user_data = NULL,
+		MaterialReader *readMatFn = NULL,
+		std::string *err = NULL);
+
+	/// Loads object from a std::istream, uses GetMtlIStreamFn to retrieve
+	/// std::istream for materials.
+	/// Returns true when the .obj data is loaded successfully.
+	/// Warning and error messages are written to `err`.
+	bool LoadObj(attrib_t *attrib, std::vector<shape_t> *shapes,
+		std::vector<material_t> *materials, std::string *err,
+		std::istream *inStream, MaterialReader *readMatFn = NULL,
+		bool triangulate = true);
+
+	/// Loads materials into std::map
+	void LoadMtl(std::map<std::string, int> *material_map,
+		std::vector<material_t> *materials, std::istream *inStream);
+
+}  // namespace tinyobj
+
+#ifdef TINYOBJLOADER_IMPLEMENTATION
+#include <cassert>
+#include <cctype>
+#include <cmath>
+#include <cstddef>
+#include <cstdlib>
+#include <cstring>
+#include <utility>
+
+#include <fstream>
+#include <sstream>
+
+namespace tinyobj {
+
+	MaterialReader::~MaterialReader() {}
+
+#define TINYOBJ_SSCANF_BUFFER_SIZE (4096)
+
+	struct vertex_index {
+		int v_idx, vt_idx, vn_idx;
+		vertex_index() : v_idx(-1), vt_idx(-1), vn_idx(-1) {}
+		explicit vertex_index(int idx) : v_idx(idx), vt_idx(idx), vn_idx(idx) {}
+		vertex_index(int vidx, int vtidx, int vnidx)
+			: v_idx(vidx), vt_idx(vtidx), vn_idx(vnidx) {}
+	};
+
+	struct tag_sizes {
+		tag_sizes() : num_ints(0), num_floats(0), num_strings(0) {}
+		int num_ints;
+		int num_floats;
+		int num_strings;
+	};
+
+	struct obj_shape {
+		std::vector<float> v;
+		std::vector<float> vn;
+		std::vector<float> vt;
+	};
+
+	// See
+	// http://stackoverflow.com/questions/6089231/getting-std-ifstream-to-handle-lf-cr-and-crlf
+	static std::istream &safeGetline(std::istream &is, std::string &t) {
+		t.clear();
+
+		// The characters in the stream are read one-by-one using a std::streambuf.
+		// That is faster than reading them one-by-one using the std::istream.
+		// Code that uses streambuf this way must be guarded by a sentry object.
+		// The sentry object performs various tasks,
+		// such as thread synchronization and updating the stream state.
+
+		std::istream::sentry se(is, true);
+		std::streambuf *sb = is.rdbuf();
+
+		for (;;) {
+			int c = sb->sbumpc();
+			switch (c) {
+			case '\n':
+				return is;
+			case '\r':
+				if (sb->sgetc() == '\n') sb->sbumpc();
+				return is;
+			case EOF:
+				// Also handle the case when the last line has no line ending
+				if (t.empty()) is.setstate(std::ios::eofbit);
+				return is;
+			default:
+				t += static_cast<char>(c);
+			}
+		}
+	}
+
+#define IS_SPACE(x) (((x) == ' ') || ((x) == '\t'))
+#define IS_DIGIT(x) \
+  (static_cast<unsigned int>((x) - '0') < static_cast<unsigned int>(10))
+#define IS_NEW_LINE(x) (((x) == '\r') || ((x) == '\n') || ((x) == '\0'))
+
+	// Make index zero-base, and also support relative index.
+	static inline int fixIndex(int idx, int n) {
+		if (idx > 0) return idx - 1;
+		if (idx == 0) return 0;
+		return n + idx;  // negative value = relative
+	}
+
+	static inline std::string parseString(const char **token) {
+		std::string s;
+		(*token) += strspn((*token), " \t");
+		size_t e = strcspn((*token), " \t\r");
+		s = std::string((*token), &(*token)[e]);
+		(*token) += e;
+		return s;
+	}
+
+	static inline int parseInt(const char **token) {
+		(*token) += strspn((*token), " \t");
+		int i = atoi((*token));
+		(*token) += strcspn((*token), " \t\r");
+		return i;
+	}
+
+	// Tries to parse a floating point number located at s.
+	//
+	// s_end should be a location in the string where reading should absolutely
+	// stop. For example at the end of the string, to prevent buffer overflows.
+	//
+	// Parses the following EBNF grammar:
+	//   sign    = "+" | "-" ;
+	//   END     = ? anything not in digit ?
+	//   digit   = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
+	//   integer = [sign] , digit , {digit} ;
+	//   decimal = integer , ["." , integer] ;
+	//   float   = ( decimal , END ) | ( decimal , ("E" | "e") , integer , END ) ;
+	//
+	//  Valid strings are for example:
+	//   -0  +3.1417e+2  -0.0E-3  1.0324  -1.41   11e2
+	//
+	// If the parsing is a success, result is set to the parsed value and true
+	// is returned.
+	//
+	// The function is greedy and will parse until any of the following happens:
+	//  - a non-conforming character is encountered.
+	//  - s_end is reached.
+	//
+	// The following situations triggers a failure:
+	//  - s >= s_end.
+	//  - parse failure.
+	//
+	static bool tryParseDouble(const char *s, const char *s_end, double *result) {
+		if (s >= s_end) {
+			return false;
+		}
+
+		double mantissa = 0.0;
+		// This exponent is base 2 rather than 10.
+		// However the exponent we parse is supposed to be one of ten,
+		// thus we must take care to convert the exponent/and or the
+		// mantissa to a * 2^E, where a is the mantissa and E is the
+		// exponent.
+		// To get the final double we will use ldexp, it requires the
+		// exponent to be in base 2.
+		int exponent = 0;
+
+		// NOTE: THESE MUST BE DECLARED HERE SINCE WE ARE NOT ALLOWED
+		// TO JUMP OVER DEFINITIONS.
+		char sign = '+';
+		char exp_sign = '+';
+		char const *curr = s;
+
+		// How many characters were read in a loop.
+		int read = 0;
+		// Tells whether a loop terminated due to reaching s_end.
+		bool end_not_reached = false;
+
+		/*
+		BEGIN PARSING.
+		*/
+
+		// Find out what sign we've got.
+		if (*curr == '+' || *curr == '-') {
+			sign = *curr;
+			curr++;
+		}
+		else if (IS_DIGIT(*curr)) { /* Pass through. */
+		}
+		else {
+			goto fail;
+		}
+
+		// Read the integer part.
+		end_not_reached = (curr != s_end);
+		while (end_not_reached && IS_DIGIT(*curr)) {
+			mantissa *= 10;
+			mantissa += static_cast<int>(*curr - 0x30);
+			curr++;
+			read++;
+			end_not_reached = (curr != s_end);
+		}
+
+		// We must make sure we actually got something.
+		if (read == 0) goto fail;
+		// We allow numbers of form "#", "###" etc.
+		if (!end_not_reached) goto assemble;
+
+		// Read the decimal part.
+		if (*curr == '.') {
+			curr++;
+			read = 1;
+			end_not_reached = (curr != s_end);
+			while (end_not_reached && IS_DIGIT(*curr)) {
+				// NOTE: Don't use powf here, it will absolutely murder precision.
+				mantissa += static_cast<int>(*curr - 0x30) * pow(10.0, -read);
+				read++;
+				curr++;
+				end_not_reached = (curr != s_end);
+			}
+		}
+		else if (*curr == 'e' || *curr == 'E') {
+		}
+		else {
+			goto assemble;
+		}
+
+		if (!end_not_reached) goto assemble;
+
+		// Read the exponent part.
+		if (*curr == 'e' || *curr == 'E') {
+			curr++;
+			// Figure out whether a sign is present and, if so, which one.
+			end_not_reached = (curr != s_end);
+			if (end_not_reached && (*curr == '+' || *curr == '-')) {
+				exp_sign = *curr;
+				curr++;
+			}
+			else if (IS_DIGIT(*curr)) { /* Pass through. */
+			}
+			else {
+				// Empty E is not allowed.
+				goto fail;
+			}
+
+			read = 0;
+			end_not_reached = (curr != s_end);
+			while (end_not_reached && IS_DIGIT(*curr)) {
+				exponent *= 10;
+				exponent += static_cast<int>(*curr - 0x30);
+				curr++;
+				read++;
+				end_not_reached = (curr != s_end);
+			}
+			exponent *= (exp_sign == '+' ? 1 : -1);
+			if (read == 0) goto fail;
+		}
+
+	assemble:
+		*result =
+			(sign == '+' ? 1 : -1) * ldexp(mantissa * pow(5.0, exponent), exponent);
+		return true;
+	fail:
+		return false;
+	}
+
+	static inline float parseFloat(const char **token, double default_value = 0.0) {
+		(*token) += strspn((*token), " \t");
+		const char *end = (*token) + strcspn((*token), " \t\r");
+		double val = default_value;
+		tryParseDouble((*token), end, &val);
+		float f = static_cast<float>(val);
+		(*token) = end;
+		return f;
+	}
+
+	static inline void parseFloat2(float *x, float *y, const char **token) {
+		(*x) = parseFloat(token);
+		(*y) = parseFloat(token);
+	}
+
+	static inline void parseFloat3(float *x, float *y, float *z,
+		const char **token) {
+		(*x) = parseFloat(token);
+		(*y) = parseFloat(token);
+		(*z) = parseFloat(token);
+	}
+
+	static inline void parseV(float *x, float *y, float *z, float *w,
+		const char **token) {
+		(*x) = parseFloat(token);
+		(*y) = parseFloat(token);
+		(*z) = parseFloat(token);
+		(*w) = parseFloat(token, 1.0);
+	}
+
+	static tag_sizes parseTagTriple(const char **token) {
+		tag_sizes ts;
+
+		ts.num_ints = atoi((*token));
+		(*token) += strcspn((*token), "/ \t\r");
+		if ((*token)[0] != '/') {
+			return ts;
+		}
+		(*token)++;
+
+		ts.num_floats = atoi((*token));
+		(*token) += strcspn((*token), "/ \t\r");
+		if ((*token)[0] != '/') {
+			return ts;
+		}
+		(*token)++;
+
+		ts.num_strings = atoi((*token));
+		(*token) += strcspn((*token), "/ \t\r") + 1;
+
+		return ts;
+	}
+
+	// Parse triples with index offsets: i, i/j/k, i//k, i/j
+	static vertex_index parseTriple(const char **token, int vsize, int vnsize,
+		int vtsize) {
+		vertex_index vi(-1);
+
+		vi.v_idx = fixIndex(atoi((*token)), vsize);
+		(*token) += strcspn((*token), "/ \t\r");
+		if ((*token)[0] != '/') {
+			return vi;
+		}
+		(*token)++;
+
+		// i//k
+		if ((*token)[0] == '/') {
+			(*token)++;
+			vi.vn_idx = fixIndex(atoi((*token)), vnsize);
+			(*token) += strcspn((*token), "/ \t\r");
+			return vi;
+		}
+
+		// i/j/k or i/j
+		vi.vt_idx = fixIndex(atoi((*token)), vtsize);
+		(*token) += strcspn((*token), "/ \t\r");
+		if ((*token)[0] != '/') {
+			return vi;
+		}
+
+		// i/j/k
+		(*token)++;  // skip '/'
+		vi.vn_idx = fixIndex(atoi((*token)), vnsize);
+		(*token) += strcspn((*token), "/ \t\r");
+		return vi;
+	}
+
+	// Parse raw triples: i, i/j/k, i//k, i/j
+	static vertex_index parseRawTriple(const char **token) {
+		vertex_index vi(static_cast<int>(0));  // 0 is an invalid index in OBJ
+
+		vi.v_idx = atoi((*token));
+		(*token) += strcspn((*token), "/ \t\r");
+		if ((*token)[0] != '/') {
+			return vi;
+		}
+		(*token)++;
+
+		// i//k
+		if ((*token)[0] == '/') {
+			(*token)++;
+			vi.vn_idx = atoi((*token));
+			(*token) += strcspn((*token), "/ \t\r");
+			return vi;
+		}
+
+		// i/j/k or i/j
+		vi.vt_idx = atoi((*token));
+		(*token) += strcspn((*token), "/ \t\r");
+		if ((*token)[0] != '/') {
+			return vi;
+		}
+
+		// i/j/k
+		(*token)++;  // skip '/'
+		vi.vn_idx = atoi((*token));
+		(*token) += strcspn((*token), "/ \t\r");
+		return vi;
+	}
+
+	static void InitMaterial(material_t *material) {
+		material->name = "";
+		material->ambient_texname = "";
+		material->diffuse_texname = "";
+		material->specular_texname = "";
+		material->specular_highlight_texname = "";
+		material->bump_texname = "";
+		material->displacement_texname = "";
+		material->alpha_texname = "";
+		for (int i = 0; i < 3; i++) {
+			material->ambient[i] = 0.f;
+			material->diffuse[i] = 0.f;
+			material->specular[i] = 0.f;
+			material->transmittance[i] = 0.f;
+			material->emission[i] = 0.f;
+		}
+		material->illum = 0;
+		material->dissolve = 1.f;
+		material->shininess = 1.f;
+		material->ior = 1.f;
+
+		material->roughness = 0.f;
+		material->metallic = 0.f;
+		material->sheen = 0.f;
+		material->clearcoat_thickness = 0.f;
+		material->clearcoat_roughness = 0.f;
+		material->anisotropy_rotation = 0.f;
+		material->anisotropy = 0.f;
+		material->roughness_texname = "";
+		material->metallic_texname = "";
+		material->sheen_texname = "";
+		material->emissive_texname = "";
+		material->normal_texname = "";
+
+		material->unknown_parameter.clear();
+	}
+
+	static bool exportFaceGroupToShape(
+		shape_t *shape, const std::vector<std::vector<vertex_index> > &faceGroup,
+		const std::vector<tag_t> &tags, const int material_id,
+		const std::string &name, bool triangulate) {
+		if (faceGroup.empty()) {
+			return false;
+		}
+
+		// Flatten vertices and indices
+		for (size_t i = 0; i < faceGroup.size(); i++) {
+			const std::vector<vertex_index> &face = faceGroup[i];
+
+			vertex_index i0 = face[0];
+			vertex_index i1(-1);
+			vertex_index i2 = face[1];
+
+			size_t npolys = face.size();
+
+			if (triangulate) {
+				// Polygon -> triangle fan conversion
+				for (size_t k = 2; k < npolys; k++) {
+					i1 = i2;
+					i2 = face[k];
+
+					index_t idx0, idx1, idx2;
+					idx0.vertex_index = i0.v_idx;
+					idx0.normal_index = i0.vn_idx;
+					idx0.texcoord_index = i0.vt_idx;
+					idx1.vertex_index = i1.v_idx;
+					idx1.normal_index = i1.vn_idx;
+					idx1.texcoord_index = i1.vt_idx;
+					idx2.vertex_index = i2.v_idx;
+					idx2.normal_index = i2.vn_idx;
+					idx2.texcoord_index = i2.vt_idx;
+
+					shape->mesh.indices.push_back(idx0);
+					shape->mesh.indices.push_back(idx1);
+					shape->mesh.indices.push_back(idx2);
+
+					shape->mesh.num_face_vertices.push_back(3);
+					shape->mesh.material_ids.push_back(material_id);
+				}
+			}
+			else {
+				for (size_t k = 0; k < npolys; k++) {
+					index_t idx;
+					idx.vertex_index = face[k].v_idx;
+					idx.normal_index = face[k].vn_idx;
+					idx.texcoord_index = face[k].vt_idx;
+					shape->mesh.indices.push_back(idx);
+				}
+
+				shape->mesh.num_face_vertices.push_back(
+					static_cast<unsigned char>(npolys));
+				shape->mesh.material_ids.push_back(material_id);  // per face
+			}
+		}
+
+		shape->name = name;
+		shape->mesh.tags = tags;
+
+		return true;
+	}
+
+	void LoadMtl(std::map<std::string, int> *material_map,
+		std::vector<material_t> *materials, std::istream *inStream) {
+		// Create a default material anyway.
+		material_t material;
+		InitMaterial(&material);
+
+		while (inStream->peek() != -1) {
+			std::string linebuf;
+
+			safeGetline(*inStream, linebuf);
+
+			// Trim trailing whitespace.
+			if (linebuf.size() > 0) {
+				linebuf = linebuf.substr(0, linebuf.find_last_not_of(" \t") + 1);
+			}
+
+			// Trim newline '\r\n' or '\n'
+			if (linebuf.size() > 0) {
+				if (linebuf[linebuf.size() - 1] == '\n')
+					linebuf.erase(linebuf.size() - 1);
+			}
+			if (linebuf.size() > 0) {
+				if (linebuf[linebuf.size() - 1] == '\r')
+					linebuf.erase(linebuf.size() - 1);
+			}
+
+			// Skip if empty line.
+			if (linebuf.empty()) {
+				continue;
+			}
+
+			// Skip leading space.
+			const char *token = linebuf.c_str();
+			token += strspn(token, " \t");
+
+			assert(token);
+			if (token[0] == '\0') continue;  // empty line
+
+			if (token[0] == '#') continue;  // comment line
+
+			// new mtl
+			if ((0 == strncmp(token, "newmtl", 6)) && IS_SPACE((token[6]))) {
+				// flush previous material.
+				if (!material.name.empty()) {
+					material_map->insert(std::pair<std::string, int>(
+						material.name, static_cast<int>(materials->size())));
+					materials->push_back(material);
+				}
+
+				// initial temporary material
+				InitMaterial(&material);
+
+				// set new mtl name
+				char namebuf[TINYOBJ_SSCANF_BUFFER_SIZE];
+				token += 7;
+#ifdef _MSC_VER
+				sscanf_s(token, "%s", namebuf, (unsigned)_countof(namebuf));
+#else
+				sscanf(token, "%s", namebuf);
+#endif
+				material.name = namebuf;
+				continue;
+			}
+
+			// ambient
+			if (token[0] == 'K' && token[1] == 'a' && IS_SPACE((token[2]))) {
+				token += 2;
+				float r, g, b;
+				parseFloat3(&r, &g, &b, &token);
+				material.ambient[0] = r;
+				material.ambient[1] = g;
+				material.ambient[2] = b;
+				continue;
+			}
+
+			// diffuse
+			if (token[0] == 'K' && token[1] == 'd' && IS_SPACE((token[2]))) {
+				token += 2;
+				float r, g, b;
+				parseFloat3(&r, &g, &b, &token);
+				material.diffuse[0] = r;
+				material.diffuse[1] = g;
+				material.diffuse[2] = b;
+				continue;
+			}
+
+			// specular
+			if (token[0] == 'K' && token[1] == 's' && IS_SPACE((token[2]))) {
+				token += 2;
+				float r, g, b;
+				parseFloat3(&r, &g, &b, &token);
+				material.specular[0] = r;
+				material.specular[1] = g;
+				material.specular[2] = b;
+				continue;
+			}
+
+			// transmittance
+			if ((token[0] == 'K' && token[1] == 't' && IS_SPACE((token[2]))) ||
+				(token[0] == 'T' && token[1] == 'f' && IS_SPACE((token[2])))) {
+				token += 2;
+				float r, g, b;
+				parseFloat3(&r, &g, &b, &token);
+				material.transmittance[0] = r;
+				material.transmittance[1] = g;
+				material.transmittance[2] = b;
+				continue;
+			}
+
+			// ior(index of refraction)
+			if (token[0] == 'N' && token[1] == 'i' && IS_SPACE((token[2]))) {
+				token += 2;
+				material.ior = parseFloat(&token);
+				continue;
+			}
+
+			// emission
+			if (token[0] == 'K' && token[1] == 'e' && IS_SPACE(token[2])) {
+				token += 2;
+				float r, g, b;
+				parseFloat3(&r, &g, &b, &token);
+				material.emission[0] = r;
+				material.emission[1] = g;
+				material.emission[2] = b;
+				continue;
+			}
+
+			// shininess
+			if (token[0] == 'N' && token[1] == 's' && IS_SPACE(token[2])) {
+				token += 2;
+				material.shininess = parseFloat(&token);
+				continue;
+			}
+
+			// illum model
+			if (0 == strncmp(token, "illum", 5) && IS_SPACE(token[5])) {
+				token += 6;
+				material.illum = parseInt(&token);
+				continue;
+			}
+
+			// dissolve
+			if ((token[0] == 'd' && IS_SPACE(token[1]))) {
+				token += 1;
+				material.dissolve = parseFloat(&token);
+				continue;
+			}
+			if (token[0] == 'T' && token[1] == 'r' && IS_SPACE(token[2])) {
+				token += 2;
+				// Invert value of Tr(assume Tr is in range [0, 1])
+				material.dissolve = 1.0f - parseFloat(&token);
+				continue;
+			}
+
+			// PBR: roughness
+			if (token[0] == 'P' && token[1] == 'r' && IS_SPACE(token[2])) {
+				token += 2;
+				material.roughness = parseFloat(&token);
+				continue;
+			}
+
+			// PBR: metallic
+			if (token[0] == 'P' && token[1] == 'm' && IS_SPACE(token[2])) {
+				token += 2;
+				material.metallic = parseFloat(&token);
+				continue;
+			}
+
+			// PBR: sheen
+			if (token[0] == 'P' && token[1] == 's' && IS_SPACE(token[2])) {
+				token += 2;
+				material.sheen = parseFloat(&token);
+				continue;
+			}
+
+			// PBR: clearcoat thickness
+			if (token[0] == 'P' && token[1] == 'c' && IS_SPACE(token[2])) {
+				token += 2;
+				material.clearcoat_thickness = parseFloat(&token);
+				continue;
+			}
+
+			// PBR: clearcoat roughness
+			if ((0 == strncmp(token, "Pcr", 3)) && IS_SPACE(token[3])) {
+				token += 4;
+				material.clearcoat_roughness = parseFloat(&token);
+				continue;
+			}
+
+			// PBR: anisotropy
+			if ((0 == strncmp(token, "aniso", 5)) && IS_SPACE(token[5])) {
+				token += 6;
+				material.anisotropy = parseFloat(&token);
+				continue;
+			}
+
+			// PBR: anisotropy rotation
+			if ((0 == strncmp(token, "anisor", 6)) && IS_SPACE(token[6])) {
+				token += 7;
+				material.anisotropy_rotation = parseFloat(&token);
+				continue;
+			}
+
+			// ambient texture
+			if ((0 == strncmp(token, "map_Ka", 6)) && IS_SPACE(token[6])) {
+				token += 7;
+				material.ambient_texname = token;
+				continue;
+			}
+
+			// diffuse texture
+			if ((0 == strncmp(token, "map_Kd", 6)) && IS_SPACE(token[6])) {
+				token += 7;
+				material.diffuse_texname = token;
+				continue;
+			}
+
+			// specular texture
+			if ((0 == strncmp(token, "map_Ks", 6)) && IS_SPACE(token[6])) {
+				token += 7;
+				material.specular_texname = token;
+				continue;
+			}
+
+			// specular highlight texture
+			if ((0 == strncmp(token, "map_Ns", 6)) && IS_SPACE(token[6])) {
+				token += 7;
+				material.specular_highlight_texname = token;
+				continue;
+			}
+
+			// bump texture
+			if ((0 == strncmp(token, "map_bump", 8)) && IS_SPACE(token[8])) {
+				token += 9;
+				material.bump_texname = token;
+				continue;
+			}
+
+			// alpha texture
+			if ((0 == strncmp(token, "map_d", 5)) && IS_SPACE(token[5])) {
+				token += 6;
+				material.alpha_texname = token;
+				continue;
+			}
+
+			// bump texture (alternate "bump" keyword; same field as map_bump)
+			if ((0 == strncmp(token, "bump", 4)) && IS_SPACE(token[4])) {
+				token += 5;
+				material.bump_texname = token;
+				continue;
+			}
+
+			// displacement texture
+			if ((0 == strncmp(token, "disp", 4)) && IS_SPACE(token[4])) {
+				token += 5;
+				material.displacement_texname = token;
+				continue;
+			}
+
+			// PBR: roughness texture
+			if ((0 == strncmp(token, "map_Pr", 6)) && IS_SPACE(token[6])) {
+				token += 7;
+				material.roughness_texname = token;
+				continue;
+			}
+
+			// PBR: metallic texture
+			if ((0 == strncmp(token, "map_Pm", 6)) && IS_SPACE(token[6])) {
+				token += 7;
+				material.metallic_texname = token;
+				continue;
+			}
+
+			// PBR: sheen texture
+			if ((0 == strncmp(token, "map_Ps", 6)) && IS_SPACE(token[6])) {
+				token += 7;
+				material.sheen_texname = token;
+				continue;
+			}
+
+			// PBR: emissive texture
+			if ((0 == strncmp(token, "map_Ke", 6)) && IS_SPACE(token[6])) {
+				token += 7;
+				material.emissive_texname = token;
+				continue;
+			}
+
+			// PBR: normal map texture
+			if ((0 == strncmp(token, "norm", 4)) && IS_SPACE(token[4])) {
+				token += 5;
+				material.normal_texname = token;
+				continue;
+			}
+
+			// unknown parameter
+			const char *_space = strchr(token, ' ');
+			if (!_space) {
+				_space = strchr(token, '\t');
+			}
+			if (_space) {
+				std::ptrdiff_t len = _space - token;
+				std::string key(token, static_cast<size_t>(len));
+				std::string value = _space + 1;
+				material.unknown_parameter.insert(
+					std::pair<std::string, std::string>(key, value));
+			}
+		}
+		// flush last material.
+		material_map->insert(std::pair<std::string, int>(
+			material.name, static_cast<int>(materials->size())));
+		materials->push_back(material);
+	}
+
+	bool MaterialFileReader::operator()(const std::string &matId,
+		std::vector<material_t> *materials,
+		std::map<std::string, int> *matMap,
+		std::string *err) {
+		std::string filepath;
+
+		if (!m_mtlBasePath.empty()) {
+			filepath = std::string(m_mtlBasePath) + matId;
+		}
+		else {
+			filepath = matId;
+		}
+
+		std::ifstream matIStream(filepath.c_str());
+		LoadMtl(matMap, materials, &matIStream);
+		if (!matIStream) {
+			std::stringstream ss;
+			ss << "WARN: Material file [ " << filepath
+				<< " ] not found. Created a default material.";
+			if (err) {
+				(*err) += ss.str();
+			}
+		}
+		return true;
+	}
+
+	bool MaterialStreamReader::operator()(const std::string &matId,
+		std::vector<material_t> *materials,
+		std::map<std::string, int> *matMap,
+		std::string *err) {
+		LoadMtl(matMap, materials, &m_inStream);
+		if (!m_inStream) {
+			std::stringstream ss;
+			ss << "WARN: Material stream in error state."
+				<< " Created a default material.";
+			if (err) {
+				(*err) += ss.str();
+			}
+		}
+		return true;
+	}
+
+	bool LoadObj(attrib_t *attrib, std::vector<shape_t> *shapes,
+		std::vector<material_t> *materials, std::string *err,
+		const char *filename, const char *mtl_basepath,
+		bool triangulate) {
+		attrib->vertices.clear();
+		attrib->normals.clear();
+		attrib->texcoords.clear();
+		shapes->clear();
+
+		std::stringstream errss;
+
+		std::ifstream ifs(filename);
+		if (!ifs) {
+			errss << "Cannot open file [" << filename << "]" << std::endl;
+			if (err) {
+				(*err) = errss.str();
+			}
+			return false;
+		}
+
+		std::string basePath;
+		if (mtl_basepath) {
+			basePath = mtl_basepath;
+		}
+		MaterialFileReader matFileReader(basePath);
+
+		return LoadObj(attrib, shapes, materials, err, &ifs, &matFileReader,
+			triangulate);
+	}
+
+	bool LoadObj(attrib_t *attrib, std::vector<shape_t> *shapes,
+		std::vector<material_t> *materials, std::string *err,
+		std::istream *inStream,
+		MaterialReader *readMatFn /*= NULL*/,
+		bool triangulate) {
+		std::stringstream errss;
+
+		std::vector<float> v;
+		std::vector<float> vn;
+		std::vector<float> vt;
+		std::vector<tag_t> tags;
+		std::vector<std::vector<vertex_index> > faceGroup;
+		std::string name;
+
+		// material
+		std::map<std::string, int> material_map;
+		int material = -1;
+
+		shape_t shape;
+
+		while (inStream->peek() != -1) {
+			std::string linebuf;
+			safeGetline(*inStream, linebuf);
+
+			// Trim newline '\r\n' or '\n'
+			if (linebuf.size() > 0) {
+				if (linebuf[linebuf.size() - 1] == '\n')
+					linebuf.erase(linebuf.size() - 1);
+			}
+			if (linebuf.size() > 0) {
+				if (linebuf[linebuf.size() - 1] == '\r')
+					linebuf.erase(linebuf.size() - 1);
+			}
+
+			// Skip if empty line.
+			if (linebuf.empty()) {
+				continue;
+			}
+
+			// Skip leading space.
+			const char *token = linebuf.c_str();
+			token += strspn(token, " \t");
+
+			assert(token);
+			if (token[0] == '\0') continue;  // empty line
+
+			if (token[0] == '#') continue;  // comment line
+
+			// vertex
+			if (token[0] == 'v' && IS_SPACE((token[1]))) {
+				token += 2;
+				float x, y, z;
+				parseFloat3(&x, &y, &z, &token);
+				v.push_back(x);
+				v.push_back(y);
+				v.push_back(z);
+				continue;
+			}
+
+			// normal
+			if (token[0] == 'v' && token[1] == 'n' && IS_SPACE((token[2]))) {
+				token += 3;
+				float x, y, z;
+				parseFloat3(&x, &y, &z, &token);
+				vn.push_back(x);
+				vn.push_back(y);
+				vn.push_back(z);
+				continue;
+			}
+
+			// texcoord
+			if (token[0] == 'v' && token[1] == 't' && IS_SPACE((token[2]))) {
+				token += 3;
+				float x, y;
+				parseFloat2(&x, &y, &token);
+				vt.push_back(x);
+				vt.push_back(y);
+				continue;
+			}
+
+			// face
+			if (token[0] == 'f' && IS_SPACE((token[1]))) {
+				token += 2;
+				token += strspn(token, " \t");
+
+				std::vector<vertex_index> face;
+				face.reserve(3);
+
+				while (!IS_NEW_LINE(token[0])) {
+					vertex_index vi = parseTriple(&token, static_cast<int>(v.size() / 3),
+						static_cast<int>(vn.size() / 3),
+						static_cast<int>(vt.size() / 2));
+					face.push_back(vi);
+					size_t n = strspn(token, " \t\r");
+					token += n;
+				}
+
+				// TODO: replace with emplace_back + std::move once C++11 is required
+				faceGroup.push_back(std::vector<vertex_index>());
+				faceGroup[faceGroup.size() - 1].swap(face);
+
+				continue;
+			}
+
+			// use mtl
+			if ((0 == strncmp(token, "usemtl", 6)) && IS_SPACE((token[6]))) {
+				char namebuf[TINYOBJ_SSCANF_BUFFER_SIZE];
+				token += 7;
+#ifdef _MSC_VER
+				sscanf_s(token, "%s", namebuf, (unsigned)_countof(namebuf));
+#else
+				sscanf(token, "%s", namebuf);
+#endif
+
+				int newMaterialId = -1;
+				if (material_map.find(namebuf) != material_map.end()) {
+					newMaterialId = material_map[namebuf];
+				}
+				else {
+					// material not found: newMaterialId stays -1 (invalid)
+				}
+
+				if (newMaterialId != material) {
+					// Create per-face material
+					exportFaceGroupToShape(&shape, faceGroup, tags, material, name,
+						triangulate);
+					faceGroup.clear();
+					material = newMaterialId;
+				}
+
+				continue;
+			}
+
+			// load mtl
+			if ((0 == strncmp(token, "mtllib", 6)) && IS_SPACE((token[6]))) {
+				if (readMatFn) {
+					char namebuf[TINYOBJ_SSCANF_BUFFER_SIZE];
+					token += 7;
+#ifdef _MSC_VER
+					sscanf_s(token, "%s", namebuf, (unsigned)_countof(namebuf));
+#else
+					sscanf(token, "%s", namebuf);
+#endif
+
+					std::string err_mtl;
+					bool ok = (*readMatFn)(namebuf, materials, &material_map, &err_mtl);
+					if (err) {
+						(*err) += err_mtl;
+					}
+
+					if (!ok) {
+						faceGroup.clear();  // for safety
+						return false;
+					}
+				}
+
+				continue;
+			}
+
+			// group name
+			if (token[0] == 'g' && IS_SPACE((token[1]))) {
+				// flush previous face group.
+				bool ret = exportFaceGroupToShape(&shape, faceGroup, tags, material, name,
+					triangulate);
+				if (ret) {
+					shapes->push_back(shape);
+				}
+
+				shape = shape_t();
+
+				// material = -1;
+				faceGroup.clear();
+
+				std::vector<std::string> names;
+				names.reserve(2);
+
+				while (!IS_NEW_LINE(token[0])) {
+					std::string str = parseString(&token);
+					names.push_back(str);
+					token += strspn(token, " \t\r");  // skip whitespace after the name
+				}
+
+				assert(names.size() > 0);
+
+				// names[0] must be 'g', so skip the 0th element.
+				if (names.size() > 1) {
+					name = names[1];
+				}
+				else {
+					name = "";
+				}
+
+				continue;
+			}
+
+			// object name
+			if (token[0] == 'o' && IS_SPACE((token[1]))) {
+				// flush previous face group.
+				bool ret = exportFaceGroupToShape(&shape, faceGroup, tags, material, name,
+					triangulate);
+				if (ret) {
+					shapes->push_back(shape);
+				}
+
+				// material = -1;
+				faceGroup.clear();
+				shape = shape_t();
+
+				// @todo { multiple object name? }
+				char namebuf[TINYOBJ_SSCANF_BUFFER_SIZE];
+				token += 2;
+#ifdef _MSC_VER
+				sscanf_s(token, "%s", namebuf, (unsigned)_countof(namebuf));
+#else
+				sscanf(token, "%s", namebuf);
+#endif
+				name = std::string(namebuf);
+
+				continue;
+			}
+
+			if (token[0] == 't' && IS_SPACE(token[1])) {
+				tag_t tag;
+
+				char namebuf[4096];
+				token += 2;
+#ifdef _MSC_VER
+				sscanf_s(token, "%s", namebuf, (unsigned)_countof(namebuf));
+#else
+				sscanf(token, "%s", namebuf);
+#endif
+				tag.name = std::string(namebuf);
+
+				token += tag.name.size() + 1;
+
+				tag_sizes ts = parseTagTriple(&token);
+
+				tag.intValues.resize(static_cast<size_t>(ts.num_ints));
+
+				for (size_t i = 0; i < static_cast<size_t>(ts.num_ints); ++i) {
+					tag.intValues[i] = atoi(token);
+					token += strcspn(token, "/ \t\r") + 1;
+				}
+
+				tag.floatValues.resize(static_cast<size_t>(ts.num_floats));
+				for (size_t i = 0; i < static_cast<size_t>(ts.num_floats); ++i) {
+					tag.floatValues[i] = parseFloat(&token);
+					token += strcspn(token, "/ \t\r") + 1;
+				}
+
+				tag.stringValues.resize(static_cast<size_t>(ts.num_strings));
+				for (size_t i = 0; i < static_cast<size_t>(ts.num_strings); ++i) {
+					char stringValueBuffer[4096];
+
+#ifdef _MSC_VER
+					sscanf_s(token, "%s", stringValueBuffer,
+						(unsigned)_countof(stringValueBuffer));
+#else
+					sscanf(token, "%s", stringValueBuffer);
+#endif
+					tag.stringValues[i] = stringValueBuffer;
+					token += tag.stringValues[i].size() + 1;
+				}
+
+				tags.push_back(tag);
+			}
+
+			// Ignore unknown command.
+		}
+
+		bool ret = exportFaceGroupToShape(&shape, faceGroup, tags, material, name,
+			triangulate);
+		if (ret) {
+			shapes->push_back(shape);
+		}
+		faceGroup.clear();  // for safety
+
+		if (err) {
+			(*err) += errss.str();
+		}
+
+		attrib->vertices.swap(v);
+		attrib->normals.swap(vn);
+		attrib->texcoords.swap(vt);
+
+		return true;
+	}
+
+	bool LoadObjWithCallback(std::istream &inStream, const callback_t &callback,
+		void *user_data /*= NULL*/,
+		MaterialReader *readMatFn /*= NULL*/,
+		std::string *err /*= NULL*/) {
+		std::stringstream errss;
+
+		// material
+		std::map<std::string, int> material_map;
+		int material_id = -1;  // -1 = invalid
+
+		std::vector<index_t> indices;
+		std::vector<material_t> materials;
+		std::vector<std::string> names;
+		names.reserve(2);
+		std::string name;
+		std::vector<const char *> names_out;
+
+		std::string linebuf;
+		while (inStream.peek() != -1) {
+			safeGetline(inStream, linebuf);
+
+			// Trim newline '\r\n' or '\n'
+			if (linebuf.size() > 0) {
+				if (linebuf[linebuf.size() - 1] == '\n')
+					linebuf.erase(linebuf.size() - 1);
+			}
+			if (linebuf.size() > 0) {
+				if (linebuf[linebuf.size() - 1] == '\r')
+					linebuf.erase(linebuf.size() - 1);
+			}
+
+			// Skip if empty line.
+			if (linebuf.empty()) {
+				continue;
+			}
+
+			// Skip leading space.
+			const char *token = linebuf.c_str();
+			token += strspn(token, " \t");
+
+			assert(token);
+			if (token[0] == '\0') continue;  // empty line
+
+			if (token[0] == '#') continue;  // comment line
+
+			// vertex
+			if (token[0] == 'v' && IS_SPACE((token[1]))) {
+				token += 2;
+				float x, y, z, w;  // w is optional. default = 1.0
+				parseV(&x, &y, &z, &w, &token);
+				if (callback.vertex_cb) {
+					callback.vertex_cb(user_data, x, y, z, w);
+				}
+				continue;
+			}
+
+			// normal
+			if (token[0] == 'v' && token[1] == 'n' && IS_SPACE((token[2]))) {
+				token += 3;
+				float x, y, z;
+				parseFloat3(&x, &y, &z, &token);
+				if (callback.normal_cb) {
+					callback.normal_cb(user_data, x, y, z);
+				}
+				continue;
+			}
+
+			// texcoord
+			if (token[0] == 'v' && token[1] == 't' && IS_SPACE((token[2]))) {
+				token += 3;
+				float x, y, z;  // y and z are optional. default = 0.0
+				parseFloat3(&x, &y, &z, &token);
+				if (callback.texcoord_cb) {
+					callback.texcoord_cb(user_data, x, y, z);
+				}
+				continue;
+			}
+
+			// face
+			if (token[0] == 'f' && IS_SPACE((token[1]))) {
+				token += 2;
+				token += strspn(token, " \t");
+
+				indices.clear();
+				while (!IS_NEW_LINE(token[0])) {
+					vertex_index vi = parseRawTriple(&token);
+
+					index_t idx;
+					idx.vertex_index = vi.v_idx;
+					idx.normal_index = vi.vn_idx;
+					idx.texcoord_index = vi.vt_idx;
+
+					indices.push_back(idx);
+					size_t n = strspn(token, " \t\r");
+					token += n;
+				}
+
+				if (callback.index_cb && indices.size() > 0) {
+					callback.index_cb(user_data, &indices.at(0),
+						static_cast<int>(indices.size()));
+				}
+
+				continue;
+			}
+
+			// use mtl
+			if ((0 == strncmp(token, "usemtl", 6)) && IS_SPACE((token[6]))) {
+				char namebuf[TINYOBJ_SSCANF_BUFFER_SIZE];
+				token += 7;
+#ifdef _MSC_VER
+				sscanf_s(token, "%s", namebuf,
+					static_cast<unsigned int>(_countof(namebuf)));
+#else
+				sscanf(token, "%s", namebuf);
+#endif
+
+				int newMaterialId = -1;
+				if (material_map.find(namebuf) != material_map.end()) {
+					newMaterialId = material_map[namebuf];
+				}
+				else {
+					// material not found: newMaterialId stays -1 (invalid)
+				}
+
+				if (newMaterialId != material_id) {
+					material_id = newMaterialId;
+				}
+
+				if (callback.usemtl_cb) {
+					callback.usemtl_cb(user_data, namebuf, material_id);
+				}
+
+				continue;
+			}
+
+			// load mtl
+			if ((0 == strncmp(token, "mtllib", 6)) && IS_SPACE((token[6]))) {
+				if (readMatFn) {
+					char namebuf[TINYOBJ_SSCANF_BUFFER_SIZE];
+					token += 7;
+#ifdef _MSC_VER
+					sscanf_s(token, "%s", namebuf, (unsigned)_countof(namebuf));
+#else
+					sscanf(token, "%s", namebuf);
+#endif
+
+					std::string err_mtl;
+					materials.clear();
+					bool ok = (*readMatFn)(namebuf, &materials, &material_map, &err_mtl);
+					if (err) {
+						(*err) += err_mtl;
+					}
+
+					if (!ok) {
+						return false;
+					}
+
+					if (callback.mtllib_cb) {
+						callback.mtllib_cb(user_data, &materials.at(0),
+							static_cast<int>(materials.size()));
+					}
+				}
+
+				continue;
+			}
+
+			// group name
+			if (token[0] == 'g' && IS_SPACE((token[1]))) {
+				names.clear();
+
+				while (!IS_NEW_LINE(token[0])) {
+					std::string str = parseString(&token);
+					names.push_back(str);
+					token += strspn(token, " \t\r");  // skip whitespace after the name
+				}
+
+				assert(names.size() > 0);
+
+				// names[0] must be 'g', so skip the 0th element.
+				if (names.size() > 1) {
+					name = names[1];
+				}
+				else {
+					name.clear();
+				}
+
+				if (callback.group_cb) {
+					if (names.size() > 1) {
+						// create const char* array.
+						names_out.resize(names.size() - 1);
+						for (size_t j = 0; j < names_out.size(); j++) {
+							names_out[j] = names[j + 1].c_str();
+						}
+						callback.group_cb(user_data, &names_out.at(0),
+							static_cast<int>(names_out.size()));
+
+					}
+					else {
+						callback.group_cb(user_data, NULL, 0);
+					}
+				}
+
+				continue;
+			}
+
+			// object name
+			if (token[0] == 'o' && IS_SPACE((token[1]))) {
+				// @todo { multiple object name? }
+				char namebuf[TINYOBJ_SSCANF_BUFFER_SIZE];
+				token += 2;
+#ifdef _MSC_VER
+				sscanf_s(token, "%s", namebuf, (unsigned)_countof(namebuf));
+#else
+				sscanf(token, "%s", namebuf);
+#endif
+				std::string object_name = std::string(namebuf);
+
+				if (callback.object_cb) {
+					callback.object_cb(user_data, object_name.c_str());
+				}
+
+				continue;
+			}
+
+#if 0  // @todo
+			if (token[0] == 't' && IS_SPACE(token[1])) {
+				tag_t tag;
+
+				char namebuf[4096];
+				token += 2;
+#ifdef _MSC_VER
+				sscanf_s(token, "%s", namebuf, (unsigned)_countof(namebuf));
+#else
+				sscanf(token, "%s", namebuf);
+#endif
+				tag.name = std::string(namebuf);
+
+				token += tag.name.size() + 1;
+
+				tag_sizes ts = parseTagTriple(&token);
+
+				tag.intValues.resize(static_cast<size_t>(ts.num_ints));
+
+				for (size_t i = 0; i < static_cast<size_t>(ts.num_ints); ++i) {
+					tag.intValues[i] = atoi(token);
+					token += strcspn(token, "/ \t\r") + 1;
+				}
+
+				tag.floatValues.resize(static_cast<size_t>(ts.num_floats));
+				for (size_t i = 0; i < static_cast<size_t>(ts.num_floats); ++i) {
+					tag.floatValues[i] = parseFloat(&token);
+					token += strcspn(token, "/ \t\r") + 1;
+				}
+
+				tag.stringValues.resize(static_cast<size_t>(ts.num_strings));
+				for (size_t i = 0; i < static_cast<size_t>(ts.num_strings); ++i) {
+					char stringValueBuffer[4096];
+
+#ifdef _MSC_VER
+					sscanf_s(token, "%s", stringValueBuffer,
+						(unsigned)_countof(stringValueBuffer));
+#else
+					sscanf(token, "%s", stringValueBuffer);
+#endif
+					tag.stringValues[i] = stringValueBuffer;
+					token += tag.stringValues[i].size() + 1;
+				}
+
+				tags.push_back(tag);
+			}
+#endif
+
+			// Ignore unknown command.
+		}
+
+		if (err) {
+			(*err) += errss.str();
+		}
+
+		return true;
+	}
+}  // namespace tinyobj
+
+#endif
+
+#endif  // TINY_OBJ_LOADER_H_
\ No newline at end of file
diff --git a/src/utilities.cpp b/src/utilities.cpp
index 9c06c68..e3fea00 100644
--- a/src/utilities.cpp
+++ b/src/utilities.cpp
@@ -110,3 +110,24 @@ std::istream& utilityCore::safeGetline(std::istream& is, std::string& t) {
         }
     }
 }
+
+glm::vec3 utilityCore::vecMin(glm::vec3 a, glm::vec3 b) {
+  glm::vec3 ret;
+  ret.x = glm::min(a.x, b.x);
+  ret.y = glm::min(a.y, b.y);
+  ret.z = glm::min(a.z, b.z);
+  return ret;
+}
+
+glm::vec3 utilityCore::vecMax(glm::vec3 a, glm::vec3 b) {
+  glm::vec3 ret;
+  ret.x = glm::max(a.x, b.x);
+  ret.y = glm::max(a.y, b.y);
+  ret.z = glm::max(a.z, b.z);
+  return ret;
+}
+
+bool utilityCore::aabbIntersect(glm::vec3 aMin, glm::vec3 aMax, glm::vec3 bMin, glm::vec3 bMax) {
+  return (aMin.x <= bMax.x && aMax.x >= bMin.x) && (aMin.y <= bMax.y && aMax.y >= bMin.y) &&
+    (aMin.z <= bMax.z && aMax.z >= bMin.z);
+}
\ No newline at end of file
diff --git a/src/utilities.h b/src/utilities.h
index abb4f27..d32f8dc 100644
--- a/src/utilities.h
+++ b/src/utilities.h
@@ -23,4 +23,8 @@ namespace utilityCore {
     extern glm::mat4 buildTransformationMatrix(glm::vec3 translation, glm::vec3 rotation, glm::vec3 scale);
     extern std::string convertIntToString(int number);
     extern std::istream& safeGetline(std::istream& is, std::string& t); //Thanks to http://stackoverflow.com/a/6089413
+
+    extern glm::vec3 vecMin(glm::vec3 a, glm::vec3 b);
+    extern glm::vec3 vecMax(glm::vec3 a, glm::vec3 b);
+    extern bool aabbIntersect(glm::vec3 aMin, glm::vec3 aMax, glm::vec3 bMin, glm::vec3 bMax);
 }