--- /dev/null
+Class Project for CSC-718: Operating Systems and Parallel Computing
+
+ Particle Simulator
+ Brendan Hansen
+
+DESCRIPTION
+ This project started with the goal of simulating the N-body problem.
+ That turned out to be fairly easy, and I finished with enough time to
+ spare that I decided to extend the problem statement. I was inspired
+ by a YouTube video in which a particle simulator with a specific set
+ of rules produced interesting, emergent behaviour, similar to Conway's
+ Game of Life [1]. After seeing it, I wanted to build my own version on
+ top of the backbone I had already written for this project.
+
+ This project is my implementation of the idea from Reference [1]. I
+ began by creating a single-threaded solution, followed by a POSIX
+ threads solution, and finally an OpenMP solution. I chose these
+ libraries because I needed a shared-memory model and a live visual
+ of the simulation. With these libraries, I could easily use OpenGL
+ and GLFW to create the visuals.
+
+ The particle simulator functions as follows:
+ - Initially, particles are randomly placed in a finite space, with
+ a random "kind", denoted by their color.
+ - Every kind of particle has a set of parameters relating it to the
+ other kinds of particles. These rules describe a function
+ implemented in src/physics.cpp.
+ - These parameters do not describe a symmetric relation, meaning
+ particle A may be attracted to particle B while particle B is
+ repelled by particle A. This results in a "chase", where A
+ moves towards B but B runs away from A. This breaks Newton's
+ third law of motion, so momentum and energy are no longer
+ conserved, and without a countermeasure the "chase" would let
+ the particles speed up without bound.
+ - To combat this, every particle has the same amount of friction
+ applied to it, dependent on its velocity.
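+
+ The rules above can be sketched roughly as follows. This is only an
+ illustration: the real force function lives in src/physics.cpp and is
+ not reproduced here, so the linear fall-off, the cutoff MAX_RANGE, the
+ FRICTION constant, and every name in the block are assumptions.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical particle layout; the project's real struct may differ.
struct Particle {
    float x, y;    // position
    float vx, vy;  // velocity
    int kind;      // index into the attraction table (drawn as a color)
};

const int KINDS = 2;
// attraction[a][b]: how strongly kind a is pulled toward kind b.
// Deliberately NOT symmetric: kind 0 chases kind 1, kind 1 flees kind 0.
const float attraction[KINDS][KINDS] = {
    { 0.0f, 1.0f},
    {-1.0f, 0.0f},
};
const float MAX_RANGE = 10.0f;  // assumed cutoff; no force at or beyond it
const float FRICTION  = 0.1f;   // assumed friction coefficient

// Signed force that `other` exerts on `self` at distance d.
// Positive pulls self toward other, negative pushes it away.
float force_between(const Particle& self, const Particle& other, float d) {
    if (d <= 0.0f || d >= MAX_RANGE) return 0.0f;
    return attraction[self.kind][other.kind] * (1.0f - d / MAX_RANGE);
}

// Step 1: update the velocity of particle i from every other particle,
// then subtract friction proportional to its current velocity.
void update_velocity(Particle* ps, int n, int i, float dt) {
    float ax = 0.0f, ay = 0.0f;
    for (int j = 0; j < n; j++) {
        if (j == i) continue;
        float dx = ps[j].x - ps[i].x;
        float dy = ps[j].y - ps[i].y;
        float d = std::sqrt(dx * dx + dy * dy);
        float f = force_between(ps[i], ps[j], d);
        if (f != 0.0f) {  // f != 0 implies d > 0, so the division is safe
            ax += f * dx / d;
            ay += f * dy / d;
        }
    }
    ps[i].vx += (ax - FRICTION * ps[i].vx) * dt;
    ps[i].vy += (ay - FRICTION * ps[i].vy) * dt;
}

// Step 2: move the particle using its updated velocity.
void move_particle(Particle& p, float dt) {
    p.x += p.vx * dt;
    p.y += p.vy * dt;
}
```

+ With one particle of kind 0 to the left of one particle of kind 1,
+ both end up with a velocity pointing right, which is the "chase".
+ Under a constant acceleration a, this friction term lets the speed
+ level off near a / FRICTION instead of growing forever.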
+
+ Every interesting interaction that happens is just a result of the
+ simple physics equations.
+
+ To make the solution faster, I noted that the force between any two
+ kinds of particles only acts up to a maximum distance. This means that
+ of the N bodies any body needs to consider, only a portion of them are
+ actually close enough to matter. Because of this, I employed a Quad
+ Tree [2]. This let me turn the physics calculation from O(n^2) to
+ O(n*log n).
+ However, there is an extra step that needs to be taken before each
+ physics step: the Quad Tree must be recomputed. This takes O(n) time,
+ so the total physics time is still O(n*log n), which is better than
+ O(n^2). When the particles are very bunched up, the Quad Tree helps
+ much less and can actually slow down the implementation, because each
+ lookup returns many more neighbours and the cost becomes closer to
+ O(n*sqrt(n)) on top of the work of maintaining the tree. However, in
+ much of my testing with random parameters, the particles generally
+ spread out, so the Quad Tree does help the implementation.
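+
+ As a rough sketch of the idea (not the project's actual data
+ structure), a point quad tree that answers "how many bodies are near
+ this position" could look like the block below. The node layout, the
+ CAPACITY of 4, and the names insert and count_near are assumptions
+ made for illustration.

```cpp
#include <cassert>

struct Point { float x, y; };

// Hypothetical node: a square region that stores up to CAPACITY points
// itself and hands any further points down to four child squares.
struct QuadTree {
    static const int CAPACITY = 4;
    float cx, cy, half;   // centre and half-width of the square
    int count;            // number of points stored in this node
    Point points[CAPACITY];
    QuadTree* child[4];

    QuadTree(float cx_, float cy_, float half_)
        : cx(cx_), cy(cy_), half(half_), count(0) {
        for (int i = 0; i < 4; i++) child[i] = nullptr;
    }
    ~QuadTree() { for (int i = 0; i < 4; i++) delete child[i]; }

    bool contains(Point p) const {
        return p.x >= cx - half && p.x < cx + half &&
               p.y >= cy - half && p.y < cy + half;
    }

    // Insert without a bounds check; the caller guarantees p is inside.
    void put(Point p) {
        if (count < CAPACITY) { points[count++] = p; return; }
        if (child[0] == nullptr) {
            float h = half / 2;
            child[0] = new QuadTree(cx - h, cy - h, h);
            child[1] = new QuadTree(cx + h, cy - h, h);
            child[2] = new QuadTree(cx - h, cy + h, h);
            child[3] = new QuadTree(cx + h, cy + h, h);
        }
        // Pick the quadrant by comparing against the centre.
        int q = (p.x >= cx ? 1 : 0) + (p.y >= cy ? 2 : 0);
        child[q]->put(p);
    }

    // Returns false if p lies outside the region covered by the tree.
    bool insert(Point p) {
        if (!contains(p)) return false;
        put(p);
        return true;
    }

    // Count the points within distance r of (x, y). Whole subtrees whose
    // square cannot touch the query are skipped without being visited.
    int count_near(float x, float y, float r) const {
        if (x + r < cx - half || x - r > cx + half ||
            y + r < cy - half || y - r > cy + half) return 0;
        int found = 0;
        for (int i = 0; i < count; i++) {
            float dx = points[i].x - x, dy = points[i].y - y;
            if (dx * dx + dy * dy <= r * r) found++;
        }
        if (child[0] != nullptr)
            for (int i = 0; i < 4; i++)
                found += child[i]->count_near(x, y, r);
        return found;
    }
};
```

+ When the bodies are spread out, most subtrees are rejected by the
+ first test in count_near, which is where the saving over checking
+ every pair comes from. When they are bunched up, few subtrees can be
+ skipped and the tree adds overhead instead.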
+
+ My methodology for improving the performance with parallel computing was
+ as follows:
+ - Break the bodies into blocks that each thread will update.
+ - One of the threads needs to update the quad tree. This can
+ only be done on a single thread, because the way I build the
+ quad tree is not parallelizable.
+ - The master thread will still need to handle the drawing (this
+ is required by GLFW).
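+
+ The block division in the first point can be sketched as below. The
+ Range struct and the block_range function are hypothetical names, and
+ the project may split the bodies differently.

```cpp
#include <cassert>

struct Range { int begin, end; };  // half-open interval [begin, end)

// Split n bodies into thread_count contiguous blocks whose sizes differ
// by at most one, and return the block owned by thread_id.
Range block_range(int thread_id, int thread_count, int n) {
    assert(thread_count > 0 && thread_id >= 0 && thread_id < thread_count);
    Range r;
    // long long avoids overflow in n * thread_id for large body counts.
    r.begin = (int)((long long)n * thread_id / thread_count);
    r.end   = (int)((long long)n * (thread_id + 1) / thread_count);
    return r;
}
```

+ For 10 bodies and 4 threads this gives the blocks [0,2), [2,5), [5,7)
+ and [7,10). Each thread then loops only over its own block, so no two
+ threads ever write to the same body.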
+
+ So I took the sequence of tasks that looked like this:
+ 1. Handle user input
+ 2. Update quad tree
+ 3. Calculate all body movements
+ 4. Move all bodies
+ 5. Draw the scene
+
+ To this:
+ Main Thread                  Worker threads
+ 1. Handle user input         1. Wait
+ 2. Calculate some            2. Calculate some
+    body movements               body movements
+ 3. Move some bodies          3. Move some bodies
+ 4. Draw the scene            4. Update quad tree (ONE worker)
+
+ Notice two things. First, calculating the body movements and moving
+ the bodies have been block divided across all the workers and the
+ main thread. Second, the quad tree update now happens while the main
+ thread is drawing. This works because the quad tree only needs the
+ positions of all the bodies, which are final once the bodies have
+ been moved. For this reason, in the POSIX threads version, the
+ thread_count variable must be at least 2; otherwise there is no
+ worker thread and the quad tree is never updated.
+
+ The OpenMP version is very similar, but it does not have the issue
+ of thread_count needing to be at least 2.
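+
+ A minimal sketch of how the two block-divided phases might be written
+ with OpenMP is shown below. It is not taken from sim_omp: the Body
+ struct, the toy one-dimensional force, and the step function are
+ assumptions, and the drawing and quad tree update are left out.

```cpp
#include <cassert>

struct Body { float x, vx; };  // one dimension is enough for the sketch

// Advance all bodies by dt in two phases, mirroring "calculate all body
// movements" followed by "move all bodies".
void step(Body* bodies, int n, float dt) {
    assert(n > 0);

    // Phase 1: each iteration reads every position but writes only its
    // own velocity, so the iterations are independent of each other.
    // schedule(static) without a chunk size hands each thread one
    // contiguous block of iterations.
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
        float a = 0.0f;
        for (int j = 0; j < n; j++)
            a += bodies[j].x - bodies[i].x;  // toy attraction, hypothetical
        bodies[i].vx += (a / n) * dt;
    }
    // The implicit barrier at the end of the loop above guarantees that
    // every velocity is final before any position changes.

    // Phase 2: each iteration touches only its own body.
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        bodies[i].x += bodies[i].vx * dt;
}
```

+ Keeping the two phases in separate loops matters: if a body were
+ moved inside the first loop, other threads could read a mix of old
+ and new positions. Built without -fopenmp the pragmas are ignored and
+ the same code runs sequentially.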
+
+ Also, you may notice that the code is in C++ but does not use the
+ C++ standard library. This is because I wanted complete control over
+ all of the code and did not want to use these heavyweight template
+ libraries. As a result, much of the code reads more like C with
+ templates and member functions.
+
+DEPENDENCIES
+ To compile every version, you need:
+ - A Linux system
+ - g++
+ - openmp
+ - pthreads
+ - glfw (>= 3.0.0)
+ - OpenGL ES 3.0
+
+COMPILING
+ If everything is installed correctly, you should just have to run:
+ $ make clean all
+
+ This will ensure that all the old object files are deleted and you
+ have a clean build of all versions of the program.
+
+RUNNING
+ There are three executables made by the Makefile: sim, sim_seq, sim_omp.
+ - sim is the pthread version of the program.
+ - sim_seq is the sequential version of the program.
+ - sim_omp is the openmp version of the program.
+
+ All of them can be run on their own, e.g. ./sim. They can also take a
+ settings file as their only argument:
+ $ ./sim <settings_file>
+
+ The description of the settings file can be found in explain.settings.
+ It also just happens to be a valid settings file.
+
+ If no settings file is given, the program defaults to random settings,
+ which are automatically saved to "generated.settings" in the current
+ directory.
+
+USING THE SOFTWARE
+ The simulation will start right away when the program launches.
+ Here is a list of the bound keys:
+ up, down, left, and right: pan around the simulation.
+ q: zoom in
+ a: zoom out
+ f: toggle fullscreen
+ escape: quit the program
+
+ There is no mouse support: I never needed it myself, and I ran out of
+ time to add it.
+
+PERFORMANCE MEASUREMENTS
+ I benchmarked the various versions of the program on my Thinkpad X1 Carbon
+ with the following specifications:
+ - i7-7500U @ 2.7GHz (up to 3.5GHz turbo)
+ - 16GB LPDDR3 RAM
+ - NVMe 256GB SSD
+
+ Each cell contains the average FPS when run on speed_test.settings
+ with the given thread count and particle count. A dash marks a
+ combination that does not apply to that version. Note that GLFW was
+ frame limited by the vertical synchronization of the display, so the
+ highest possible FPS is 60.
+
+ Sequential version:
+                     Thread count
+ Particle Count     1     2     3     4
+ 1000              60     -     -     -
+ 2000              60     -     -     -
+ 2500              53     -     -     -
+ 3000              38     -     -     -
+ 3500              30     -     -     -
+ 4000              23     -     -     -
+ 5000              15     -     -     -
+
+ PThread version:
+                     Thread count
+ Particle Count     1     2     3     4
+ 1000               -    60    60    60
+ 2000               -    60    60    60
+ 2500               -    60    60    60
+ 3000               -    60    60    60
+ 3500               -    48    60    60
+ 4000               -    40    48    60
+ 5000               -    25    30    41
+
+ OpenMP version:
+                     Thread count
+ Particle Count     1     2     3     4
+ 1000              60    60    60    60
+ 2000              60    60    60    60
+ 2500              51    60    60    60
+ 3000              39    60    60    60
+ 3500              30    50    60    60
+ 4000              23    45    48    60
+ 5000              15    28    31    42
+
+ANALYSIS OF PERFORMANCE
+ As expected, and as the measurements confirm, the performance of
+ the program increases as the number of threads increases. The
+ increase is not linear, because some parts of the program, such as
+ the quad tree update and the drawing, are not amenable to parallel
+ programming. I am happy with the performance of the program, as it
+ can simulate up to 4000 particles on my laptop with 4 threads
+ while maintaining 60 frames per second.
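+
+ To put a number on "not linear", speedup and parallel efficiency can
+ be computed from the tables. The 5000-particle row is the only one
+ where no column hits the 60 FPS cap, so it is the fairest row to use.
+ The helper names below are made up for this sketch.

```cpp
#include <cassert>

// Speedup of a parallel run over the sequential run, using average FPS
// as the measure of throughput.
double speedup(double fps_parallel, double fps_sequential) {
    assert(fps_sequential > 0.0);
    return fps_parallel / fps_sequential;
}

// Parallel efficiency: the fraction of ideal linear speedup achieved.
double efficiency(double achieved_speedup, int threads) {
    assert(threads > 0);
    return achieved_speedup / threads;
}
```

+ At 5000 particles the sequential version runs at 15 FPS. With 4
+ threads the PThread version reaches 41 FPS, a speedup of about 2.73
+ and an efficiency of about 0.68. The OpenMP version reaches 42 FPS,
+ a speedup of 2.8 and an efficiency of 0.70.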
+
+ANALYSIS OF PROGRAMMING ENJOYMENT
+ Although OpenMP is advertised as an easy way to quickly convert a
+ sequential program into a parallel one, the transition is not easy
+ in all circumstances, and it was not in mine. This program was
+ rather complicated, and I ended up having to refactor much of the
+ main loop in order to accommodate the OpenMP solution. I found it
+ much easier to think about the PThread solution, as I could easily
+ see EXACTLY how the program was parallelized. I know OpenMP has its
+ benefits, but I did not experience them on this project.
+
+REFERENCES
+ [1] "Particle Life - A Game of Life Made of Particles"
+ https://www.youtube.com/watch?v=Z_zmZ23grXE
+
+ [2] "Quadtree"
+ https://en.wikipedia.org/wiki/Quadtree