This was my final project for my Master’s in Electrical Engineering at the University of Washington, completed in 2024. I worked with a group of six graduate students to implement a parallel QR decomposition algorithm using CUDA. The project was a collaboration with Amazon Lab126 to develop Simultaneous Localization and Mapping (SLAM) for robotics. QR decomposition can be used to solve the least-squares optimization problems that estimate robot pose, a key part of SLAM algorithms.
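To make the least-squares connection concrete, here is a minimal sequential sketch in plain Python. It is not our CUDA implementation: it assumes a textbook Householder QR, a small dense matrix with full column rank, and the function name `householder_qr_solve` is made up for illustration. The idea is that once A = QR, minimizing ‖Ax − b‖ reduces to back substitution on Rx = Qᵀb.

```python
import math


def householder_qr_solve(A, b):
    """Solve min ||Ax - b||_2 for an m x n matrix A (m >= n, full column rank).

    Illustrative sketch only: applies Householder reflections to A and b
    in turn, then back-substitutes on the resulting upper-triangular system.
    """
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]  # working copy; becomes upper triangular
    y = b[:]                   # becomes Q^T b

    for k in range(n):
        # Column k from the diagonal down.
        x = [R[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            raise ValueError("matrix is rank deficient")

        # Householder vector v = x - alpha * e1, with the sign of alpha
        # chosen to avoid cancellation.
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        vtv = sum(t * t for t in v)

        # Apply H = I - 2 v v^T / (v^T v) to the remaining columns of R.
        for j in range(k, n):
            dot = sum(v[i] * R[k + i][j] for i in range(m - k))
            scale = 2.0 * dot / vtv
            for i in range(m - k):
                R[k + i][j] -= scale * v[i]

        # Apply the same reflection to the right-hand side.
        dot = sum(v[i] * y[k + i] for i in range(m - k))
        scale = 2.0 * dot / vtv
        for i in range(m - k):
            y[k + i] -= scale * v[i]

    # Back substitution on the top n x n block of R.
    x_sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = y[i] - sum(R[i][j] * x_sol[j] for j in range(i + 1, n))
        x_sol[i] = s / R[i][i]
    return x_sol


# Toy usage: fit y = c0 + c1 * t to four points that lie on y = 1 + 2t.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 7.0]
coeffs = householder_qr_solve(A, b)
print(coeffs)
```

In the toy fit, the recovered coefficients should match 1 and 2 up to floating-point rounding. A SLAM solver works with far larger systems, which is where a parallel GPU factorization becomes worthwhile.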
We developed our algorithm targeting a PC running Ubuntu with a GeForce RTX 2070 GPU. The algorithm was developed “from scratch” without the use of linear algebra libraries (e.g. CUTLASS). Here’s a clip from one of our presentations, where I explain the mathematics.