Submission information
Created: Mon, 10/09/2023 - 14:28
Completed: Mon, 10/09/2023 - 14:28
Changed: Mon, 05/20/2024 - 10:35
Submitted by: John Moustakas
Language: English
Is draft: No
Webform: Project
Project Title | Ultrafast Spectral Energy Distribution Modeling of Galaxies using GPUs |
---|---|
Program | Campus Champions, CAREERS |
Project Image | |
Tags | optimization (509), parallelization (223), astrophysics (297), gpu (80), python (69) |
Status | Complete |
Project Leader | John Moustakas |
Email | jmoustakas@siena.edu |
Mobile Phone | 518-390-4012 |
Work Phone | 518-783-4274 |
Mentor(s) | |
Student-facilitator(s) | Samyak Tuladhar |
Mentee(s) | |
Project Description | One of the most important outstanding problems in observational and theoretical astrophysics is understanding the physical origin and evolution of galaxies. Galaxies are gravitationally bound systems consisting of tens to hundreds of billions of stars, together with gas, dust, and large amounts of dark matter, which we observe across the entire 14-billion-year history of the universe. Fortunately, sophisticated models exist which allow us to interpret the observed spectral energy distributions of galaxies (in essence, how bright they appear in different parts of the electromagnetic spectrum, particularly in the ultraviolet, optical, and infrared) in terms of physical properties such as stellar mass and star-formation rate. For example, the stellar mass of a galaxy reveals how efficiently gas has been converted into stars over the evolutionary history of the galaxy, while the star-formation rate indicates the current rate at which new stars are being born, or whether star formation has ceased entirely. Not surprisingly, the parameter likelihood space which must be explored in order to model observations of galaxies effectively can be very large. In addition, the latest generation of massively multiplexed astrophysical surveys, such as the Dark Energy Spectroscopic Instrument (DESI) survey, is observing samples of tens of millions of galaxies. Consequently, there is an acute need for massively parallelized, computationally efficient code which can extract astrophysically meaningful constraints from large observational datasets of galaxies. The open-source Python software package needed to carry out this project is called FastSpecFit (https://fastspecfit.readthedocs.org/en/latest). The code is reasonably well-documented and has already been run on a high-performance computing system on samples of millions of galaxies observed by DESI.
Two computational bottlenecks, however, currently prevent FastSpecFit from being deployed at the next scale, in terms of both input sample size and the complexity of the underlying astrophysical models. These bottlenecks are non-negative least-squares (NNLS) and non-linear least-squares fitting, both of which are currently done using the CPU-optimized SciPy library. With these issues in mind, the goal of this project is to port the computational "heart" of FastSpecFit to GPUs. We propose using JAX (https://jax.readthedocs.io/en/latest), which uses automatic (or computational) differentiation for optimization. Specifically, the open-source project JAXopt (https://jaxopt.github.io/stable) includes well-tested algorithms for solving a wide range of both linear and non-linear constrained optimization problems using GPU-accelerated automatic differentiation. After testing these algorithms on simple (simulated) datasets, we will implement an optional GPU version of FastSpecFit and ultimately test it on actual DESI data. |
Project Deliverables | This project includes three major deliverables: 1. Documentation which clearly describes how all software products and their dependencies (particularly JAX and JAXopt) should be installed and run, both with and without GPUs. 2. Executable, well-documented code which solves both simulated and real-data bounded non-linear least-squares problems. 3. Comparisons (via benchmarking runs) of existing CPU (e.g., scipy.optimize) and GPU/JAX implementations of the identical problems. |
Student Research Computing Facilitator Profile | Samyak (Sam) Tuladhar (sd10tula@siena.edu) is a sophomore physics major at Siena College with both the interest and the technical background needed to undertake this project. |
Mentee Research Computing Profile | |
Student Facilitator Programming Skill Level | Some hands-on experience |
Mentee Programming Skill Level | |
Project Institution | Siena College |
Project Address | Department of Physics and Astronomy 515 Loudon Rd Loudonville, New York. 12211 |
Anchor Institution | CR-Rensselaer Polytechnic Institute |
Preferred Start Date | 12/01/2023 |
Start as soon as possible. | No |
Project Urgency | 3 (on a scale from "Already behind" to "Start date is flexible") |
Expected Project Duration (in months) | 6 |
Launch Presentation | |
Launch Presentation Date | 01/05/2024 |
Wrap Presentation | |
Wrap Presentation Date | 05/17/2024 |
Project Milestones | |
Github Contributions | https://github.com/Samyak-DT/FasterSpecFit |
Planned Portal Contributions (if any) | |
Planned Publications (if any) | If the project is successful, I anticipate describing the proposed work and its outcomes in a larger publication, which will most likely be submitted to The Astrophysical Journal, one of the top astrophysical journals in the world. Alternatively, depending on the interests of the student, we could prepare a shorter, more technical paper and submit it to a GPU/HPC computing journal (TBD). |
What will the student learn? | The student will learn how deploying GPUs on HPC systems can lead to significant improvements in computing speed, and how those speed-ups directly improve our ability to do science with large astronomical datasets. The student will also improve their Python programming skills and learn how to clearly document and communicate their results to collaborators with a wide range of technical backgrounds. |
What will the mentee learn? | |
What will the Cyberteam program learn from this project? | JAX and JAXopt are powerful tools for a range of applications in scientific computing, machine learning, artificial intelligence, and more. The Cyberteam will gain documentation and example code demonstrating how these libraries can be deployed on GPUs on HPC systems, along with benchmarked, well-documented code illustrating how they can be applied to solve one specific class of astrophysics problems. |
HPC resources needed to complete this project? | We will need access to a multi-node GPU system and a modern software architecture with an isolated software environment where all the code dependencies can be installed (Python, JAX, etc.). |
Notes | |
What is the impact on the development of the principal discipline(s) of the project? | |
What is the impact on other disciplines? | |
Is there an impact on physical resources that form infrastructure? | |
Is there an impact on the development of human resources for research computing? | |
Is there an impact on institutional resources that form infrastructure? | |
Is there an impact on information resources that form infrastructure? | |
Is there an impact on technology transfer? | |
Is there an impact on society beyond science and technology? | |
Lessons Learned | |
Overall results |
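
Illustrative sketch (not a project result): the Project Description names non-negative least squares (NNLS) as one of the two bottlenecks and proposes an optional GPU code path. The example below shows, under stated assumptions, what a backend-agnostic NNLS solve on simulated data could look like. It uses plain projected gradient descent as a stand-in chosen for brevity; it is not the algorithm used by FastSpecFit, SciPy, or JAXopt. The function names, the problem size, and the try/except backend switch are hypothetical choices made for this illustration only.

```python
import numpy as np

# Optional accelerator backend (an assumption of this sketch): use JAX when it
# is installed, otherwise fall back to NumPy so the code still runs on a CPU.
try:
    import jax
    import jax.numpy as xp
    jit = jax.jit
except ImportError:
    xp = np

    def jit(func):
        return func


@jit
def projected_gradient_step(x, A, b, step):
    """One gradient step on 0.5 * ||A x - b||^2, then clip to x >= 0."""
    gradient = A.T @ (A @ x - b)
    return xp.maximum(x - step * gradient, 0.0)


def nnls_projected_gradient(A, b, n_iter=2000):
    """Hypothetical helper: approximate NNLS by projected gradient descent."""
    A = xp.asarray(A)
    b = xp.asarray(b)
    # Step size 1/L, where L = ||A||_2^2 bounds the curvature of the objective.
    step = 1.0 / xp.linalg.norm(A, 2) ** 2
    x = xp.zeros(A.shape[1], dtype=A.dtype)
    for _ in range(n_iter):
        x = projected_gradient_step(x, A, b, step)
    return x


# Simulated, noise-free problem: 50 "pixels" and 5 non-negative "templates".
rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, size=(50, 5))
x_true = np.array([1.5, 0.0, 0.7, 0.0, 2.0])
b = A @ x_true

x_fit = np.asarray(nnls_projected_gradient(A, b))
print("recovered coefficients:", np.round(x_fit, 3))
```

Because the simulated data are noise-free and the design matrix has full column rank, the recovered coefficients should match `x_true` to within numerical precision (JAX defaults to 32-bit floats). For a benchmark of the kind listed under Project Deliverables, the same `A` and `b` could be passed to a CPU reference such as `scipy.optimize.nnls`, which returns the solution and the residual norm, and the two answers and run times compared.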