
Welcome to the Parallel Architecture, System, and Algorithm Lab


University of California, Merced
The Parallel Architecture, System, and Algorithm (PASA) Lab in the Department of Electrical Engineering and Computer Science at the University of California, Merced performs research in core technologies for large-scale parallel systems. The core theme of our research is how to enable scalable and efficient execution of applications (especially machine learning and artificial intelligence workloads) on increasingly complex large-scale parallel systems. Our work creates innovations in runtime systems, architecture, performance modeling, and programming models. We also investigate the impact of novel architectures (e.g., CXL-based memory and accelerators with massive parallelism) on the design of applications and runtimes. Our goal is to improve the performance, reliability, energy efficiency, and productivity of large-scale parallel systems. PASA is part of the High Performance Computing Systems and Architecture Group at UC Merced.
See our Research and Publications pages for more information about our work. For information about our group members, see our People page.
[6/2025] Our work on using CXL memory for inter-node communication is accepted to SC'25. This work is a collaboration with SK Hynix.
[6/2025] Our work on using big memory and memoization to accelerate laminography reconstruction is accepted to SC'25. This work is a collaboration with Argonne National Laboratory (ANL).
[6/2025] Welcome new PhD student, Han Meng! :)
[4/2025] Our team won 2nd place in the ASPLOS'25 / EuroSys'25 Contest on an Optimized Neuron Kernel Interface (NKI) Implementation of Llama 3.2 1B (Inference)!
[1/2025] Our work on the CXL memory evaluation (Performance Characterization of CXL Memory and Its Use Cases) has been accepted to IPDPS'25.
[12/2024] Many thanks to MICRON for their generous donation of CXL memory hardware!
[10/2024] Our collaborative work with Meta on using memory tiering for recommendation models has been accepted to HPCA'25. :)
[10/2024] Our collaborative work with Microsoft on using memory tiering for GNN has been accepted to HPCA'25.
[6/2024] Our paper on CXL memory (Efficient Tensor Offloading for Large Deep-Learning Model Training based on Compute Express Link) is accepted by SC'24.
[6/2024] Many thanks to AMD for their hardware donation!
[4/2024] Our paper on tiered memory (FlexMem: Adaptive Page Profiling and Migration for Tiered Memory) is accepted by USENIX ATC'24.
[4/2024] Multiple undergraduate and master's students (from UC Merced, UMN, UIUC, and Wisconsin) will join the lab for summer internships. :)
[4/2024] Dong Li was invited to give a talk at the Empowering Software through Machine Learning (ESwML) workshop associated with EuroSys'24.
[3/2024] Dong Li was invited to join two panels in the ExHET'24 and GPGPU'24 workshops associated with PPoPP'24.
[1/2024] Our paper on tiered memory ("Rethinking Memory Profiling and Migration for Multi-Tiered Large Memory Systems") is accepted by EuroSys'24.