Minghao Yan
Publications
PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution
Large Language Models (LLMs) have emerged as powerful operators for evolutionary search, yet the design of efficient search scaffolds …
Minghao Yan, Bo Peng, Benjamin Coleman, Ziqi Chen, Zhouhang Xie, Shuo Chen, Zhankui He, Noveen Sachdeva, Isabella Ye, Weili Wang, Chi Wang, Ed H. Chi, Fernando Pereira, Wang-Cheng Kang, Derek Zhiyuan Cheng, Beidou Wang
TABED: Test-Time Adaptive Ensemble Drafting for Robust Speculative Decoding in LVLMs
Speculative decoding (SD) has proven effective for accelerating LLM inference by quickly generating draft tokens and verifying them in …
Minjae Lee, Wonjun Kang, Byeongkeun Ahn, Christian Classen, Kevin Galim, Seunghyuk Oh, Minghao Yan, Hyung Il Koo, Kangwook Lee
What Limits Agentic Systems Efficiency?
Large Language Models (LLMs), such as OpenAI-o1 and DeepSeek-R1, have demonstrated strong reasoning capabilities. To further enhance …
Song Bian, Minghao Yan, Anand Jayarajan, Gennady Pekhimenko, Shivaram Venkataraman
Diamond: Harnessing GPU Resources for Scientific Deep Learning
Modern research computing cyberinfrastructure, such as ACCESS-CI and NAIRR Pilot, offers GPU resources across geographically …
Haotian Xie, Rohan Marwaha, Minu Mathew, Song Bian, Gengcong Yang, Minghao Yan, Yadu Babuji, Owen Price, Yinzhi Wang, Volodymyr Kindratenko, Shivaram Venkataraman, Kyle Chard, Ian T. Foster, Zhao Zhang
PLoRA: Efficient LoRA Hyperparameter Tuning for Large Models
Low-rank Adaptation (LoRA) has gained popularity as a fine-tuning approach for Large Language Models (LLMs) due to its low resource …
Minghao Yan, Zhuang Wang, Zhen Jia, Shivaram Venkataraman, Yida Wang
Scaling Inference-Efficient Language Models
Scaling laws are powerful tools to predict the performance of large language models. However, current scaling laws fall short of …
Song Bian, Minghao Yan, Shivaram Venkataraman
Humanity’s Last Exam
Scale AI
Decoding Speculative Decoding
Speculative Decoding is a widely used technique to speed up inference for Large Language Models (LLMs) without sacrificing quality. …
Minghao Yan, Saurabh Agarwal, Shivaram Venkataraman
PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices
As neural networks (NN) are deployed across diverse sectors, their energy demand correspondingly grows. While several prior works have …
Minghao Yan, Hongyi Wang, Shivaram Venkataraman
Distributed SLIDE: Enabling Training Large Neural Networks on Low Bandwidth and Simple CPU-Clusters via Model Parallelism and Sparsity
More than 70% of cloud computing is paid for but sits idle. A large fraction of these idle compute are cheap CPUs with few cores that …
Minghao Yan, Nicholas Meisburger, Tharun Medini, Anshumali Shrivastava
PairConnect: A Compute-Efficient MLP Alternative to Attention
Zhaozhuo Xu, Minghao Yan, Junyan Zhang, Anshumali Shrivastava
Fast Processing and Querying of 170TB of Genomics Data via a Repeated And Merged BloOm Filter (RAMBO)
Gaurav Gupta, Minghao Yan, Benjamin Coleman, Bryce Kille, R. A. Leo Elworth, Tharun Medini, Todd Treangen, Anshumali Shrivastava