I am a senior undergraduate student in Computer Science at Shanghai Jiao Tong University (SJTU), where I am a member of the ACM Honors Class, an elite program for the top 5% of students with a spirit of innovation and challenge.
My research focuses on LLM/AI evaluation and benchmarking, including building robust evaluation frameworks for large language models, multimodal systems, and AI coding agents.
An end-to-end benchmark that evaluates AI coding agents on complete project development, combining Online Judge testing with LLM-assisted code review across 20 programming problems in 8 categories.
@article{lu2026projdevbench,
  title={ProjDevBench: Benchmarking AI Coding Agents on End-to-End Project Development},
  author={Lu, Pengrui and Zhang, Shiqi and Hou, Yunzhong and Ye, Lyumanshan and Huang, Chaoyi and Chen, Zixi and Zeng, Ji and Jiang, Hantao and Liu, Pengfei and Wang, Yiwei and Yang, Ming-Hsuan},
  year={2026},
  journal={arXiv preprint arXiv:2602.01655},
}
arXiv
ResearcherBench: Evaluating Deep AI Research Systems on the Frontiers of Scientific Inquiry
Tianze Xu*, Pengrui Lu*, Lyumanshan Ye, Xiangkun Hu, and Pengfei Liu
The first benchmark focused on evaluating the capabilities of Deep AI Research Systems (DARS) on frontier AI scientific questions, featuring 65 expertly curated research questions across 35 distinct AI research subjects with a dual assessment framework.
@article{lu2025researcherbench,
  title={ResearcherBench: Evaluating Deep AI Research Systems on the Frontiers of Scientific Inquiry},
  author={Xu, Tianze and Lu, Pengrui and Ye, Lyumanshan and Hu, Xiangkun and Liu, Pengfei},
  year={2025},
  journal={arXiv preprint arXiv:2507.16280},
}
arXiv
DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments
Yuxiang Zheng, Dayuan Fu, Xiangkun Hu, Xiaojie Cai, Lyumanshan Ye, Pengrui Lu, and Pengfei Liu
The first comprehensive framework for end-to-end training of LLM-based deep research agents through scaling reinforcement learning in real-world environments with authentic web search interactions.
@article{zheng2025deepresearcher,
  title={DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments},
  author={Zheng, Yuxiang and Fu, Dayuan and Hu, Xiangkun and Cai, Xiaojie and Ye, Lyumanshan and Lu, Pengrui and Liu, Pengfei},
  year={2025},
  journal={arXiv preprint arXiv:2504.03160},
}