publications

publications by category in reverse chronological order. generated by jekyll-scholar.

2026

  1. arXiv
    ProjDevBench: Benchmarking AI Coding Agents on End-to-End Project Development
    Pengrui Lu*, Shiqi Zhang*, Yunzhong Hou*, and 8 more authors
    arXiv preprint arXiv:2602.01655, 2026

2025

  1. arXiv
    InnovatorBench: Evaluating Agents’ Ability to Conduct Innovative LLM Research
    Y. Wu, D. Fu, W. Si, and 9 more authors
    arXiv preprint arXiv:2510.27598, 2025
  2. arXiv
    Interaction as Intelligence Part II: Asynchronous Human-Agent Rollout for Long-Horizon Task Training
    D. Fu, Y. Wu, X. Cai, and 9 more authors
    arXiv preprint arXiv:2510.27630, 2025
  3. arXiv
    Interaction as Intelligence: Deep Research With Human-AI Partnership
    L. Ye, X. Cai, X. Wang, and 8 more authors
    arXiv preprint arXiv:2507.15759, 2025
  4. arXiv
    ParaCook: On Time-Efficient Planning for Multi-Agent Systems
    S. Zhang, X. Ma, Y. Xu, and 7 more authors
    arXiv preprint arXiv:2510.11608, 2025
  5. arXiv
    ResearcherBench: Evaluating Deep AI Research Systems on the Frontiers of Scientific Inquiry
    Tianze Xu*, Pengrui Lu*, Lyumanshan Ye, and 2 more authors
    arXiv preprint arXiv:2507.16280, 2025
  6. arXiv
    DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments
    Yuxiang Zheng, Dayuan Fu, Xiangkun Hu, and 4 more authors
    arXiv preprint arXiv:2504.03160, 2025
  7. arXiv
    Deep Cognition: A Multi-Agent Framework for Collaborative Research with Real-Time Cognitive Oversight
    L. Ye, X. Cai, X. Wang, and 8 more authors
    arXiv preprint, 2025