Hi, my name is Mingxuan Song.
I'm currently pursuing my PhD at Peking University, under the supervision of Professor Zhen Xiao.


About me


I am currently a PhD student in Computer Systems Architecture at the School of Computer Science, Peking University, under the supervision of Professor Zhen Xiao, with an expected graduation in July 2028. I received my bachelor's degree in computer science and technology from China University of Geosciences, Wuhan, in 2023. My research interests include reinforcement learning (RL), sharding blockchain, and large language models (LLMs).

My goal is to continuously grow both professionally and personally, while maintaining a healthy and fulfilling lifestyle. I am also actively seeking internship and job opportunities worldwide. Feel free to contact me!

📍 Location: Beijing, China

🎯 Hobbies: 🏸 Badminton | 🎱 Billiards | 🏓 Table Tennis | 🏃‍♂️ Running


📚 Publications


  • [WWW 2026, CCF-A] Mingxuan Song, Yusen Huo, Bohan Zhou, Shenglin Yin, Zhen Xiao*, Jieyi Long, Zhilin Zhang*, Yu Chuan. "DARA: Few-shot Budget Allocation in Online Advertising via In-Context Decision Making with RL-Finetuned LLMs." Proceedings of the Web Conference, April 2026. [PDF]
  • [WWW 2025, Oral, CCF-A] Mingxuan Song, Pengze Li, Bohan Zhou, Shenglin Yin, Zhen Xiao*, Jieyi Long. "AERO: Enhancing Sharding Blockchain via Deep Reinforcement Learning for Account Migration." Proceedings of the Web Conference, May 2025. [PDF]
  • [CVPR 2024, CCF-A] Shenglin Yin, Zhen Xiao*, Mingxuan Song, Jieyi Long. "Adversarial Distillation Based on Slack Matching and Attribution Region Alignment." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2024. [PDF]
  • [WWW 2024, Oral, CCF-A] Pengze Li, Mingxuan Song, Mingzhe Xing, Zhen Xiao*, Qiuyu Ding, Shengjie Guan, Jieyi Long. "SPRING: Improving the Throughput of Sharding Blockchain via Deep Reinforcement Learning Based State Placement." Proceedings of the Web Conference, May 2024. [PDF]
  • [Sensors 2022, JCR Q1] Mingxuan Song, Chengyu Hu*, Wenyin Gong, Xuesong Yan. "Domain Knowledge-Based Evolutionary Reinforcement Learning for Sensor Placement." Sensors, 2022. [PDF]
📰 News

    🌟 Projects

    Few-Shot RL Fine-Tuning for LLMs

    Affiliations:
    School of Computer Science, Peking University;
    Alimama, Alibaba Group.

    In recent years, large language models (LLMs) have demonstrated remarkable performance across a variety of natural language processing tasks. However, fine-tuning these models typically requires large-scale datasets and extensive computational resources, which limits their applicability where data is scarce and budgets are constrained. This work explores a novel approach to few-shot reinforcement learning (RL) fine-tuning for LLMs, aiming to adapt pre-trained models to specific tasks with minimal supervision.


    RL for Efficient Sharding Blockchain

    Affiliations:
    School of Computer Science, Peking University;
    Theta Labs, Theta Inc.

    Sharding blockchain systems face critical challenges in distributing data efficiently across shards and in maintaining a balanced workload among them. Traditional address allocation methods often suffer from high latency and uneven shard utilization, especially under dynamically changing transaction patterns and reconfiguration events.


    Evolutionary RL for Sensor Placement in Water Supply Networks

    Affiliations:
    School of Computer Science, China University of Geosciences, Wuhan.

    In water supply networks, effective sensor placement is critical for early detection of contamination events, yet practical deployments are constrained by limited sensor budgets and scarce historical contamination data. This work investigates an evolutionary reinforcement learning formulation of the sensor placement problem, modeling it as a sequential decision-making process under limited supervision. By incorporating domain knowledge into this framework, the proposed approach enables efficient optimization of sensor placements.


    🧑‍🤝‍🧑 Friends

    Shenglin Yin

    ✉️ Contact me

    📧 Email: songmingxuan@stu.pku.edu.cn

    💬 WeChat: smx-scholar
