This is
Lee's MindSpace

Biggest dreamer, persistent doer, awful sleeper

Date Title
Dec 14, 2025 How we inference (2025) After a year working at an inference engine startup, I have witnessed the great evolution of inference optimization in 2025. This blog summarizes several well-known breakthroughs.
Nov 28, 2025 Daily Vibe Coding Share Recently, I was invited to share some of my experiences with vibe coding. AI has profoundly changed the way I think and how I work. It is a great pleasure to witness such a transformation whe…
Mar 9, 2025 RAG & Agent Share Last week I shared some of my experience with RAG and agents with everyone at the company. Here's a sanitized copy of my notes. Though there are some things I cannot share here, these are enough for beginners to le…
Feb 14, 2025 5 Years invention Patent My first-author invention patent just got approved after 5 years! Though technology has evolved since then, I'm still deeply thankful for this milestone. Looking back at those nights spent writing …
Feb 6, 2025 DS-R1 & GRPO Code DSR1 paper: https://arxiv.org/pdf/2501.12948 Having read about the GRPO algorithm in the previous post (https://lihaorui.com/2025/02/05/deepseek-math-reading/), we continue here with DeepSeek-R1's training pipeline.
Feb 5, 2025 Deepseek Math Reading **Proximal Policy Optimization (PPO)** is an actor-critic RL algorithm widely used in the…
Jan 25, 2025 Prompt Attack Defense These are my notes on Google's prompt attack and defense presentation at the 2025 Google Cloud Export Summit Shenzhen. Video source: 【提示词注入防御最佳实践】 (Best Practices for Prompt Injection Defense) https://www.bilibili.com/video/BV1DLwEeaEDa
Jan 24, 2025 Swarm Code Reading Source code: https://github.com/openai/swarm My open-source demo of how to modify and use Swarm: https://github.com/haoruilee/huggingface-swarm
Jan 24, 2025 Multi Agent RL & Web3 *Update on 2025.02.08: I wrote an implementation of this idea: https://github.com/haoruilee/Principal-Agent-Contract* Recently I read the paper *Principal-Agent Reinforcement Learning: Orchestrating AI …
Jan 22, 2025 Tensor Parallelism and NCCL Recommended reading: https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/scaling/JAX/tensor_parallel_simple.html **Tensor Parallelism** optimizes the computation of large matrix oper…
Jan 22, 2025 From RL to RLHF Note: This blog covers only a small part of these sources; I recommend reading them all. $1 categorizes techniques for aligning large language models (LLMs) into **four main themes** with subtopics:
Jan 21, 2025 Invest Harvey Index **"Rich is having money (or assets) you haven't spent."** Over the past year, I've been exploring steady ways to grow wealth SLOWLY BUT SURELY. After countless trials, I've finally developed what I…