This is
Lee's MindSpace

Biggest dreamer, persistent doer, awful sleeper

Latest writing

Date	Title
Dec 14, 2025	How we inference (2025) After a year working at an inference engine startup, I have witnessed the great evolution of inference optimization in 2025. This post summarizes several notable breakthroughs.
Nov 28, 2025	Daily Vibe Coding Share Recently, I was invited to share some of my experiences with vibe coding. AI has profoundly changed the way I think and how I work. It is a great pleasure to witness such a transformation whe…
Mar 9, 2025	RAG & Agent Share Last week I shared some of my experience with RAG and agents with everyone at the company. Here's a desensitized copy of my notes. Although there are some things I cannot share here, these are enough for starters to le…
Feb 14, 2025	5 Years invention Patent My first-author invention patent just got approved after 5 years! Though technology has evolved since then, I'm still deeply thankful for this milestone. Looking back at those nights spent writing …
Feb 6, 2025	DS-R1 & GRPO Code DeepSeek-R1 paper: https://arxiv.org/pdf/2501.12948 Having read the GRPO algorithm in the previous post, https://lihaorui.com/2025/02/05/deepseek-math-reading/ , here we continue with DeepSeek-R1's training pipeline.
Feb 5, 2025	Deepseek Math Reading Proximal Policy Optimization (PPO) is an actor-critic RL algorithm widely used in the…
Jan 25, 2025	Prompt Attack Defense This post is a set of notes on Google's prompt attack and defense presentation at the 2025 Google Cloud Export Summit Shenzhen. Video source: 【提示词注入防御最佳实践】 (Best Practices for Prompt Injection Defense) https://www.bilibili.com/video/BV1DLwEeaEDa
Jan 24, 2025	Swarm Code Reading Source code: https://github.com/openai/swarm My open-source demo of how to modify and use Swarm: https://github.com/haoruilee/huggingface-swarm
Jan 24, 2025	Multi Agent RL & Web3 Update on 2025.02.08: I wrote an implementation of this idea: https://github.com/haoruilee/Principal-Agent-Contract Recently I read this paper *Principal-Agent Reinforcement Learning: Orchestrating AI …
Jan 22, 2025	Tensor Parallelism and NCCL Recommended reading: https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/scaling/JAX/tensor_parallel_simple.html Tensor Parallelism optimizes the computation of large matrix oper…
Jan 22, 2025	From RL to RLHF Note: This post covers only a small part of these sources; I recommend reading them all. $1 categorizes techniques for aligning large language models (LLMs) into four main themes with subtopics:
Jan 21, 2025	Invest Harvey Index "Rich is having money (or assets) you haven't spent." Over the past year, I've been exploring steady ways to grow wealth SLOWLY BUT SURELY. After countless trials, I've finally developed what I…