avatar
Articles
98
Tags
29
Categories
26

Home
Archives
Tags
Categories
Link
About
detect
Search
Home
Archives
Tags
Categories
Link
About

RL

Category - RL
2024
2024-01-23
《深度强化学习》reading notes
2024-01-20
浅析Alpha-Go
2024-01-19
DRL: Actor-Critic Method
1
avatar
Richard
If you can't explain it simply, you don't understand it well enough.
Articles
98
Tags
29
Categories
26
Follow Me
Announcement
blog is buliding!
Recent Post
JAX base2025-05-06
Python Multiprocess2025-05-05
C++ Embedding Python2025-05-05
Python tips2025-05-01
Pandas Tips2025-05-01
生成式奖励模型的几种方法2025-03-25
Let’s Verify Step by Step2025-03-24
Generative Verifiers, Reward Modeling as Next-Token Prediction2025-03-23
LoRA2025-03-23
GRPO2025-03-23
Categories
  • DL16
    • Lee's HW1
    • Lee's notes14
    • code1
  • Math1
    • Bayesian Network and MCMC1
  • NJU course11
    • Crypto1
Tags
RL GPT diffusion DS python c++ catalog HW note linux Quant Metabit resume 实习 实验报告 机器学习 math ML LLM tool algorithm paper hexo GAN vim 随笔 git 神经网络 OS
Archives
  • May 20255
  • March 202510
  • February 20252
  • January 20256
  • October 20245
  • June 20241
  • May 20243
  • April 20243
  • March 20248
  • February 20246
  • January 202416
  • December 20238
  • November 20237
  • October 20233
  • September 20237
  • July 20233
  • June 20234
  • March 20231
Info
Article :
98
Run time :
Total Count :
260.9k
Last Push :
©2020 - 2025 By Richard
Framework Hexo|Theme Butterfly
Search
Loading the Database