News Score: Score the News, Sort the News, Rewrite the Headlines

A (Long) Peek into Reinforcement Learning

[Updated on 2020-09-03: Updated the algorithm of SARSA and Q-learning so that the difference is more pronounced.] [Updated on 2021-09-19: Thanks to 爱吃猫的鱼, we have this post in Chinese.] Several exciting developments in Artificial Intelligence (AI) have taken place in recent years. AlphaGo defeated the best professional human player in the game of Go. Soon afterward, the extended algorithm AlphaGo Zero beat AlphaGo 100-0 without supervised learning on human knowledge. Top professional game players lost...

Read more at lilianweng.github.io
