Results: probability theory in reinforcement learning