結果 : explain q learning algorithm with example