How does Q-learning differ from state-value-based learning in reinforcement learning?
Q-learning does not need to follow a fixed policy while it learns: the agent may act randomly to explore, and the update still uses the greedy (highest-valued) action in the next state.
It is defined over state-action pairs: Q(S,A) estimates how good it is to take action A in state S.
It is an off-policy reinforcement learning method, so it learns the best action for the current state regardless of how the agent actually behaves while exploring.
It is model-free: the agent does not know the transition dynamics. It discovers good and bad actions by trial and error.
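The points above can be sketched with a small tabular example. This is only an illustration under assumptions that are not in the original answer: the 5-state chain environment, the `step` function, and the values of `alpha`, `gamma`, and `epsilon` are all hypothetical. The agent never inspects `step` directly; it only sees sampled transitions, explores with epsilon-greedy behaviour, and updates toward the greedy value of the next state.

```python
import random

# Hypothetical 5-state chain: states 0..4, actions 0 (left) and 1 (right).
# Reaching state 4 gives reward 1 and ends the episode.
N_STATES, ACTIONS, GOAL = 5, (0, 1), 4

def step(state, action):
    """Environment dynamics; the agent only sees the sampled result."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Behaviour: explore randomly with probability epsilon (or on ties).
            if rng.random() < epsilon or Q[s][0] == Q[s][1]:
                a = rng.choice(ACTIONS)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s2, r, done = step(s, a)
            # Target: reward plus discounted value of the greedy next action.
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

Q = q_learning()
# Greedy action in each non-terminal state, read directly from Q.
greedy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(GOAL)]
print(greedy)  # prints [1, 1, 1, 1]
```

Note that the greedy action is read straight from the Q table, with no call to `step` at decision time.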
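In contrast, state-value-based learning works with V(S), the expected total reward starting from state S when the agent acts according to some policy. To choose actions from V(S) alone, the agent needs prior knowledge of the effect of its actions (a model of the transitions).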
It computes a cumulative (expected) return for each state, and the agent moves toward the reachable state with the highest value.
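A rough sketch of the state-value view, again on a hypothetical 5-state chain that is not part of the original answer. Here `model`, `random_policy`, and the discount `GAMMA` are assumptions made for illustration. The values V(S) are computed for a fixed policy, and turning them into an action requires a one-step lookahead through the known model.

```python
# Hypothetical 5-state chain: states 0..4, actions 0 (left) and 1 (right).
N_STATES, ACTIONS, GOAL = 5, (0, 1), 4
GAMMA = 0.9

def model(state, action):
    """Known transition model: returns (next_state, reward)."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0)

def random_policy(state):
    """Fixed policy being evaluated: each action with probability 0.5."""
    return {0: 0.5, 1: 0.5}

def evaluate_policy(policy, sweeps=200):
    """Iterative policy evaluation of V(s) under the given policy."""
    V = [0.0] * N_STATES  # V[GOAL] stays 0 because the state is terminal
    for _ in range(sweeps):
        for s in range(GOAL):
            total = 0.0
            for a, prob in policy(s).items():
                s2, r = model(s, a)
                total += prob * (r + GAMMA * V[s2])
            V[s] = total
    return V

V = evaluate_policy(random_policy)

def lookahead_action(s):
    """Pick the action whose successor has the best reward plus value."""
    def score(a):
        s2, r = model(s, a)
        return r + GAMMA * V[s2]
    return max(ACTIONS, key=score)

print([round(v, 3) for v in V])
print([lookahead_action(s) for s in range(GOAL)])
```

Unlike the Q table, V by itself does not say which action to take; `lookahead_action` has to call `model` to find out where each action leads.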
Conclusion
Q-learning learns action values, and from them the best action to take in each state, while state-value-based learning evaluates states under a given policy.