Discuss the following question in the discussion forum: What are the differences between the value-iteration algorithm and Q-learning?
Differences between the value-iteration algorithm and Q-learning:
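- The value-iteration algorithm is a planning method: it computes the optimal state values V(s) when the Markov Decision Process (MDP) model, that is, the transition probabilities and rewards, is known. Q-learning is model-free: it is used when those probabilities are not known.
- As such, Q-learning does not use the unknown MDP probabilities directly. It estimates the Q values dynamically from transitions sampled while the agent interacts with the environment, and the V values follow at the same time as V(s) = max over a of Q(s, a).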
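To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the two-state MDP `P`, the discount `GAMMA`, the step size `alpha`, the uniformly random exploration, and the helper names are all made up, not taken from the course material. Value iteration reads the probabilities in `P` directly, while Q-learning only ever sees sampled (next state, reward) pairs.

```python
import random

# Hypothetical toy MDP: P[s][a] is a list of (probability, next_state, reward).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 1, 2.0)], 1: [(1.0, 0, 0.0)]},
}
GAMMA = 0.9  # assumed discount factor


def value_iteration(P, gamma, tol=1e-8):
    """Model-based: uses the known probabilities in P for full Bellman backups."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Expected return of each action, computed from the model.
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def sample_step(P, s, a, rng):
    """Environment simulator: the learner only sees (next_state, reward)."""
    u, acc = rng.random(), 0.0
    for p, s2, r in P[s][a]:
        acc += p
        if u < acc:
            return s2, r
    return s2, r  # guard against floating-point rounding


def q_learning(P, gamma, steps=50000, alpha=0.05, seed=0):
    """Model-free: updates Q from sampled transitions, never reading probabilities."""
    rng = random.Random(seed)
    Q = {s: {a: 0.0 for a in P[s]} for s in P}
    s = 0
    for _ in range(steps):
        a = rng.choice(list(Q[s]))  # uniformly random exploration (an assumption)
        s2, r = sample_step(P, s, a, rng)
        # Temporal-difference update toward r + gamma * max_a' Q(s', a').
        Q[s][a] += alpha * (r + gamma * max(Q[s2].values()) - Q[s][a])
        s = s2
    return Q


V = value_iteration(P, GAMMA)
Q = q_learning(P, GAMMA)
V_from_Q = {s: max(Q[s].values()) for s in Q}  # V(s) = max_a Q(s, a)
```

On this toy problem, `V` from value iteration and `V_from_Q` from Q-learning should end up close to each other, but the Q-learning estimate stays slightly noisy because it is built from random samples with a constant step size.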
Posts made here are the responsibility of their owners and may not reflect the views of Athabasca University.