36. Which of the following algorithms finds the best action to take in the agent's current state using model-free, off-policy reinforcement learning?

  1. Q-learning
  2. Markov property
  3. State-action-reward-state-action (SARSA)
  4. Deep Q neural network

Answer: 1. Q-learning

Explanation:

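Q-learning is a model-free, off-policy reinforcement learning algorithm that learns the value of each action in each state and, from those values, the best action to take in the agent's current state. It is model-free because it learns from sampled transitions rather than from a model of the environment's dynamics, and off-policy because its update uses the highest-valued next action regardless of which action the agent actually takes next. Of the other options, the Markov property is an assumption about the environment rather than an algorithm, state-action-reward-state-action (SARSA) is on-policy, and a deep Q neural network is Q-learning combined with a neural network function approximator, so Q-learning remains the underlying algorithm.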
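
To make the update rule concrete, here is a minimal sketch of tabular Q-learning in Python. The corridor environment, the `step` and `q_learning` function names, and the hyperparameter values (`alpha`, `gamma`, `epsilon`, `episodes`) are all assumptions chosen for illustration; none of them come from the question itself.

```python
import random

# Hypothetical toy environment (an assumption for illustration):
# a corridor with states 0..4. Action 0 moves left, action 1 moves right.
# Reaching state 4 gives reward 1 and ends the episode.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table q[state][action] from sampled transitions only."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Behaviour policy: epsilon-greedy, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(N_ACTIONS) if q[state][a] == best]
                )
            # Model-free: the agent only observes the sampled outcome.
            next_state, reward, done = step(state, action)
            # Off-policy: the target uses the max over next actions,
            # not the action the behaviour policy will actually pick.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    q = q_learning()
    for s in range(GOAL):
        greedy = max(range(N_ACTIONS), key=lambda a: q[s][a])
        print(f"state {s}: greedy action {greedy}, values {q[s]}")
```

The `max(q[next_state])` term is what makes the update off-policy. Replacing it with the value of the action the agent actually takes next would give the on-policy SARSA update instead.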

Copyright © 2024 www.includehelp.com. All rights reserved.