Which of the following types of policy is a learning algorithm that evaluates and improves a policy that is dissimilar from the Policy that is used for action selection?

41. Which of the following types of policy is a learning algorithm that evaluates and improves a policy that is dissimilar from the Policy that is used for action selection?

  1. behavior policy
  2. Target policy
  3. On-policy
  4. Off-policy

Answer: D) Off-policy

Explanation:

Off-policy is a type of policy, is a learning algorithm that evaluates and improves a policy that is dissimilar from the Policy that is used for action selection.

Comments and Discussions!

Load comments ↻






Copyright © 2024 www.includehelp.com. All rights reserved.