Reinforcement Learning Multiple-Choice Questions (MCQs)

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.

Reinforcement Learning MCQs: This section contains multiple-choice questions and answers on various topics in Reinforcement Learning. Practice these MCQs to test and enhance your Reinforcement Learning skills.

List of Reinforcement Learning MCQs

1. Reinforcement learning is a ____

  1. Prediction-based learning technique
  2. Feedback-based learning technique
  3. History results-based learning technique

Answer: B) Feedback-based learning technique

Explanation:

Reinforcement learning is a feedback-based learning technique: the agent learns from the rewards or penalties it receives for its actions.


2. How many types of feedback does reinforcement learning provide?

  1. 1
  2. 2
  3. 3
  4. 4

Answer: B) 2

Explanation:

Reinforcement learning gives two types of feedback: positive and negative.


3. Which kind of data does reinforcement learning use?

  1. Labeled data
  2. Unlabelled data
  3. None
  4. Both

Answer: C) None

Explanation:

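Reinforcement learning does not train on a pre-collected labeled or unlabeled dataset. Instead, the agent generates its own experience by interacting with the environment.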


4. Reinforcement learning methods learn through ____?

  1. Experience
  2. Predictions
  3. Analyzing the data

Answer: A) Experience

Explanation:

Reinforcement learning learns through experience.


5. How many types of machine learning are there?

  1. 2
  2. 3
  3. 4
  4. 5

Answer: C) 4

Explanation:

Machine learning is commonly divided into four types: supervised, unsupervised, semi-supervised, and reinforcement learning.


6. Which of the following is the practical example of reinforcement learning?

  1. House pricing prediction
  2. Market basket analysis
  3. Text classification
  4. Driverless cars

Answer: D) Driverless cars

Explanation:

Driverless cars are a practical application of reinforcement learning concepts. The other options are typical supervised or unsupervised learning tasks.


7. What is an agent in reinforcement learning?

  1. Agent is the situation in which rewards are being exchanged
  2. Agent is the simple value in reinforcement learning.
  3. An agent is an entity that explores the environment.

Answer: C) An agent is an entity that explores the environment.

Explanation:

An agent is an entity that explores the environment.


8. What is the environment in reinforcement learning?

  1. Environment is a situation that is based on the current state
  2. Environment is a situation in which an agent is present.
  3. Environment is similar to feedback
  4. Environment is a situation that the agent returns as a result.

Answer: B) Environment is a situation in which an agent is present.

Explanation:

Environment is a situation in which an agent is present.


9. What are actions in reinforcement learning?

  1. Actions are the moves that the agent takes inside the environment.
  2. Actions are the function that the environment takes.
  3. Actions are the feedback that an agent provides.

Answer: A) Actions are the moves that the agent takes inside the environment.

Explanation:

Actions are the moves that the agent takes inside the environment.


10. What is a state in reinforcement learning?

  1. State is a situation in which an agent is present.
  2. A state is the simple value of reinforcement learning.
  3. A state is a result returned by the environment after an agent takes an action.

Answer: C) A state is a result returned by the environment after an agent takes an action.

Explanation:

A state is a result returned by the environment after an agent takes an action.


11. What is a reward in reinforcement learning?

  1. An agent's action is evaluated based on feedback returned from the environment.
  2. Environment gives value in return which is known as a reward.
  3. A reward is a result returned by the environment after an agent takes an action.

Answer: A) An agent's action is evaluated based on feedback returned from the environment.

Explanation:

The feedback returned from the environment to evaluate an agent's action is known as the reward.


12. What is the Policy in reinforcement learning?

  1. The agent's policy determines what environment model should be decided
  2. The agent's policy determines what action to take based on the current state.
  3. The agent's policy determines what the state reward would be.

Answer: B) The agent's policy determines what action to take based on the current state.

Explanation:

The agent's policy determines what action to take based on the current state.
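
The terms from the last few questions (agent, environment, state, action, reward, policy) fit together in a single interaction loop. The sketch below is only an illustration under stated assumptions: the LineWorld environment, its step method, and the always_right policy are hypothetical names invented for this example, not part of any library.

```python
class LineWorld:
    """Hypothetical environment: positions 0..4 on a line, goal at position 4."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        # action is -1 (move left) or +1 (move right); position is clamped to 0..4
        self.state = max(0, min(4, self.state + action))
        done = self.state == 4
        # Positive feedback at the goal, a small negative feedback otherwise
        reward = 1.0 if done else -0.1
        return self.state, reward, done


def always_right(state):
    """Hypothetical policy: maps the current state to an action."""
    return +1


def run_episode(env, policy, max_steps=20):
    """Let the agent interact with the environment and sum up the rewards."""
    total_reward = 0.0
    state = env.state
    for _ in range(max_steps):
        action = policy(state)                  # the agent chooses an action
        state, reward, done = env.step(action)  # the environment returns state and reward
        total_reward += reward
        if done:
            break
    return state, total_reward


final_state, total = run_episode(LineWorld(), always_right)
```

With this policy the agent reaches the goal in four steps, collecting three small penalties and one positive reward.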


13. Does reinforcement learning follow the concept of the hit-and-try (trial-and-error) method?

  1. Yes
  2. No

Answer: A) Yes

Explanation:

Yes, reinforcement learning follows the hit-and-try (trial-and-error) method: the agent tries actions and learns from the feedback it receives.


14. In how many ways can you implement reinforcement learning?

  1. 2
  2. 3
  3. 4
  4. 5

Answer: B) 3

Explanation:

Reinforcement learning can be implemented in three ways:

  • Value-based
  • Policy-based
  • Model-based


15. In which of the following approaches to reinforcement learning do we find the optimal value function?

  1. Value-based
  2. Policy-based
  3. Model-based

Answer: A) Value-based

Explanation:

In a Value-based approach to reinforcement learning, we find the optimal value function.
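
As a rough illustration of what "finding the optimal value function" means, the sketch below runs value iteration on a tiny two-state problem. Value iteration is a dynamic-programming method that assumes the transition probabilities are known, so it is only one classic way to compute an optimal value function, not the only value-based method. The states, actions, P, and R used here are hypothetical.

```python
def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """Compute optimal state values.

    P[(s, a)] is a dict {next_state: probability}.
    R[(s, a, next_state)] is a reward (treated as 0.0 if absent).
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality backup: best expected return over all actions
            best = max(
                sum(p * (R.get((s, a, s2), 0.0) + gamma * V[s2])
                    for s2, p in P[(s, a)].items())
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


# Hypothetical problem: moving from s0 to s1 pays 1, everything else pays 0
states = ["s0", "s1"]
actions = ["stay", "go"]
P = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "go"): {"s1": 1.0},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "go"): {"s0": 1.0},
}
R = {("s0", "go", "s1"): 1.0}

V = value_iteration(states, actions, P, R)
```

Here the best behavior is to keep alternating between the two states, so the value of s0 works out to 1 / (1 - 0.9^2) and the value of s1 to 0.9 times that.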


16. How many types of policy-based approaches are there in reinforcement learning?

  1. 1
  2. 2
  3. 3
  4. 4

Answer: B) 2

Explanation:

There are two types of policy-based approaches:

  • Deterministic
  • Stochastic
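
The difference between the two can be sketched in a few lines. This is an illustration only: the action names, the state threshold, and the probabilities are made up for the example.

```python
import random

ACTIONS = ["left", "right"]


def deterministic_policy(state):
    # The same state always produces the same action.
    return "right" if state < 3 else "left"


def stochastic_policy(state, rng):
    # The action is sampled from a probability distribution over actions.
    probs = [0.2, 0.8] if state < 3 else [0.8, 0.2]
    return rng.choices(ACTIONS, weights=probs, k=1)[0]


rng = random.Random(0)  # seeded generator so the run is repeatable
samples = [stochastic_policy(0, rng) for _ in range(1000)]
```

Calling deterministic_policy(0) repeatedly always gives "right", while the sampled list contains mostly "right" with some "left" mixed in.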


17. In which of the following approaches to reinforcement learning is a virtual model created for the environment?

  1. Value-based
  2. Policy-based
  3. Model-based

Answer: C) Model-based

Explanation:

In the model-based approach to reinforcement learning, a virtual model is created for the environment.


18. ____ is a synonym for random and probabilistic.

  1. Deterministic
  2. Stochastic

Answer: B) Stochastic

Explanation:

Stochastic is a synonym for random and probabilistic variables.


19. How many elements does reinforcement learning consist of?

  1. 2
  2. 3
  3. 4
  4. 5

Answer: C) 4

Explanation:

Reinforcement learning has four main elements:

  • Policy
  • Reward Signal
  • Value Function
  • Model of the environment


20. The agent's main objective is to ____ the total reward it receives for its actions.

  1. Minimize
  2. Maximize
  3. Null

Answer: B) Maximize

Explanation:

The agent's main objective is to maximize the total (cumulative) reward it receives for its actions.


21. The goal of a reinforcement learning problem is defined by the ____?

  1. Policy
  2. Reward Signal
  3. Value Function
  4. Model of the environment

Answer: B) Reward Signal

Explanation:

The goal of a reinforcement learning problem is defined by the reward signal.


22. Which element in reinforcement learning defines the behavior of the agent?

  1. Policy
  2. Reward Signal
  3. Value Function
  4. Model of the environment

Answer: A) Policy

Explanation:

The policy is the element of reinforcement learning that defines the behavior of the agent.


23. Can reward signals change the policy?

  1. Yes
  2. No

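Answer: A) Yes

Explanation:

Reward signals can change the policy: if an action selected by the policy leads to a low reward, the policy may be changed to select a different action in that situation.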


24. On which of the following elements of reinforcement learning does the reward that an agent can expect depend?

  1. Policy
  2. Reward Signal
  3. Value Function
  4. Model of the environment

Answer: C) Value Function

Explanation:

The reward that the agent can expect depends on the value function, which estimates the total reward the agent can expect to accumulate from a given state.


25. Which of the following elements of reinforcement learning imitates the behavior of the environment?

  1. Policy
  2. Reward Signal
  3. Value Function
  4. Model of the environment

Answer: D) Model of the environment

Explanation:

The model imitates the behavior of the environment.


26. The approach in which reinforcement learning problems are solved with the help of models is known as ____?

  1. Model-based approach
  2. Model-free approach
  3. Model known approach

Answer: A) Model-based approach

Explanation:

The approach in which reinforcement learning problems are solved with the help of models is known as model-based approach.


27. Who introduced the Bellman equation?

  1. Richard Ernest Bellman
  2. Alfonso Shimbel
  3. Edsger W. Dijkstra

Answer: A) Richard Ernest Bellman

Explanation:

Richard Ernest Bellman introduced the Bellman equation.


28. Gamma (γ) in the Bellman equation is known as?

  1. Value factor
  2. Discount factor
  3. Environment factor

Answer: B) Discount factor

Explanation:

Gamma (γ) in the Bellman equation is known as the discount factor.
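
To see what the discount factor does, the sketch below computes a discounted sum of rewards. The function name discounted_return and the sample reward list are assumptions made for this illustration.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma^2 * r_2 + ... for a finite reward list."""
    total = 0.0
    # Work backwards so each step applies one more factor of gamma
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


future_heavy = discounted_return([1.0, 1.0, 1.0], 1.0)   # no discounting
discounted = discounted_return([1.0, 1.0, 1.0], 0.5)     # later rewards count less
myopic = discounted_return([1.0, 1.0, 1.0], 0.0)         # only the first reward counts
```

With three rewards of 1, a gamma of 1 gives 3.0, a gamma of 0.5 gives 1 + 0.5 + 0.25 = 1.75, and a gamma of 0 gives 1.0.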


29. How many types of reinforcement are there in reinforcement learning?

  1. 3
  2. 4
  3. 2
  4. 5

Answer: C) 2

Explanation:

There are two types of reinforcement learning:

  • Positive Reinforcement
  • Negative Reinforcement


30. In which of the following types of reinforcement learning do we add something that increases the likelihood of repeating expected behavior?

  1. Positive Reinforcement
  2. Negative Reinforcement

Answer: A) Positive Reinforcement

Explanation:

In positive reinforcement, we add something that increases the likelihood of the expected behavior being repeated.


31. How do you represent the agent state in reinforcement learning?

  1. Discount state
  2. Discount factor
  3. Markov state

Answer: C) Markov state

Explanation:

The agent state in reinforcement learning is represented by a Markov state.


32. In the condition P[S_{t+1} | S_t] = P[S_{t+1} | S_1, ..., S_t],
what does S_t represent?

  1. State factor
  2. Discount factor
  3. Markov state

Answer: C) Markov state

Explanation:

In the condition P[S_{t+1} | S_t] = P[S_{t+1} | S_1, ..., S_t], S_t represents the Markov state: the next state depends only on the current state, not on the earlier history.


33. What do you mean by MDP in reinforcement learning?

  1. Markov discount procedure
  2. Markov discount process
  3. Markov deciding procedure
  4. Markov decision process

Answer: D) Markov decision process

Explanation:

MDP stands for Markov decision process.


34. Why do we use MDP in reinforcement learning?

  1. We use MDP to formalize the reinforcement learning problems.
  2. We use MDP to predict reinforcement learning problems.
  3. We use MDP to analyze the reinforcement learning problems.

Answer: A) We use MDP to formalize the reinforcement learning problems.

Explanation:

We use MDP to formalize the reinforcement learning problems.


35. How many elements does the tuple that defines an MDP contain?

  1. 2
  2. 3
  3. 4
  4. 5

Answer: C) 4

Explanation:

An MDP is a 4-tuple (S, A, R_a, P_a):

  • A finite set of states S
  • A finite set of actions A
  • The reward R_a(s, s') received after transitioning from state s to state s' due to action a
  • The probability P_a(s, s') that action a taken in state s leads to state s'
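
The four components can be written down as a small data structure. This is a sketch under assumptions: the MDP class, its field names, the is_valid helper, and the two-state example are all hypothetical and exist only to illustrate the tuple.

```python
from dataclasses import dataclass


@dataclass
class MDP:
    states: list       # S: finite set of states
    actions: list      # A: finite set of actions
    transitions: dict  # P_a(s, s'): maps (s, a) to {s': probability}
    rewards: dict      # R_a(s, s'): maps (s, a, s') to a reward


def is_valid(mdp, tol=1e-9):
    """Check that the probabilities for every (state, action) pair sum to 1."""
    for s in mdp.states:
        for a in mdp.actions:
            probs = mdp.transitions.get((s, a), {})
            if abs(sum(probs.values()) - 1.0) > tol:
                return False
    return True


# Hypothetical two-state example
toy = MDP(
    states=["s0", "s1"],
    actions=["stay", "go"],
    transitions={
        ("s0", "stay"): {"s0": 1.0},
        ("s0", "go"): {"s0": 0.2, "s1": 0.8},
        ("s1", "stay"): {"s1": 1.0},
        ("s1", "go"): {"s0": 1.0},
    },
    rewards={("s0", "go", "s1"): 1.0},
)
```

Storing transitions as nested dictionaries keeps the example readable; larger problems would more likely use arrays.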


36. Which of the following is a model-free, off-policy reinforcement learning algorithm that finds the best course of action based on the agent's current state?

  1. Q-learning
  2. Markov property
  3. State action reward state action
  4. Deep Q neural network

Answer: A) Q-learning

Explanation:

Q-learning is a model-free, off-policy reinforcement learning algorithm that finds the best course of action based on the agent's current state.
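
A single tabular Q-learning update can be sketched as follows. The function name, the dictionary-based table, and the default learning rate alpha and discount gamma are assumptions chosen for illustration, not values from any particular source.

```python
from collections import defaultdict


def q_learning_update(Q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Model-free: only the observed transition is used, no transition probabilities.
    # Off-policy: the target uses the best next action, whatever the agent does next.
    best_next = max(Q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
    return Q[(state, action)]


Q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
actions = ["left", "right"]
first = q_learning_update(Q, "s0", "right", 1.0, "s1", actions)
```

Starting from an all-zero table, the first update moves the estimate one tenth of the way towards the observed reward of 1.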


37. What do you mean by SARSA in reinforcement learning?

  1. State action reward state action
  2. State achievement rewards state action
  3. State act reward achievement
  4. State act reward act

Answer: A) State action reward state action

Explanation:

SARSA stands for State-Action-Reward-State-Action, after the quintuple (s, a, r, s', a') used in each update.


38. ___ is the policy that an agent is trying to learn?

  1. behavior policy
  2. Target policy
  3. On-policy
  4. Off-policy

Answer: B) Target policy

Explanation:

The target policy is the policy that an agent is trying to learn.


39. ____ is the policy that an agent uses for action selection?

  1. behavior policy
  2. Target policy
  3. On-policy
  4. Off-policy

Answer: A) behavior policy

Explanation:

Behavior policy is used by an agent for action selection.
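
The two roles can be illustrated with a greedy target policy and an epsilon-greedy behavior policy. This pairing is a common choice but is an assumption here; the function names and Q-values are hypothetical.

```python
import random


def greedy_action(q_row):
    """Target policy (assumed greedy): pick the action with the highest Q-value."""
    return max(q_row, key=q_row.get)


def epsilon_greedy_action(q_row, epsilon, rng):
    """Behavior policy (assumed epsilon-greedy): explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(list(q_row))  # random exploratory action
    return greedy_action(q_row)         # otherwise act greedily


q_row = {"left": 0.2, "right": 0.7}  # hypothetical Q-values for one state
rng = random.Random(42)              # seeded generator so the run is repeatable
chosen = [epsilon_greedy_action(q_row, 0.3, rng) for _ in range(1000)]
```

The greedy policy always returns "right" for this row, while the epsilon-greedy policy occasionally selects "left" so that the agent keeps exploring.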


40. Which of the following describes a learning algorithm in which the same policy is both evaluated and improved?

  1. behavior policy
  2. Target policy
  3. On-policy
  4. Off-policy

Answer: C) On-policy

Explanation:

An on-policy learning algorithm evaluates and improves the same policy that is used for action selection.


41. Which of the following describes a learning algorithm that evaluates and improves a policy different from the policy used for action selection?

  1. behavior policy
  2. Target policy
  3. On-policy
  4. Off-policy

Answer: D) Off-policy

Explanation:

An off-policy learning algorithm evaluates and improves a policy that is different from the policy used for action selection.


42. In which of on-policy and off-policy learning is the target policy not equal to the behavior policy?

  1. On-policy
  2. Off-policy

Answer: B) Off-policy

Explanation:

In an off-policy learning algorithm, the target policy is not equal to the behavior policy.


43. In which of on-policy and off-policy learning is the target policy equal to the behavior policy?

  1. On-policy
  2. Off-policy

Answer: A) On-policy

Explanation:

In an on-policy learning algorithm, the target policy is equal to the behavior policy.


44. Is Q-learning an on-policy or an off-policy learning algorithm?

  1. On-policy
  2. Off-policy

Answer: B) Off-policy

Explanation:

Q-learning is an off-policy learning algorithm.


45. Is SARSA an on-policy or an off-policy learning algorithm?

  1. On-policy
  2. Off-policy

Answer: A) On-policy

Explanation:

SARSA is an on-policy learning algorithm.


46. What is DQN in reinforcement learning?

  1. Dynamic Q-learning network
  2. Dynamic Q-neural network
  3. Deep Q-network

Answer: C) Deep Q-network

Explanation:

DQN stands for Deep Q-network, a variant of Q-learning that uses a deep neural network to approximate the Q-values.


47. Which of the following correctly states the difference between Q-learning and SARSA?

  1. In comparison to SARSA, QL directly learns the optimal policy, whereas SARSA learns a policy that is "near" the optimal
  2. In comparison to QL, SARSA directly learns the optimal policy, whereas QL learns a policy that is "near" the optimal.

Answer: A) In comparison to SARSA, QL directly learns the optimal policy, whereas SARSA learns a policy that is "near" the optimal

Explanation:

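In comparison to SARSA, Q-learning (QL) directly learns the optimal policy, because its update bootstraps from the best available next action. SARSA's update uses the action that its exploring behavior policy actually takes, so it learns a policy that is "near" the optimal while exploration continues.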
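
The difference comes down to the target each algorithm bootstraps from. The sketch below compares the two targets for the same transition; the function names and Q-values are hypothetical.

```python
def q_learning_target(Q, reward, next_state, actions, gamma):
    # Bootstraps from the greedy next action (maximum over all actions).
    return reward + gamma * max(Q[(next_state, a)] for a in actions)


def sarsa_target(Q, reward, next_state, next_action, gamma):
    # Bootstraps from the action the behavior policy actually chose.
    return reward + gamma * Q[(next_state, next_action)]


Q = {("s1", "left"): 0.0, ("s1", "right"): 2.0}  # hypothetical Q-values
ql = q_learning_target(Q, 1.0, "s1", ["left", "right"], 0.9)
sa = sarsa_target(Q, 1.0, "s1", "left", 0.9)     # the agent explored with "left"
```

When the next action is exploratory, the SARSA target is lower than the Q-learning target, which is why SARSA's estimates reflect the cost of exploration.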


48. Which of the following gives the better final performance?

  1. QL
  2. SARSA

Answer: A) QL

Explanation:

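Q-learning (QL) is generally credited with the better final performance, because it learns the optimal policy directly. SARSA, however, often performs better during training, since it accounts for exploratory actions.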


49. Which of the following is faster?

  1. QL
  2. SARSA

Answer: B) SARSA

Explanation:

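SARSA is commonly cited as the faster of the two to converge, although in practice this depends on the task and the exploration schedule.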


50. Is Q-learning a model-free or a model-based learning algorithm?

  1. Model-free
  2. Model-based

Answer: A) Model-free

Explanation:

Q-learning is a model-free learning algorithm.


51. What does Q stand for in Q-learning?

  1. Quality
  2. Query
  3. Quantify
  4. Quick

Answer: A) Quality

Explanation:

In Q-learning, "Q" stands for quality: how useful a given action is in a given state for gaining future reward.


52. The matrix created during the Q-learning algorithm is commonly known as ____?

  1. Query-table
  2. Q-table
  3. Quick-matrix
  4. Table

Answer: B) Q-table

Explanation:

The matrix created during the Q-learning algorithm is commonly known as the Q-table. It stores one Q-value for each state-action pair.


53. Does reinforcement learning require any previous training?

  1. Yes
  2. No

Answer: B) No

Explanation:

No, reinforcement learning does not require any previous training; the agent learns from its own experience.


54. Which equation is Q-learning based on?

  1. Naïve bayes equation
  2. KNN-equation
  3. Bellman-equation

Answer: C) Bellman-equation

Explanation:

Q-learning is based on the Bellman equation.
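
For reference, the relationship can be written out as follows. The notation is an assumption for this sketch: s and a are the current state and action, r the reward, s' the next state, γ the discount factor, and α a learning rate.

```latex
% Bellman optimality equation for the action-value function
Q^{*}(s, a) = \mathbb{E}\left[\, r + \gamma \max_{a'} Q^{*}(s', a') \;\middle|\; s, a \,\right]

% Q-learning update, which moves Q(s, a) towards a sampled version of the right-hand side
Q(s, a) \leftarrow Q(s, a) + \alpha \left[\, r + \gamma \max_{a'} Q(s', a') - Q(s, a) \,\right]
```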











Copyright © 2024 www.includehelp.com. All rights reserved.