Question

In this assignment, you will write pseudo-code for a Markov Decision Process.

A Markov Decision Process (MDP) model contains the following components:

  • A set of possible states S.
  • A model (transition function) describing the effect of each action in each state.
  • A set of possible actions A.
  • A real-valued reward function R(s, a).
  • A policy, which is the solution of the Markov Decision Process.

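Consider the following grid (3 by 3):

      +---------+---------+---------+
    3 | Fire    |         | Diamond |
      +---------+---------+---------+
    2 |         |         |         |
      +---------+---------+---------+
    1 | Start   |         | Blocked |
      +---------+---------+---------+
           1         2         3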
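One way to hold this grid in Python, as a minimal sketch: special cells are stored in a dictionary keyed by (row, column), with row 1 at the bottom as in the picture. The names `SPECIAL`, `cell_type`, and `render` are illustrative assumptions, not part of the assignment.

```python
# Cell labels; the names are illustrative, not part of the assignment.
EMPTY, START, BLOCKED, FIRE, DIAMOND = "empty", "start", "blocked", "fire", "diamond"

ROWS, COLS = 3, 3

# Special cells keyed by (row, column), with row 1 at the bottom of the grid.
SPECIAL = {
    (1, 1): START,
    (1, 3): BLOCKED,
    (3, 1): FIRE,
    (3, 3): DIAMOND,
}


def cell_type(row, col):
    """Return the label of a cell, or raise if it lies outside the grid."""
    if not (1 <= row <= ROWS and 1 <= col <= COLS):
        raise ValueError(f"({row}, {col}) is outside the {ROWS} by {COLS} grid")
    return SPECIAL.get((row, col), EMPTY)


def render():
    """Return the grid as text, top row (row 3) first."""
    lines = []
    for row in range(ROWS, 0, -1):
        cells = (f"{cell_type(row, col):<8}" for col in range(1, COLS + 1))
        lines.append(" ".join(cells).rstrip())
    return "\n".join(lines)


print(render())
```

A dictionary of special cells keeps the environment static and makes it easy to change the layout; a nested list would work equally well.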

An agent lives in the grid. Cells are written as (row * column). The agent starts at cell (1 * 1) and can move around the grid using the following actions:

UP, DOWN, LEFT, RIGHT

The goal of the agent is to reach cell (3 * 3), which holds the diamond.

The agent must avoid the fire cell at (3 * 1) at all costs.

There is also a blocked cell at (1 * 3), which the agent cannot enter; it must choose an alternate route.

The agent cannot pass through a wall. For example, from the starting cell (1 * 1), the agent can only go UP or RIGHT.
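The movement rules above can be sketched in Python as follows. This assumes deterministic moves (the assignment does not specify any transition noise) and treats a move into a wall or the blocked cell as leaving the agent where it is; `is_free`, `legal_actions`, and `move` are hypothetical helper names.

```python
# Row/column offsets for each action; row 1 is the bottom row, so UP adds 1.
ACTIONS = {"UP": (1, 0), "DOWN": (-1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}

ROWS, COLS = 3, 3
BLOCKED = {(1, 3)}  # cells the agent can never enter


def is_free(cell):
    """True if the cell is inside the grid and not blocked."""
    row, col = cell
    return 1 <= row <= ROWS and 1 <= col <= COLS and cell not in BLOCKED


def legal_actions(cell):
    """Actions that actually change the agent's position from this cell."""
    row, col = cell
    return [name for name, (dr, dc) in ACTIONS.items()
            if is_free((row + dr, col + dc))]


def move(cell, action):
    """Deterministic move (an assumption): stay put if the target is not free."""
    dr, dc = ACTIONS[action]
    target = (cell[0] + dr, cell[1] + dc)
    return target if is_free(target) else cell


print(legal_actions((1, 1)))
```

From the start cell this yields only UP and RIGHT, matching the example in the text, and RIGHT is not offered from (1, 2) because the blocked cell lies there.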

Based on the above information, write pseudo-code in Java or Python to solve the problem using a Markov Decision Process.

Your pseudo-code must do the following:

  1. Implementation of a static environment (grid) using an array or other data structure that will represent the above grid.
  2. A function/method to determine what action to take. The decision should be based upon Markov Decision Process.
  3. Consideration of a reward policy that incorporates action costs in addition to any prizes or penalties that may be awarded.
  4. A function/method to calculate the optimal policy when a blocked state is encountered.
  5. A function/method to calculate the optimal policy when the fire state is encountered.
  6. A function/method to test if the desired goal is achieved or not.
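Requirements 3 and 6 might be sketched like this. The numeric values are assumptions chosen for illustration only, since the assignment gives no action cost, prize, or penalty; `reward`, `is_terminal`, and `goal_reached` are hypothetical names. For simplicity the reward here depends only on the cell being entered.

```python
FIRE, DIAMOND = (3, 1), (3, 3)

# Assumed numbers; the assignment does not specify any of these values.
STEP_COST = -0.04      # cost charged for every action
DIAMOND_PRIZE = 1.0    # prize for entering the diamond cell
FIRE_PENALTY = -1.0    # penalty for entering the fire cell


def reward(next_cell):
    """Reward for entering next_cell: the action cost plus any prize or penalty."""
    if next_cell == DIAMOND:
        return STEP_COST + DIAMOND_PRIZE
    if next_cell == FIRE:
        return STEP_COST + FIRE_PENALTY
    return STEP_COST


def is_terminal(cell):
    """The episode ends when the agent enters the fire or the diamond."""
    return cell in (FIRE, DIAMOND)


def goal_reached(path):
    """True if the path ends on the diamond without ever touching the fire."""
    return bool(path) and path[-1] == DIAMOND and FIRE not in path


print(goal_reached([(1, 1), (2, 1), (2, 2), (2, 3), (3, 3)]))
```

A small negative step cost encourages short routes, while the large fire penalty makes any policy that enters the fire strictly worse than one that goes around it.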

Homework Answers

Answer #1
// Pseudo-code for solving the problem with value iteration on a Markov Decision Process.

procedure value_iteration(P, R, γ, θ)
inputs:
   P is the state transition function specifying P(s'|a,s)
   R is the reward function R(s,a,s')
   γ is the discount factor, 0 ≤ γ < 1
   θ is a convergence threshold, θ > 0
returns:
   π[s] an approximately optimal policy
   V[s] the value function
data structures:
   Vk[s] a sequence of value functions
begin
   assign V0[s] arbitrarily (for example, 0) for each state s
   for k = 1 : ∞
      for each state s
         Vk[s] = max over a of Σ over s' of P(s'|a,s) (R(s,a,s') + γ Vk−1[s'])
      if ∀s |Vk[s] − Vk−1[s]| < θ
         for each state s
            π[s] = arg max over a of Σ over s' of P(s'|a,s) (R(s,a,s') + γ Vk−1[s'])
         return π, Vk
end
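
As a rough illustration, the procedure can be applied to the 3 by 3 grid from the question in Python. This is a sketch under stated assumptions, not a definitive solution: moves are deterministic, the reward numbers and the discount factor are invented for the example, and the policy is extracted from the final value estimate. All function names are hypothetical.

```python
ROWS, COLS = 3, 3
START, BLOCKED, FIRE, DIAMOND = (1, 1), (1, 3), (3, 1), (3, 3)
ACTIONS = {"UP": (1, 0), "DOWN": (-1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}

# Assumed values; the assignment does not specify rewards, gamma, or theta.
STEP_COST, DIAMOND_PRIZE, FIRE_PENALTY = -0.04, 1.0, -1.0
GAMMA, THETA = 0.9, 1e-6

# Every cell except the blocked one is a state; row 1 is the bottom row.
STATES = [(r, c) for r in range(1, ROWS + 1) for c in range(1, COLS + 1)
          if (r, c) != BLOCKED]
TERMINALS = {FIRE, DIAMOND}


def transitions(state, action):
    """Return [(probability, next_state)]. Moves are deterministic here (an
    assumption); hitting a wall or the blocked cell leaves the agent in place."""
    dr, dc = ACTIONS[action]
    target = (state[0] + dr, state[1] + dc)
    if target not in STATES:
        target = state
    return [(1.0, target)]


def reward(state, action, next_state):
    """R(s, a, s'): action cost plus any prize or penalty for entering s'."""
    r = STEP_COST
    if next_state == DIAMOND:
        r += DIAMOND_PRIZE
    elif next_state == FIRE:
        r += FIRE_PENALTY
    return r


def q_value(state, action, values):
    """Expected return of taking `action` in `state` under `values`."""
    return sum(p * (reward(state, action, s2) + GAMMA * values[s2])
               for p, s2 in transitions(state, action))


def value_iteration():
    values = {s: 0.0 for s in STATES}  # V0
    while True:
        new_values = {
            s: 0.0 if s in TERMINALS
            else max(q_value(s, a, values) for a in ACTIONS)
            for s in STATES
        }
        delta = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if delta < THETA:
            break
    # Greedy policy with respect to the final value estimate.
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, values))
              for s in STATES if s not in TERMINALS}
    return policy, values


def follow_policy(policy, start=START, max_steps=20):
    """Roll the policy out from `start` until a terminal cell is reached."""
    path = [start]
    while path[-1] not in TERMINALS and len(path) <= max_steps:
        _, next_state = transitions(path[-1], policy[path[-1]])[0]
        path.append(next_state)
    return path


policy, values = value_iteration()
path = follow_policy(policy)
print(path)
print("goal reached:", path[-1] == DIAMOND and FIRE not in path)
```

With these assumed numbers the resulting policy never steps into the fire, routes around the blocked cell, and reaches the diamond from the start in four moves. Excluding the blocked cell from the state set is one design choice among several; it could also be modelled as a state with no incoming transitions.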