Question

In a coin game, you repeatedly toss a biased coin (probability 0.75 for heads, 0.25 for tails). Each head is worth 3 points and each tail is worth 1 point. You can either Toss or Stop if the total number of points you have tossed is no more than 7. Otherwise, you must Stop. When you Stop, your utility is equal to your total points (up to 7), or zero if you get a total of 8 or higher. When you Toss, you receive no utility. There is no discount (γ = 1).

i. What are the states and the actions for this MDP?

State:

Action:

ii. What is the transition function and the reward function for this MDP?

Transition function:

A.

B.

C.

D.

E.

Reward function:

A.

B.

C.

iii. Give an intuitively good policy for this problem (you do not need to calculate the optimal policy).

Homework Answers

Answer #1

(i). What are the states and the actions for this MDP?

State: the current point total (the utility you would receive by stopping now), plus a terminal state; that is, 0, 1, 2, 3, 4, 5, 6, 7, DONE

Action: Toss, Stop

(ii). What is the transition function and the reward function for this MDP?

Transition function:

T(Si , TOSS, Si+3) = 0.75 if i ≤ 4

T(Si , TOSS, DONE) = 0.75 if i ≥ 5 (a head pushes the total to 8 or higher)

T(Si , TOSS, Si+1) = 0.25 if i < 7

T(Si , TOSS, DONE) = 0.25 if i = 7 (a tail pushes the total to 8)

T(Si , STOP, DONE) = 1
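
As a cross-check, here is a minimal Python sketch of a transition function built directly from the game rules in the question (a head adds 3, a tail adds 1, and any total above 7 ends the game). The names `transitions`, `DONE`, and the constants are illustrative choices, not part of the original problem.

```python
HEAD_P, TAIL_P = 0.75, 0.25   # coin bias from the question
HEAD_PTS, TAIL_PTS = 3, 1     # points per head / tail
MAX_TOTAL = 7                 # totals above this end the game with zero utility
DONE = "DONE"                 # terminal state label (illustrative name)


def transitions(state, action):
    """Return {next_state: probability} for a non-terminal total `state`."""
    if action == "STOP":
        return {DONE: 1.0}
    out = {}
    for pts, p in ((HEAD_PTS, HEAD_P), (TAIL_PTS, TAIL_P)):
        nxt = state + pts
        nxt = nxt if nxt <= MAX_TOTAL else DONE  # going past 7 ends the game
        out[nxt] = out.get(nxt, 0.0) + p
    return out


# Sanity check: outgoing probabilities sum to 1 for every state and action.
for s in range(MAX_TOTAL + 1):
    for a in ("TOSS", "STOP"):
        assert abs(sum(transitions(s, a).values()) - 1.0) < 1e-12
```

Calling `transitions(s, "TOSS")` for each total `s` is a quick way to compare a hand-written transition table against the rules.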

Reward function:

R(Si , TOSS, ANY ) = 0

R(Si , STOP, DONE) = i

R(DONE, STOP, DONE) = 0
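
The reward function above can be written as a small Python function. This is only a sketch; the function name `reward` and the `DONE` label are assumptions made for illustration.

```python
DONE = "DONE"  # terminal state label (illustrative name)


def reward(state, action, next_state):
    """Utility is collected only when stopping from a non-terminal total."""
    if state != DONE and action == "STOP":
        return state  # R(Si, STOP, DONE) = i
    return 0          # tossing, or any move from DONE, pays nothing
```

For example, `reward(5, "STOP", DONE)` returns 5, while `reward(5, "TOSS", 6)` returns 0.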

(iii). Give an intuitively good policy for this problem

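Policy: Toss for 0, 1, 2, 3, 4; Stop for 5, 6, 7. Intuitively, tossing is safe while a head cannot push the total past 7 (i ≤ 4). From 5 or 6, a head (probability 0.75) takes the total to 8 or higher and the utility drops to zero, so stopping is better.

This policy is also optimal. Starting from all values equal to zero, value iteration stops changing after iteration 6. The converged values are as follows:

V6:

0: 6.14 from Toss; 1: 6.20 from Toss; 2: 5.28 from Toss; 3: 6.125 from Toss; 4: 6.5 from Toss; 5: 5 from Stop; 6: 6 from Stop; 7: 7 from Stop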
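
One way to check any candidate policy is to run value iteration on the game exactly as stated in the question (γ = 1, a total above 7 is worth zero). The sketch below is a minimal, self-contained version; the function name `value_iteration` and its parameters `tol` and `max_iters` are illustrative assumptions.

```python
HEAD_P, TAIL_P = 0.75, 0.25  # coin bias from the question
MAX_TOTAL = 7                # totals above this are terminal and worth zero


def value_iteration(tol=1e-12, max_iters=100):
    """Return (values, policy, iterations) for totals 0..MAX_TOTAL, gamma = 1."""
    values = [0.0] * (MAX_TOTAL + 1)

    def v(total, vals):
        # Value of landing on `total`; busting (total > MAX_TOTAL) is worth 0.
        return vals[total] if total <= MAX_TOTAL else 0.0

    for it in range(1, max_iters + 1):
        new_values, policy = [], []
        for s in range(MAX_TOTAL + 1):
            q_stop = float(s)  # stopping pays the current total
            q_toss = HEAD_P * v(s + 3, values) + TAIL_P * v(s + 1, values)
            new_values.append(max(q_stop, q_toss))
            policy.append("TOSS" if q_toss > q_stop else "STOP")
        if max(abs(a - b) for a, b in zip(new_values, values)) < tol:
            # Nothing changed in this sweep, so the previous one was the last update.
            return new_values, policy, it - 1
        values = new_values
    return values, policy, max_iters


values, policy, iterations = value_iteration()
for total, (val, act) in enumerate(zip(values, policy)):
    print(f"{total}: {val:.4f} from {act}")
```

The loop keeps the Stop and Toss values separate so that the greedy action for each total can be read off alongside its value.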
