In a coin game, you repeatedly toss a biased coin (0.75 for head, 0.25 for tail)....

Question

In a coin game, you repeatedly toss a biased coin (probability 0.75 for heads, 0.25 for tails). Each head is worth 3 points and each tail is worth 1 point. You can either Toss or Stop if the total number of points you have tossed is no more than 7. Otherwise, you must Stop. When you Stop, your utility is equal to your total points (up to 7), or zero if your total is 8 or higher. When you Toss, you receive no utility. There is no discount (γ = 1).

i. What are the states and the actions for this MDP?

State:

Action:

ii. What is the transition function and the reward function for this MDP?

Transition function:

A.

B.

C.

D.

E.

Reward function:

A.

B.

C.

iii. Give an intuitively good policy for this problem (you do not need to calculate the optimal policy).

Answer #1

(i). What are the states and the actions for this MDP?

State: the current point total, plus a terminal state, that is, S0, S1, S2, S3, S4, S5, S6, S7, DONE (Si means a total of i points)

Action: Toss, Stop

(ii). What is the transition function and the reward function for this MDP?

Transition function:

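A head adds 3 points, so the total stays at or below 7 only when i ≤ 4. A total of 8 or higher forces a Stop with zero utility, which is modeled here as a move to DONE.

T(Si, TOSS, Si+3) = 0.75 if i ≤ 4

T(Si, TOSS, DONE) = 0.75 if 5 ≤ i ≤ 6

T(Si, TOSS, Si+1) = 0.25 if i < 7

T(Si, TOSS, DONE) = 1 if i = 7

T(Si, STOP, DONE) = 1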
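The transition model can be checked mechanically. The following is a minimal Python sketch built only from the rules in the question (heads add 3 with probability 0.75, tails add 1 with probability 0.25, any total above 7 ends the game). The names `transitions`, `DONE`, and `MAX_TOTAL` are illustrative assumptions, not part of the original answer.

```python
# Hypothetical helper names; the rules come from the problem statement.
MAX_TOTAL = 7            # largest total that still earns utility
P_HEAD, P_TAIL = 0.75, 0.25
DONE = "DONE"            # terminal state


def transitions(state, action):
    """Return {next_state: probability} for a state (int 0..7 or DONE)."""
    if action not in ("TOSS", "STOP"):
        raise ValueError(f"unknown action: {action}")
    if state == DONE or action == "STOP":
        # Stopping (or already being finished) always leads to DONE.
        return {DONE: 1.0}
    dist = {}
    for points, prob in ((3, P_HEAD), (1, P_TAIL)):
        total = state + points
        # Totals above 7 are worth nothing, so they collapse into DONE.
        nxt = total if total <= MAX_TOTAL else DONE
        dist[nxt] = dist.get(nxt, 0.0) + prob
    return dist


if __name__ == "__main__":
    for s in range(MAX_TOTAL + 1):
        print(s, transitions(s, "TOSS"))
```

Merging both overshoot outcomes into `DONE` is why tossing from a total of 7 reaches `DONE` with probability 1.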

Reward function:

R(Si, TOSS, ANY) = 0

R(Si, STOP, DONE) = i

R(DONE, STOP, DONE) = 0
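As a small illustration, the reward function can be written as a Python function of (state, action, next state). This is only a sketch under the assumption that states are the integers 0 to 7 plus a `DONE` marker; the function name `reward` is hypothetical.

```python
DONE = "DONE"  # terminal state marker (illustrative name)


def reward(state, action, next_state):
    """Reward for one step: only stopping from a live total pays out."""
    if state == DONE:
        return 0        # nothing more can be earned after the game ends
    if action == "STOP":
        return state    # utility equals the current total (0..7)
    return 0            # tossing never pays, even when it overshoots


if __name__ == "__main__":
    print(reward(5, "STOP", DONE))   # stopping at a total of 5
    print(reward(5, "TOSS", DONE))   # overshooting after a toss
```

Because an overshoot is folded into the transition to `DONE`, the zero utility for totals of 8 or higher needs no separate reward case.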

(iii). Give an intuitively good policy for this problem

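Optimal policy: Toss for 0, 1, 2, 3, 4; Stop for 5, 6, 7

Intuition: from 4 or below a head cannot push the total past 7, and tossing is worth more than stopping (for example, at 4 it is worth 0.75 × 7 + 0.25 × 5 = 6.5 > 4). From 5, 6 or 7 a head pushes the total to 8 or higher, so a toss loses everything with probability 0.75.

Starting from V0 = 0, value iteration converges at iteration 6. The result of iteration 6 is as follows:

V6:

0: 6.14 from Toss; 1: 6.20 from Toss; 2: 5.28 from Toss; 3: 6.125 from Toss; 4: 6.5 from Toss; 5: 5 from Stop; 6: 6 from Stop; 7: 7 from Stop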
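The backup V(i) = max(i, 0.75 · V(i+3) + 0.25 · V(i+1)), with any total above 7 worth zero, can be run directly. Below is a minimal value-iteration sketch in Python based only on the rules stated in the question; the function name, the tolerance, and the iteration cap are assumptions made for illustration.

```python
MAX_TOTAL = 7
P_HEAD, P_TAIL = 0.75, 0.25


def value_iteration(tol=1e-12, max_iters=100):
    """Return (values, policy) for totals 0..7 with no discount."""
    values = [0.0] * (MAX_TOTAL + 1)
    policy = ["STOP"] * (MAX_TOTAL + 1)
    for _ in range(max_iters):
        new_values = []
        for total in range(MAX_TOTAL + 1):
            stop_value = float(total)  # Stop pays the current total
            toss_value = 0.0           # Toss pays nothing by itself
            for points, prob in ((3, P_HEAD), (1, P_TAIL)):
                nxt = total + points
                if nxt <= MAX_TOTAL:   # overshooting contributes zero
                    toss_value += prob * values[nxt]
            if toss_value > stop_value:
                policy[total] = "TOSS"
                new_values.append(toss_value)
            else:
                policy[total] = "STOP"
                new_values.append(stop_value)
        change = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if change < tol:               # values stopped moving
            break
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    for total, (v, a) in enumerate(zip(values, policy)):
        print(f"{total}: {v:.4f} from {a}")
```

The sweep is synchronous (each update reads the previous iteration's values), which matches the usual textbook statement of value iteration.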
