CS221 Midterm 3

(d) [3 points] An MDP has a reward function R, optimal value function V∗, and optimal policy π∗. Consider the following new reward functions:

i. R1(s) = R(s) + 10.
ii. A reward function R2 such that whenever R(s1) > R(s2) for two states s1 and s2, we also have R2(s1) > R2(s2).
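
One way to build intuition for the two reward functions is to try them on a tiny example. The sketch below is not part of the exam. It assumes an infinite-horizon MDP with discount γ = 0.9, state-based rewards, and no terminal states, none of which the problem statement fixes. The 4-state MDP, the helper `value_iteration`, and the particular choice of R2 are all hypothetical, picked only for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-10):
    """Value iteration for state-based rewards.

    P[a, s, t] is the probability of moving from s to t under action a,
    R[s] is the reward for being in state s.
    Returns (V, greedy policy) once successive iterates differ by < tol.
    """
    V = np.zeros(len(R))
    while True:
        Q = R[None, :] + gamma * (P @ V)   # Q[a, s], shape (actions, states)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

gamma = 0.9

# Hypothetical deterministic MDP with 2 actions and 4 states.
# From state 0: action 0 leads to state 1, action 1 leads to state 2.
# State 1 loops forever, state 2 moves to state 3, state 3 loops forever.
P = np.zeros((2, 4, 4))
P[0, 0, 1] = 1.0
P[1, 0, 2] = 1.0
P[:, 1, 1] = 1.0
P[:, 2, 3] = 1.0
P[:, 3, 3] = 1.0

R = np.array([1.0, 2.0, 3.0, 0.0])     # original rewards
R1 = R + 10.0                          # case i: constant shift
R2 = np.array([1.0, 2.0, 100.0, 0.0])  # case ii: same ordering of states as R

V, pi = value_iteration(P, R, gamma)
V1, pi1 = value_iteration(P, R1, gamma)
V2, pi2 = value_iteration(P, R2, gamma)

print("shift in V under R1:", V1 - V)             # compare with 10 / (1 - gamma)
print("action at state 0 under R, R1, R2:", pi[0], pi1[0], pi2[0])
```

Under these assumptions, every entry of V under R1 exceeds the original by 10 / (1 − γ) and the greedy action at state 0 is unchanged, while the order-preserving R2 flips the greedy action at state 0. With a finite horizon or terminal states the behavior under R1 can differ, so the sketch is only a sanity check for this particular setting.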