Learning Objectives

1. Understand the concepts of a random variable and a probability distribution.
2. Be able to distinguish between discrete and continuous random variables.
3. Calculate probabilities for multiple random variables based on joint probability functions.
4. Be able to compute and interpret the expected value, variance, and standard deviation for a discrete random variable.

DISCRETE RANDOM VARIABLES

DEFINITION: A discrete random variable is a function X(s) from a finite or countably infinite sample space S to the real numbers:

    X(·) : S → R.

EXAMPLE: Toss a coin 3 times in sequence. The sample space is

    S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT},

and examples of random variables are

• X(s) = the number of Heads in the sequence; e.g., X(HTH) = 2;
• Y(s) = the index of the first H, and 0 if the sequence has no H; e.g., Y(TTH) = 3, Y(TTT) = 0.

NOTE: In this example X(s) and Y(s) are actually integers.

Value-ranges of a random variable correspond to events in S.

EXAMPLE: For the sample space

    S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT},

with X(s) = the number of Heads, the value X(s) = 2 corresponds to the event {HHT, HTH, THH}, and the values 1 < X(s) ≤ 3 correspond to {HHH, HHT, HTH, THH}.

NOTATION: If it is clear what S is, then we often just write X instead of X(s).

Value-ranges of a random variable correspond to events in S, and events in S have a probability. Thus value-ranges of a random variable have a probability.

EXAMPLE: For the sample space S (as above) with X(s) = the number of Heads, we have

    P(0 < X ≤ 2) = 6/8.

QUESTION: What are the values of P(X ≤ −1), P(X ≤ 0), P(X ≤ 1), P(X ≤ 2), P(X ≤ 3), P(X ≤ 4)?

NOTATION: We will also write pX(x) to denote P(X = x).

EXAMPLE: For the sample space S (as above) with X(s) = the number of Heads, we have

    pX(0) ≡ P({TTT})           = 1/8,
    pX(1) ≡ P({HTT, THT, TTH}) = 3/8,
    pX(2) ≡ P({HHT, HTH, THH}) = 3/8,
    pX(3) ≡ P({HHH})           = 1/8,

where pX(0) + pX(1) + pX(2) + pX(3) = 1. (Why?)
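The probability mass function values above can be checked by brute-force enumeration of the eight equally likely outcomes. This is an illustrative sketch, not part of the notes; the names `S` and `pX` are ours.

```python
from itertools import product
from fractions import Fraction

# Enumerate the sample space S of 3 coin tosses; each outcome has probability 1/8.
S = ["".join(t) for t in product("HT", repeat=3)]
prob = Fraction(1, len(S))

# X(s) = the number of Heads in the sequence.
pX = {}
for s in S:
    x = s.count("H")
    pX[x] = pX.get(x, Fraction(0)) + prob

# pX(0..3) = 1/8, 3/8, 3/8, 1/8, and the values sum to 1.
assert pX == {0: Fraction(1, 8), 1: Fraction(3, 8),
              2: Fraction(3, 8), 3: Fraction(1, 8)}
assert sum(pX.values()) == 1
```

Using exact fractions avoids any floating-point round-off in the probability sums.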
[Figure: graphical representation of X, mapping the eight outcomes in S to the values 0, 1, 2, 3 via the events E0, E1, E2, E3.]

The events E0, E1, E2, E3 are disjoint since X(s) is a function!
(X : S → R must be defined for all s ∈ S and must be single-valued.)

DEFINITION: pX(x) ≡ P(X = x) is called the probability mass function.

DEFINITION: FX(x) ≡ P(X ≤ x) is called the (cumulative) probability distribution function.

PROPERTIES:
• FX(x) is a non-decreasing function of x. (Why?)
• FX(−∞) = 0 and FX(∞) = 1. (Why?)
• P(a < X ≤ b) = FX(b) − FX(a). (Why?)

NOTATION: When it is clear what X is, then we also write p(x) for pX(x) and F(x) for FX(x).

EXAMPLE: With S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} and X(s) = the number of Heads, we have the probability mass function

    p(0) = 1/8, p(1) = 3/8, p(2) = 3/8, p(3) = 1/8,

[Figure: the graph of pX.]

and the probability distribution function

    F(−1) ≡ P(X ≤ −1) = 0,
    F( 0) ≡ P(X ≤  0) = 1/8,
    F( 1) ≡ P(X ≤  1) = 4/8,
    F( 2) ≡ P(X ≤  2) = 7/8,
    F( 3) ≡ P(X ≤  3) = 1,
    F( 4) ≡ P(X ≤  4) = 1.

We see, for example, that

    P(0 < X ≤ 2) = P(X = 1) + P(X = 2) = F(2) − F(0) = 7/8 − 1/8 = 6/8.

[Figure: the graph of the probability distribution function FX.]

EXAMPLE: Toss a coin until "Heads" occurs. Then the sample space is countably infinite, namely

    S = {H, TH, TTH, TTTH, ···}.

The random variable X is the number of tosses until "Heads" occurs:

    X(H) = 1, X(TH) = 2, X(TTH) = 3, ···

Then p(1) = 1/2, p(2) = 1/4, p(3) = 1/8, ··· and

    F(n) = P(X ≤ n) = Σ_{k=1}^{n} p(k) = Σ_{k=1}^{n} 1/2^k = 1 − 1/2^n,

and, as should be the case,

    Σ_{k=1}^{∞} p(k) = lim_{n→∞} Σ_{k=1}^{n} p(k) = lim_{n→∞} (1 − 1/2^n) = 1.

NOTE: The outcomes in S do not have equal probability! (Why?)

EXERCISE: Draw the probability mass and distribution functions.
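Both distribution functions above can be sketched in code: the three-toss F is a running sum of the pmf, and the toss-until-Heads F has the closed form 1 − 1/2^n. A minimal sketch (our own helper names, not from the notes):

```python
from fractions import Fraction

# pmf of X = number of Heads in 3 tosses.
p = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def F(x):
    """F(x) = P(X <= x): sum the pmf over all values not exceeding x."""
    return sum(pk for k, pk in p.items() if k <= x)

# F is non-decreasing, F(-1) = 0, F(3) = F(4) = 1, and
# P(0 < X <= 2) = F(2) - F(0) = 7/8 - 1/8 = 6/8.
assert F(-1) == 0 and F(0) == Fraction(1, 8) and F(1) == Fraction(4, 8)
assert F(2) == Fraction(7, 8) and F(3) == 1 and F(4) == 1
assert F(2) - F(0) == Fraction(6, 8)

# Toss-until-Heads: p(k) = 1/2^k, so F(n) = 1 - 1/2^n.
assert sum(Fraction(1, 2**k) for k in range(1, 11)) == 1 - Fraction(1, 2**10)
```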
REMARK: We can also take S ≡ Sn, the set of all ordered outcomes of length n. For example, for n = 4,

    S4 = { H̃HHH, H̃HHT, H̃HTH, H̃HTT, H̃THH, H̃THT, H̃TTH, H̃TTT,
           TH̃HH, TH̃HT, TH̃TH, TH̃TT, TTH̃H, TTH̃T, TTTH̃, TTTT },

where for each outcome the first "Heads" is marked as H̃. As before, X(s) is the number of tosses until "Heads" occurs (0 for TTTT). Each outcome in S4 has equal probability 2^{−n} (here 2^{−4} = 1/16), and

    pX(1) = 1/2, pX(2) = 1/4, pX(3) = 1/8, pX(4) = 1/16, ···,

independent of n.

Joint distributions

The probability mass function and the probability distribution function can also be functions of more than one variable.

EXAMPLE: Toss a coin 3 times in sequence. For the sample space

    S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT},

we let

    X(s) = # Heads, and Y(s) = index of the first H (0 for TTT).

Then we have the joint probability mass function

    pX,Y(x, y) = P(X = x, Y = y).

For example,

    pX,Y(2, 1) = P(X = 2, Y = 1) = P(2 Heads, 1st toss is Heads) = 2/8 = 1/4.

We can list the values of pX,Y(x, y):

    Joint probability mass function pX,Y(x, y)

             y=0    y=1    y=2    y=3  | pX(x)
    x=0      1/8     0      0      0   |  1/8
    x=1       0     1/8    1/8    1/8  |  3/8
    x=2       0     2/8    1/8     0   |  3/8
    x=3       0     1/8     0      0   |  1/8
    -----------------------------------------
    pY(y)    1/8    4/8    2/8    1/8  |   1

NOTE:
• The marginal probability pX is the probability mass function of X.
• The marginal probability pY is the probability mass function of Y.
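The joint table above can also be generated by enumeration, with the marginals obtained as row sums, which is exactly how the pX column of the table arises. A sketch under our own naming:

```python
from itertools import product
from fractions import Fraction

# Enumerate 3 tosses; X = number of Heads, Y = index of the first H (0 if none).
S = ["".join(t) for t in product("HT", repeat=3)]
prob = Fraction(1, len(S))

pXY = {}
for s in S:
    x = s.count("H")
    y = s.index("H") + 1 if "H" in s else 0
    pXY[(x, y)] = pXY.get((x, y), Fraction(0)) + prob

# p_{X,Y}(2,1) = P({HHT, HTH}) = 2/8.
assert pXY[(2, 1)] == Fraction(2, 8)

# The marginal pX is the row sum over y, recovering the earlier pmf of X.
pX = {x: sum(v for (a, _), v in pXY.items() if a == x) for x in range(4)}
assert pX == {0: Fraction(1, 8), 1: Fraction(3, 8),
              2: Fraction(3, 8), 3: Fraction(1, 8)}
```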
EXAMPLE: (continued ···) With X(s) = number of Heads and Y(s) = index of the first H, we can read events off the joint table. For example,

• X = 2 corresponds to the event {HHT, HTH, THH}.
• Y = 1 corresponds to the event {HHH, HHT, HTH, HTT}.
• (X = 2 and Y = 1) corresponds to the event {HHT, HTH}.

QUESTION: Are the events X = 2 and Y = 1 independent?

EXAMPLE: (continued ···)

[Figure: the events Ei,j in S, with X(s) on one axis and Y(s) on the other.]

The events Ei,j ≡ {s ∈ S : X(s) = i, Y(s) = j} are disjoint.

QUESTION: Are the events X = 2 and Y = 1 independent?

DEFINITION: pX,Y(x, y) ≡ P(X = x, Y = y) is called the joint probability mass function.

DEFINITION: FX,Y(x, y) ≡ P(X ≤ x, Y ≤ y) is called the joint (cumulative) probability distribution function.

NOTATION: When it is clear what X and Y are, then we also write p(x, y) for pX,Y(x, y) and F(x, y) for FX,Y(x, y).

EXAMPLE: Three tosses: X(s) = # Heads, Y(s) = index of the 1st H.

    Joint probability mass function pX,Y(x, y)

             y=0    y=1    y=2    y=3  | pX(x)
    x=0      1/8     0      0      0   |  1/8
    x=1       0     1/8    1/8    1/8  |  3/8
    x=2       0     2/8    1/8     0   |  3/8
    x=3       0     1/8     0      0   |  1/8
    -----------------------------------------
    pY(y)    1/8    4/8    2/8    1/8  |   1

    Joint distribution function FX,Y(x, y) ≡ P(X ≤ x, Y ≤ y)

             y=0    y=1    y=2    y=3  | FX(·)
    x=0      1/8    1/8    1/8    1/8  |  1/8
    x=1      1/8    2/8    3/8    4/8  |  4/8
    x=2      1/8    4/8    6/8    7/8  |  7/8
    x=3      1/8    5/8    7/8     1   |   1
    -----------------------------------------
    FY(·)    1/8    5/8    7/8     1   |

Note that the distribution function FX is a copy of the last (y = 3) column, and the distribution function FY is a copy of the last (x = 3) row. (Why?)
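The joint cdf table and the independence question can both be settled mechanically from the joint pmf. A sketch (the function `F` and variable names are ours):

```python
from itertools import product
from fractions import Fraction

# Joint pmf of X = # Heads, Y = index of first H, for 3 tosses.
S = ["".join(t) for t in product("HT", repeat=3)]
p = {}
for s in S:
    key = (s.count("H"), s.index("H") + 1 if "H" in s else 0)
    p[key] = p.get(key, Fraction(0)) + Fraction(1, 8)

def F(x, y):
    """Joint cdf F(x, y) = P(X <= x, Y <= y)."""
    return sum(v for (a, b), v in p.items() if a <= x and b <= y)

# The last column of the F-table (y = 3) is the marginal cdf of X.
assert [F(x, 3) for x in range(4)] == [Fraction(1, 8), Fraction(4, 8),
                                       Fraction(7, 8), 1]

# The events X = 2 and Y = 1 are NOT independent:
# P(X=2, Y=1) = 2/8, while P(X=2) * P(Y=1) = 3/8 * 4/8 = 3/16.
assert p[(2, 1)] == Fraction(2, 8)
assert p[(2, 1)] != Fraction(3, 8) * Fraction(4, 8)
```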
In the preceding example (see the tables of pX,Y and FX,Y above):

QUESTION: Why is

    P(1 < X ≤ 3, 1 < Y ≤ 3) = F(3, 3) − F(1, 3) − F(3, 1) + F(1, 1) ?

EXERCISE: Roll a four-sided die (tetrahedron) two times. (The sides are marked 1, 2, 3, 4.) Suppose each of the four sides is equally likely to end facing down, and suppose the outcome of a single roll is the side that faces down (!). Define the random variables X and Y as

    X = result of the first roll, Y = sum of the two rolls.

• What is a good choice of the sample space S?
• How many outcomes are there in S?
• List the values of the joint probability mass function pX,Y(x, y).
• List the values of the joint cumulative distribution function FX,Y(x, y).

EXERCISE: Three balls are selected at random from a bag containing

    2 red, 3 green, 4 blue balls.

Define the random variables

    R(s) = the number of red balls drawn, and G(s) = the number of green balls drawn.

List the values of
• the joint probability mass function pR,G(r, g),
• the marginal probability mass functions pR(r) and pG(g),
• the joint distribution function FR,G(r, g),
• the marginal distribution functions FR(r) and FG(g).
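The inclusion-exclusion identity asked about above can be verified numerically for the three-toss example: summing the joint pmf over the rectangle 1 < x ≤ 3, 1 < y ≤ 3 gives the same value as the four-term combination of F. A sketch with our own helper names:

```python
from itertools import product
from fractions import Fraction

# Joint pmf of X = # Heads, Y = index of first H, for 3 tosses.
S = ["".join(t) for t in product("HT", repeat=3)]
p = {}
for s in S:
    key = (s.count("H"), s.index("H") + 1 if "H" in s else 0)
    p[key] = p.get(key, Fraction(0)) + Fraction(1, 8)

def F(x, y):
    """Joint cdf F(x, y) = P(X <= x, Y <= y)."""
    return sum(v for (a, b), v in p.items() if a <= x and b <= y)

# Direct sum over the rectangle 1 < x <= 3, 1 < y <= 3 ...
direct = sum(v for (a, b), v in p.items() if 1 < a <= 3 and 1 < b <= 3)

# ... agrees with the inclusion-exclusion formula in terms of F.
assert direct == F(3, 3) - F(1, 3) - F(3, 1) + F(1, 1) == Fraction(1, 8)
```

Subtracting F(1, 3) and F(3, 1) removes the strips x ≤ 1 and y ≤ 1, and adding F(1, 1) restores their doubly removed corner.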
Independent random variables

Two discrete random variables X(s) and Y(s) are independent if

    P(X = x, Y = y) = P(X = x) · P(Y = y), for all x and y,

or, equivalently, if their probability mass functions satisfy

    pX,Y(x, y) = pX(x) · pY(y), for all x and y,

or, equivalently, if the events

    Ex ≡ X^{−1}({x}) and Ey ≡ Y^{−1}({y})

are independent in the sample space S, i.e.,

    P(Ex Ey) = P(Ex) · P(Ey), for all x and y.

NOTE:
• In the current discrete case, x and y are typically integers.
• X^{−1}({x}) ≡ {s ∈ S : X(s) = x}.

Three tosses: X(s) = # Heads, Y(s) = index of the 1st H.

[Figure: the events Ei,j in S, as before.]

• What are the values of pX(2), pY(1), pX,Y(2, 1)?
• Are X and Y independent?

EXERCISE: Roll a die two times in a row, and let

    X be the result of the 1st roll, and Y the result of the 2nd roll.

Are X and Y independent, i.e., is

    pX,Y(k, ℓ) = pX(k) · pY(ℓ), for all 1 ≤ k, ℓ ≤ 6 ?

RECALL: X(s) and Y(s) are independent if for all x and y:

    pX,Y(x, y) = pX(x) · pY(y).

EXERCISE: Are these random variables X and Y independent?

    Joint probability mass function pX,Y(x, y)

             y=0    y=1    y=2    y=3  | pX(x)
    x=0      1/8     0      0      0   |  1/8
    x=1       0     1/8    1/8    1/8  |  3/8
    x=2       0     2/8    1/8     0   |  3/8
    x=3       0     1/8     0      0   |  1/8
    -----------------------------------------
    pY(y)    1/8    4/8    2/8    1/8  |   1

EXERCISE: Are these random variables X and Y independent?

    Joint probability mass function pX,Y(x, y)

             y=1    y=2    y=3   | pX(x)
    x=1      1/3    1/12   1/12  |  1/2
    x=2      2/9    1/18   1/18  |  1/3
    x=3      1/9    1/36   1/36  |  1/6
    -----------------------------------
    pY(y)    2/3    1/6    1/6   |   1

QUESTION: Is FX,Y(x, y) = FX(x) · FY(y) ?

    Joint distribution function FX,Y(x, y) ≡ P(X ≤ x, Y ≤ y)

             y=1    y=2     y=3  | FX(x)
    x=1      1/3    5/12    1/2  |  1/2
    x=2      5/9    25/36   5/6  |  5/6
    x=3      2/3    5/6      1   |   1
    ------------------------------------
    FY(y)    2/3    5/6      1   |

PROPERTY: The joint distribution function of independent random variables X and Y satisfies

    FX,Y(x, y) = FX(x) · FY(y), for all x, y.

PROOF:
    FX,Y(xk, yℓ) = P(X ≤ xk, Y ≤ yℓ)
                 = Σ_{i≤k} Σ_{j≤ℓ} pX,Y(xi, yj)
                 = Σ_{i≤k} Σ_{j≤ℓ} pX(xi) · pY(yj)        (by independence)
                 = Σ_{i≤k} { pX(xi) · Σ_{j≤ℓ} pY(yj) }
                 = { Σ_{i≤k} pX(xi) } · { Σ_{j≤ℓ} pY(yj) }
                 = FX(xk) · FY(yℓ).
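For the second exercise table above, the factorization test can be run entry by entry: compute both marginals and check pX,Y(x, y) = pX(x) · pY(y) everywhere. A sketch (table entries as reconstructed above; the dictionary layout is ours):

```python
from fractions import Fraction as Fr

# Joint pmf from the exercise table (rows x = 1..3, columns y = 1..3).
p = {
    (1, 1): Fr(1, 3), (1, 2): Fr(1, 12), (1, 3): Fr(1, 12),
    (2, 1): Fr(2, 9), (2, 2): Fr(1, 18), (2, 3): Fr(1, 18),
    (3, 1): Fr(1, 9), (3, 2): Fr(1, 36), (3, 3): Fr(1, 36),
}
pX = {x: sum(p[(x, y)] for y in (1, 2, 3)) for x in (1, 2, 3)}
pY = {y: sum(p[(x, y)] for x in (1, 2, 3)) for y in (1, 2, 3)}

# Marginals: pX = (1/2, 1/3, 1/6), pY = (2/3, 1/6, 1/6).
assert (pX[1], pX[2], pX[3]) == (Fr(1, 2), Fr(1, 3), Fr(1, 6))
assert (pY[1], pY[2], pY[3]) == (Fr(2, 3), Fr(1, 6), Fr(1, 6))

# Every entry factors as pX(x) * pY(y), so these X and Y ARE independent.
assert all(p[(x, y)] == pX[x] * pY[y] for x in (1, 2, 3) for y in (1, 2, 3))
```

Running the same loop on the three-toss table fails at (x, y) = (2, 1), confirming that the first pair is dependent.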
Conditional distributions

Let X and Y be discrete random variables with joint probability mass function pX,Y(x, y). For given x and y, let

    Ex = X^{−1}({x}) and Ey = Y^{−1}({y})

be their corresponding events in the sample space S. Then

    P(Ex | Ey) ≡ P(Ex Ey) / P(Ey) = pX,Y(x, y) / pY(y).

Thus it is natural to define the conditional probability mass function

    pX|Y(x|y) ≡ P(X = x | Y = y) = pX,Y(x, y) / pY(y).

Three tosses: X(s) = # Heads, Y(s) = index of the 1st H.

[Figure: the events Ei,j in S, as before.]

• What are the values of P(X = 2 | Y = 1) and P(Y = 1 | X = 2)?

EXAMPLE: (3 tosses: X(s) = # Heads, Y(s) = index of the 1st H.)

    Joint probability mass function pX,Y(x, y)

             y=0    y=1    y=2    y=3  | pX(x)
    x=0      1/8     0      0      0   |  1/8
    x=1       0     1/8    1/8    1/8  |  3/8
    x=2       0     2/8    1/8     0   |  3/8
    x=3       0     1/8     0      0   |  1/8
    -----------------------------------------
    pY(y)    1/8    4/8    2/8    1/8  |   1

    Conditional probability mass function pX|Y(x|y) = pX,Y(x, y) / pY(y)

             y=0    y=1    y=2    y=3
    x=0       1      0      0      0
    x=1       0     1/4    1/2     1
    x=2       0     1/2    1/2     0
    x=3       0     1/4     0      0
    ---------------------------------
              1      1      1      1

EXERCISE: Also construct the table for pY|X(y|x) = pX,Y(x, y) / pX(x).

EXAMPLE:

    Joint probability mass function pX,Y(x, y)

             y=1    y=2    y=3   | pX(x)
    x=1      1/3    1/12   1/12  |  1/2
    x=2      2/9    1/18   1/18  |  1/3
    x=3      1/9    1/36   1/36  |  1/6
    -----------------------------------
    pY(y)    2/3    1/6    1/6   |   1

    Conditional probability mass function pX|Y(x|y) = pX,Y(x, y) / pY(y)

             y=1    y=2    y=3
    x=1      1/2    1/2    1/2
    x=2      1/3    1/3    1/3
    x=3      1/6    1/6    1/6
    --------------------------
              1      1      1

QUESTION: What does the last table tell us?

EXERCISE: Also construct the table for P(Y = y | X = x).

Expectation

The expected value of a discrete random variable X is

    E[X] ≡ Σ_k xk · P(X = xk) = Σ_k xk · pX(xk).

Thus E[X] represents the weighted average value of X. (E[X] is also called the mean of X.)

EXAMPLE: The expected value of rolling a die is

    E[X] = 1 · 1/6 + 2 · 1/6 + ··· + 6 · 1/6 = (1/6) Σ_{k=1}^{6} k = 7/2.

EXERCISE: Prove the following:
• E[aX] = a E[X],
• E[aX + b] = a E[X] + b.
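The conditional table for the three-toss example can be reproduced by dividing each joint entry by the appropriate marginal; each column of the result must sum to 1. A sketch under our own naming:

```python
from itertools import product
from fractions import Fraction

# Joint pmf of X = # Heads, Y = index of first H (3 tosses), as before.
S = ["".join(t) for t in product("HT", repeat=3)]
p = {}
for s in S:
    key = (s.count("H"), s.index("H") + 1 if "H" in s else 0)
    p[key] = p.get(key, Fraction(0)) + Fraction(1, 8)
pY = {y: sum(v for (_, b), v in p.items() if b == y) for y in range(4)}

def p_cond(x, y):
    """p_{X|Y}(x|y) = p_{X,Y}(x, y) / p_Y(y)."""
    return p.get((x, y), Fraction(0)) / pY[y]

# Given Y = 1 (first toss is Heads): X is 1, 2, 3 with probs 1/4, 1/2, 1/4,
# so in particular P(X = 2 | Y = 1) = 1/2.
assert [p_cond(x, 1) for x in (1, 2, 3)] == [Fraction(1, 4), Fraction(1, 2),
                                             Fraction(1, 4)]
# Each conditional pmf sums to 1.
assert all(sum(p_cond(x, y) for x in range(4)) == 1 for y in range(4))
```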
EXAMPLE: Toss a coin until "Heads" occurs. Then

    S = {H, TH, TTH, TTTH, ···},

and the random variable X is the number of tosses until "Heads" occurs:

    X(H) = 1, X(TH) = 2, X(TTH) = 3, ···

Then

    E[X] = 1 · 1/2 + 2 · 1/4 + 3 · 1/8 + ··· = lim_{n→∞} Σ_{k=1}^{n} k/2^k = 2.

     n    Σ_{k=1}^{n} k/2^k
     1    0.50000000
     2    1.00000000
     3    1.37500000
    10    1.98828125
    40    1.99999999

REMARK: Perhaps using Sn = {all sequences of n tosses} is better ···

The expected value of a function of a random variable is

    E[g(X)] ≡ Σ_k g(xk) p(xk).

EXAMPLE: The pay-off of rolling a die is $k², where k is the side facing up. What should the entry fee be for the betting to break even?

SOLUTION: Here g(X) = X², and

    E[g(X)] = Σ_{k=1}^{6} k² · 1/6 = (1/6) · 6(6 + 1)(2 · 6 + 1)/6 = 91/6 ≅ $15.17.

The expected value of a function of two random variables is

    E[g(X, Y)] ≡ Σ_k Σ_ℓ g(xk, yℓ) p(xk, yℓ).

EXAMPLE: For the joint table with pX = (1/2, 1/3, 1/6) and pY = (2/3, 1/6, 1/6):

    E[X] = 1 · 1/2 + 2 · 1/3 + 3 · 1/6 = 5/3,

    E[Y] = 1 · 2/3 + 2 · 1/6 + 3 · 1/6 = 3/2,

    E[XY] = 1 · 1/3  + 2 · 1/12 + 3 · 1/12
          + 2 · 2/9  + 4 · 1/18 + 6 · 1/18
          + 3 · 1/9  + 6 · 1/36 + 9 · 1/36  = 5/2.    (So?)

PROPERTY: If X and Y are independent then E[XY] = E[X] E[Y].

PROOF:
    E[XY] = Σ_k Σ_ℓ xk yℓ pX,Y(xk, yℓ)
          = Σ_k Σ_ℓ xk yℓ pX(xk) pY(yℓ)           (by independence)
          = Σ_k { xk pX(xk) · Σ_ℓ yℓ pY(yℓ) }
          = { Σ_k xk pX(xk) } · { Σ_ℓ yℓ pY(yℓ) }
          = E[X] · E[Y].

EXAMPLE: See the preceding example!

PROPERTY: E[X + Y] = E[X] + E[Y].    (Always!)

PROOF:
    E[X + Y] = Σ_k Σ_ℓ (xk + yℓ) pX,Y(xk, yℓ)
             = Σ_k Σ_ℓ xk pX,Y(xk, yℓ) + Σ_k Σ_ℓ yℓ pX,Y(xk, yℓ)
             = Σ_k { xk Σ_ℓ pX,Y(xk, yℓ) } + Σ_ℓ { yℓ Σ_k pX,Y(xk, yℓ) }
             = Σ_k xk pX(xk) + Σ_ℓ yℓ pY(yℓ)
             = E[X] + E[Y].

NOTE: X and Y need not be independent!
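The die expectations and the partial sums in the table above are easy to verify numerically. A sketch (exact arithmetic via fractions; names are ours):

```python
from fractions import Fraction

# Fair die: E[X] = sum of k * p(k) with p(k) = 1/6, and E[g(X)] with g(X) = X^2.
p = Fraction(1, 6)
EX = sum(k * p for k in range(1, 7))
EX2 = sum(k**2 * p for k in range(1, 7))

assert EX == Fraction(7, 2)                 # mean of one roll
assert EX2 == Fraction(91, 6)               # break-even entry fee, about $15.17
assert EX2 == Fraction(6 * 7 * 13, 36)      # (1/6) * n(n+1)(2n+1)/6 with n = 6

# Toss-until-Heads: the partial sums of k/2^k approach E[X] = 2,
# matching the n = 40 row of the table.
partial = sum(Fraction(k, 2**k) for k in range(1, 41))
assert 0 < 2 - partial < Fraction(1, 10**9)
```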
EXERCISE:

    Probability mass function pX,Y(x, y)

             y=6    y=8    y=10  | pX(x)
    x=1      1/5     0     1/5   |  2/5
    x=2       0     1/5     0    |  1/5
    x=3      1/5     0     1/5   |  2/5
    -----------------------------------
    pY(y)    2/5    1/5    2/5   |   1

• Show that E[X] = 2, E[Y] = 8, E[XY] = 16.
• Thus E[XY] = E[X] E[Y], yet X and Y are not independent: if E[XY] = E[X] E[Y], then it does not necessarily follow that X and Y are independent!

Variance and Standard Deviation

Let X have mean

    μ = E[X].

Then the variance of X is

    Var(X) ≡ E[(X − μ)²] ≡ Σ_k (xk − μ)² p(xk),

which is the average weighted square distance from the mean. We have

    Var(X) = E[X² − 2μX + μ²]
           = E[X²] − 2μ E[X] + μ²
           = E[X²] − 2μ² + μ²
           = E[X²] − μ².

The standard deviation of X is

    σ(X) ≡ √Var(X) = √(E[(X − μ)²]) = √(E[X²] − μ²),

which is the average weighted distance from the mean.

EXAMPLE: The variance of rolling a die is

    Var(X) = Σ_{k=1}^{6} [k² · 1/6] − μ²
           = (1/6) · 6(6 + 1)(2 · 6 + 1)/6 − (7/2)²
           = 91/6 − 49/4 = 35/12.

The standard deviation is

    σ = √(35/12) ≅ 1.70.

Covariance

Let X and Y be random variables with mean

    E[X] = μX, E[Y] = μY.

Then the covariance of X and Y is defined as

    Cov(X, Y) ≡ E[(X − μX)(Y − μY)] = Σ_{k,ℓ} (xk − μX)(yℓ − μY) p(xk, yℓ).

We have

    Cov(X, Y) = E[(X − μX)(Y − μY)]
              = E[XY − μX Y − μY X + μX μY]
              = E[XY] − μX μY − μY μX + μX μY
              = E[XY] − E[X] E[Y].

NOTE: Cov(X, Y) measures "concordance" or "coherence" of X and Y:
• If X > μX when Y > μY and X < μX when Y < μY then Cov(X, Y) > 0.
• If X > μX when Y < μY and X < μX when Y > μY then Cov(X, Y) < 0.

EXERCISE: Prove the following:
• Var(aX + b) = a² Var(X),
• Cov(X, Y) = Cov(Y, X),
• Cov(cX, Y) = c Cov(X, Y),
• Cov(X, cY) = c Cov(X, Y),
• Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z),
• Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
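The die variance can be computed two ways, from the defining formula E[(X − μ)²] and from the shortcut E[X²] − μ², and the two must agree. A sketch with our own names:

```python
from fractions import Fraction

# Fair die: compute E[X] and E[X^2] exactly.
p = Fraction(1, 6)
EX = sum(k * p for k in range(1, 7))
EX2 = sum(k * k * p for k in range(1, 7))

# Shortcut: Var(X) = E[X^2] - (E[X])^2 = 91/6 - 49/4 = 35/12.
var = EX2 - EX**2
assert var == Fraction(35, 12)

# Same result from the defining formula E[(X - mu)^2].
assert sum((k - EX) ** 2 * p for k in range(1, 7)) == var

# The standard deviation is sqrt(35/12), roughly 1.70.
assert abs(float(var) ** 0.5 - 1.70) < 0.01
```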
PROPERTY: If X and Y are independent then Cov(X, Y) = 0.

PROOF: We have already shown (with μX ≡ E[X] and μY ≡ E[Y]) that

    Cov(X, Y) ≡ E[(X − μX)(Y − μY)] = E[XY] − E[X] E[Y],

and that if X and Y are independent then

    E[XY] = E[X] E[Y],

from which the result follows.

EXERCISE: (already used earlier ···)

    Probability mass function pX,Y(x, y)

             y=6    y=8    y=10  | pX(x)
    x=1      1/5     0     1/5   |  2/5
    x=2       0     1/5     0    |  1/5
    x=3      1/5     0     1/5   |  2/5
    -----------------------------------
    pY(y)    2/5    1/5    2/5   |   1

• Show that E[X] = 2, E[Y] = 8, E[XY] = 16, so that

    Cov(X, Y) = E[XY] − E[X] E[Y] = 0,

while X and Y are not independent.
• Thus if Cov(X, Y) = 0, then it does not necessarily follow that X and Y are independent!

PROPERTY: If X and Y are independent then

    Var(X + Y) = Var(X) + Var(Y).

PROOF: We have already shown (in an exercise!) that

    Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y),

and that if X and Y are independent then

    Cov(X, Y) = 0,

from which the result follows.

EXERCISE: Compute

    E[X], E[Y], E[X²], E[Y²], E[XY], Var(X), Var(Y), Cov(X, Y)

for

    Joint probability mass function pX,Y(x, y)

             y=0    y=1    y=2    y=3  | pX(x)
    x=0      1/8     0      0      0   |  1/8
    x=1       0     1/8    1/8    1/8  |  3/8
    x=2       0     2/8    1/8     0   |  3/8
    x=3       0     1/8     0      0   |  1/8
    -----------------------------------------
    pY(y)    1/8    4/8    2/8    1/8  |   1

EXERCISE: Compute

    E[X], E[Y], E[X²], E[Y²], E[XY], Var(X), Var(Y), Cov(X, Y)

for

    Joint probability mass function pX,Y(x, y)

             y=1    y=2    y=3   | pX(x)
    x=1      1/3    1/12   1/12  |  1/2
    x=2      2/9    1/18   1/18  |  1/3
    x=3      1/9    1/36   1/36  |  1/6
    -----------------------------------
    pY(y)    2/3    1/6    1/6   |   1
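The "covariance zero but not independent" exercise can be checked end to end: the reconstructed five-entry table gives E[X] = 2, E[Y] = 8, E[XY] = 16, hence Cov(X, Y) = 0, while a single zero cell already rules out independence. A sketch (entries as reconstructed in the table above; the dictionary layout is ours):

```python
from fractions import Fraction as Fr

# Joint pmf from the exercise (rows x = 1, 2, 3; columns y = 6, 8, 10).
p = {
    (1, 6): Fr(1, 5), (1, 8): Fr(0),    (1, 10): Fr(1, 5),
    (2, 6): Fr(0),    (2, 8): Fr(1, 5), (2, 10): Fr(0),
    (3, 6): Fr(1, 5), (3, 8): Fr(0),    (3, 10): Fr(1, 5),
}
EX = sum(x * v for (x, _), v in p.items())
EY = sum(y * v for (_, y), v in p.items())
EXY = sum(x * y * v for (x, y), v in p.items())

# E[X] = 2, E[Y] = 8, E[XY] = 16, so Cov(X, Y) = E[XY] - E[X]E[Y] = 0 ...
assert (EX, EY, EXY) == (2, 8, 16)
assert EXY - EX * EY == 0

# ... yet X and Y are NOT independent: p(1, 8) = 0 while pX(1) * pY(8) = 2/25.
pX1 = sum(v for (x, _), v in p.items() if x == 1)
pY8 = sum(v for (_, y), v in p.items() if y == 8)
assert p[(1, 8)] == 0 and pX1 * pY8 == Fr(2, 25)
```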