How to calculate Joint, Marginal, and Conditional Probabilities using excel Dcarroll

How to: 8 steps to Calculate, Joint, Marginal, and Conditional Distribution using Microsoft Excel by Dereka Carroll

 

Step one: Create a plot comparing the forecasts (x-axis) and observations (y-axis). This is done by clicking the “create chart” symbol, then choosing the scatter plot option, click “next,” then click on the “series” tab to select data for each axis.

 

 

Step Two: Create a contingency table of the frequency of forecasts and observations in the suggested categories. In the example below you have a 4x 4 contingency table with the categories 10-19, 20-29, 30-39, 40-49 for forecast (x-axis; denoted (f)) and observed (y-axis; denoted (x)) temperature. Using the data you were provided, record how many observed temperatures between 10-19 was forecast 10-19 or 20-29 etc. For example a temperature between 10-19 was forecast once when a temperature between 10-19 was observed, so you put 1 down for this category, Example:

                  

Contingency chart of observed and forecast temperature values

           C                D             E             F

 

10-19

20-29

30-39

40-49

10-19

1

0

0

0

20-29

0

7

3

0

30-39

0

10

12

5

40-49

0

0

6

36

                                    

                       Note: From here on forecasts (f) are on the x-axis and observations (x) are on the y-axis

 

Step three: Using this same contingency table calculate the joint distribution of the forecasts and observations. This can be simplified by typing the equation in a new 4x 4 contingency table. To insert equation select the cell press “=” then type in equation; for simplicity after “=,” select the cell number from the previous contingency table followed by the “/” and then the total number of forecasts and observations. i.e. =C1/196).  

 

 

Joint distribution of forecasts and observations p(f,x)

 

10-19

20-29

30-39

40-49

10-19

0.005102

0

0

0

20-29

0

0.035714

0.015306

0

30-39

0

0.05102

0.061224

0.02551

40-49

0

0

0.030612

0.183673

                                                                                                                  

Step four: Calculate the marginal distribution for forecasts. This is done by manually counting the frequencies of forecasts in each category, example:

 

 

 

General count of forecast temperature

 

             H                 I                 J                  K

 

10-19

20-29

30-39

40-49

Forecasts (f)

1

17

21

45

 

then, using the marginal distribution equation insert equation by selecting the cell and press “=” then type in equation; for simplicity after “=,” select the cell number from the previous contingency table followed by the “/” and then the total number of forecasts i.e. =H1/196 but using the general count for forecast temperature above. example:

 

Marginal distribution of forecasts p(f)

                                                            10-19

20-29

30-39

    40-49

                                                               0.005102

0.0867347

0.10714286

0.22959184

 

Step five: Repeat step four, but for observation data.

 

Step six: Factor the joint distribution and compute the conditional probability of the forecast given the observation. This equation is different. Conditional probability is the joint probability divided by the marginal probability. In this case, you will divide the joint probabilities created in Step three, and divide by the marginal probabilities of observations that you created in step 5.  Insert equation by selecting the cell and press “=” then type in equation; for simplicity after “=,” select the cell number from the joint distribution contingency table followed by the “/” and then the marginal distribution of that category for observations i.e. =L2/ P1.

Example:

These values:

Joint distribution of forecasts and observations p(f,x)

                       L                M             N                O

           

10-19

20-29

30-39

40-49

10-19

0.005102

0

0

0

20-29

0

0.035714

0.015306

0

30-39

0

0.05102

0.061224

0.02551

40-49

0

0

0.030612

0.18367

 

Dived by these values:

 

              Marginal distribution of observations p(x)             

P                  Q                     R                     S

      10-19

      20-29

      30-39

        40-49

0.005102

0.0510204

0.1377551

0.26530612

 

Will give you the conditional probability of f given x:

 

Conditional distribution of forecast temperature p(f|x)

 

 

10 -19

20-29

30-39

40-49

10 to 19

1

0

0

0

20-29

0

0.7

0.11111111

0

30-39

0

1

0.44444444

0.0961538

40-49

0

0

0.22222222

0.6923077

 

Note: The entire column of 10-19 categories should be divided by the corresponding marginal distribution of the 10-19 categories.

 

Step seven: Factor the joint distribution and compute the conditional probability of the observations given the forecast. See step six, except you will divide by the marginal probability of f.

 

Step eight: Compute the measure of forecast bias by subtracting the average observed value from the average forecast value. This is fairly simple to do. First highlight the column of forecasts or observations and click the down arrow on “∑” symbol on the tool bar and go down to “average.” This will print the average at the end of that row. Repeat the step for observations or forecasts, whichever you did not use initially. After getting both averages, click a free cell and put “=” then click on the cells with the calculated “avg. observation” then “minus sign“ “avg. forecast.” Ex. =avgx#-avgf#. This is your forecast bias.

 


Comments