The concept of discrimination
When looking at reliability tables, the sample was always divided into bins according to the forecast value.
Now, suppose we were to divide the verification sample into bins according to the observation.
Since there are only two possible observation values ("the event occurred", represented by 1, and "the event didn’t occur",
represented by 0), the sample is divided into two groups, "occurrences" and "non-occurrences".
Then the distribution of the probability forecast values for each group can be plotted and compared.
These are called conditional distributions, because each is conditioned on the observed outcome, and the graph of the two distributions is sometimes called a likelihood diagram.
One would hope that higher forecast probabilities would tend to be associated with occurrences of the event and
lower forecast probabilities would be associated with non-occurrences.
If so, then the two conditional distributions would be well separated, with minimal overlap.
The separation of these distributions is a measure of discrimination, and is an indicator of how useful the
probability forecasts are for making decisions.
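As a minimal sketch of the procedure just described, the following Python splits a verification sample by observation and tallies the forecast values in each group. The function names and the ten forecast/observation pairs are made up for illustration, and the difference between the two conditional means is only one simple way to summarise the separation.

```python
from collections import Counter

def conditional_distributions(forecasts, observations):
    """Tally forecast values separately for occurrences (1) and non-occurrences (0)."""
    occurred = Counter()
    not_occurred = Counter()
    for p, o in zip(forecasts, observations):
        if o == 1:
            occurred[p] += 1
        else:
            not_occurred[p] += 1
    return occurred, not_occurred

def mean_forecast(counts):
    """Mean forecast probability of one conditional distribution."""
    total = sum(counts.values())
    return sum(p * n for p, n in counts.items()) / total

# Hypothetical sample: forecast probabilities and matching observations.
forecasts    = [0.0, 0.0, 0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 0.7, 0.1]
observations = [0,   0,   0,   0,   1,   1,   1,   1,   0,   0]

occurred, not_occurred = conditional_distributions(forecasts, observations)

# A larger gap between the conditional means suggests better discrimination.
separation = mean_forecast(occurred) - mean_forecast(not_occurred)
print("occurrences:    ", sorted(occurred.items()))
print("non-occurrences:", sorted(not_occurred.items()))
print("difference of conditional means:", round(separation, 3))
```

Plotting the two tallies side by side as histograms would give the likelihood diagram for this sample.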
It is important to note here that discrimination is completely separate from reliability;
reliable probability forecasts can be non-discriminating, and vice versa.
That is why both assessments are needed for a full evaluation of a probability forecast system.
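To see how a forecast can be reliable without discriminating, consider a forecaster who always issues the same probability, equal to how often the event actually occurs. The sketch below uses an invented sample of ten cases and hypothetical helper names; it is an illustration under those assumptions, not a verification procedure taken from this module.

```python
def observed_frequency(forecasts, observations, value):
    """Relative frequency of the event among cases where `value` was forecast."""
    outcomes = [o for p, o in zip(forecasts, observations) if p == value]
    return sum(outcomes) / len(outcomes)

def mean_forecast_given(forecasts, observations, outcome):
    """Mean forecast probability among cases with the given observed outcome."""
    values = [p for p, o in zip(forecasts, observations) if o == outcome]
    return sum(values) / len(values)

# Hypothetical sample: the forecast is always 0.3 and the event occurs 3 times in 10.
forecasts = [0.3] * 10
observations = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

# Reliability: the event occurs about as often as the forecast value says.
frequency = observed_frequency(forecasts, observations, 0.3)

# Discrimination: both conditional distributions sit at the same value,
# so the difference between their means is essentially zero.
gap = (mean_forecast_given(forecasts, observations, 1)
       - mean_forecast_given(forecasts, observations, 0))

print("observed frequency when 0.3 was forecast:", frequency)
print("difference of conditional means:", round(gap, 6))
```

In this sample the forecasts are reliable, yet they say nothing about which cases will turn out to be occurrences, which is why the two assessments have to be made separately.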
Now, check your understanding of discrimination via the following exercise.
Below are two histogram graphs for 24 h and 48 h probability of precipitation (POP) forecasts for Tampere, Finland.
Each graph shows the two conditional distributions of forecast probabilities given the occurrence or non-occurrence of precipitation.
Answer the following questions about these graphs:
1. At 24 h, there were more non-occurrences than occurrences of precipitation
when the forecaster predicted a probability of 0.7. True or false?
Correct. According to the left-hand graph, there are 18 non-occurrences of precipitation and
only 16 occurrences of precipitation associated with forecasts of 0.7.
2. Which of the 24 h and 48 h forecasts show evidence of discrimination between
rain and no rain events?
Correct. Although there is overlap of the two distributions in both cases,
it is clear that the means are quite well-separated.
3. For which forecasts is discrimination the best? What evidence do you see?
Correct. The main evidence is the much higher number of forecasts of 0 preceding non-occurrences at 24 h,
the greater number of non-occurrences following forecasts of 0.8 at 48 h,
and the greater number of occurrences following low-probability forecasts at 48 h.