Sophisticated data analysis will help you spot patterns, trends and relationships in your results. Data analysis can be qualitative and/or quantitative, and may include statistical tests. An example of a statistical test is outlined below.

## Recurrence interval

Reports of flooding sometimes use the phrase “a 100-year flood”. This does not mean that such a flood happens only once every 100 years. Instead it means that, on the basis of past records, a flood of that size is expected to occur on average once in 100 years, which is a 1 in 100 (1%) chance in any given year.

A recurrence interval describes how often a river is expected to reach a particular level of flow.

$$\mathsf{recurrence\;interval = \frac{(N+1)}{M}}$$
• $$N$$ = number of years for which data has been collected
• $$M$$ = rank (known as the magnitude number)

### Worked example

Peak flow data from 1991-92 to 2013-14 was obtained for the River Thames at Eynsham, Oxfordshire from the National River Flow Archive.

| Hydrological year | Peak flow (cumecs) |
| --- | --- |
| 1991-1992 | 33.07 |
| 1992-1993 | 81.635 |
| 1993-1994 | 78.484 |
| 1994-1995 | 79.532 |
| 1995-1996 | 74.867 |
| 1996-1997 | 40.358 |
| 1997-1998 | 72.413 |
| 1998-1999 | 83.066 |
| 1999-2000 | 77.624 |
| 2000-2001 | 91.572 |
| 2001-2002 | 62.028 |
| 2002-2003 | 91.796 |
| 2003-2004 | 55 |
| 2004-2005 | 50.9 |
| 2005-2006 | 49 |
| 2006-2007 | 102.054 |
| 2007-2008 | 87.587 |
| 2008-2009 | 75.795 |
| 2009-2010 | 60.135 |
| 2010-2011 | 51.896 |
| 2011-2012 | 66.552 |
| 2012-2013 | 97.989 |
| 2013-2014 | 107.355 |

(a) Rank the peak flow column from highest (1) to lowest (23)

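| Hydrological year | Peak flow (cumecs) | Rank (magnitude number) |
| --- | --- | --- |
| 1991-1992 | 33.07 | 23 |
| 1992-1993 | 81.635 | 8 |
| 1993-1994 | 78.484 | 10 |
| 1994-1995 | 79.532 | 9 |
| 1995-1996 | 74.867 | 13 |
| 1996-1997 | 40.358 | 22 |
| 1997-1998 | 72.413 | 14 |
| 1998-1999 | 83.066 | 7 |
| 1999-2000 | 77.624 | 11 |
| 2000-2001 | 91.572 | 5 |
| 2001-2002 | 62.028 | 16 |
| 2002-2003 | 91.796 | 4 |
| 2003-2004 | 55 | 18 |
| 2004-2005 | 50.9 | 20 |
| 2005-2006 | 49 | 21 |
| 2006-2007 | 102.054 | 2 |
| 2007-2008 | 87.587 | 6 |
| 2008-2009 | 75.795 | 12 |
| 2009-2010 | 60.135 | 17 |
| 2010-2011 | 51.896 | 19 |
| 2011-2012 | 66.552 | 15 |
| 2012-2013 | 97.989 | 3 |
| 2013-2014 | 107.355 | 1 |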

(b) Calculate the recurrence interval for each peak flow

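$$\mathsf{recurrence\;interval = \frac{(N+1)}{M}}$$

| Hydrological year | Peak flow (cumecs) | Rank (magnitude number) | Recurrence interval |
| --- | --- | --- | --- |
| 1991-1992 | 33.07 | 23 | 1.04 |
| 1992-1993 | 81.635 | 8 | 3.00 |
| 1993-1994 | 78.484 | 10 | 2.40 |
| 1994-1995 | 79.532 | 9 | 2.67 |
| 1995-1996 | 74.867 | 13 | 1.85 |
| 1996-1997 | 40.358 | 22 | 1.09 |
| 1997-1998 | 72.413 | 14 | 1.71 |
| 1998-1999 | 83.066 | 7 | 3.43 |
| 1999-2000 | 77.624 | 11 | 2.18 |
| 2000-2001 | 91.572 | 5 | 4.80 |
| 2001-2002 | 62.028 | 16 | 1.50 |
| 2002-2003 | 91.796 | 4 | 6.00 |
| 2003-2004 | 55 | 18 | 1.33 |
| 2004-2005 | 50.9 | 20 | 1.20 |
| 2005-2006 | 49 | 21 | 1.14 |
| 2006-2007 | 102.054 | 2 | 12.00 |
| 2007-2008 | 87.587 | 6 | 4.00 |
| 2008-2009 | 75.795 | 12 | 2.00 |
| 2009-2010 | 60.135 | 17 | 1.41 |
| 2010-2011 | 51.896 | 19 | 1.26 |
| 2011-2012 | 66.552 | 15 | 1.60 |
| 2012-2013 | 97.989 | 3 | 8.00 |
| 2013-2014 | 107.355 | 1 | 24.00 |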
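The ranking and recurrence interval steps can also be sketched in a few lines of Python. This is a minimal illustration rather than part of the method itself: the names `peak_flows` and `recurrence_intervals` are made up for this example, and it assumes that no two years share the same peak flow (tied values would need a rule for sharing ranks).

```python
# Peak flows (cumecs) for the River Thames at Eynsham, copied from the table.
peak_flows = {
    "1991-1992": 33.07, "1992-1993": 81.635, "1993-1994": 78.484,
    "1994-1995": 79.532, "1995-1996": 74.867, "1996-1997": 40.358,
    "1997-1998": 72.413, "1998-1999": 83.066, "1999-2000": 77.624,
    "2000-2001": 91.572, "2001-2002": 62.028, "2002-2003": 91.796,
    "2003-2004": 55, "2004-2005": 50.9, "2005-2006": 49,
    "2006-2007": 102.054, "2007-2008": 87.587, "2008-2009": 75.795,
    "2009-2010": 60.135, "2010-2011": 51.896, "2011-2012": 66.552,
    "2012-2013": 97.989, "2013-2014": 107.355,
}


def recurrence_intervals(flows):
    """Return {year: (rank, recurrence interval)} using (N + 1) / M."""
    n = len(flows)  # N = number of years of data
    # Order the years from highest peak flow (rank 1) to lowest (rank N).
    ordered = sorted(flows, key=flows.get, reverse=True)
    return {
        year: (rank, (n + 1) / rank)
        for rank, year in enumerate(ordered, start=1)
    }


results = recurrence_intervals(peak_flows)
for year, flow in peak_flows.items():
    rank, interval = results[year]
    print(f"{year}  {flow:>8}  rank {rank:>2}  interval {interval:.2f}")
```

Keeping the rank and the interval together for each year makes it easy to check individual rows against the table.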

## Flood frequency curve

This is a graph of river flow on the y-axis plotted against recurrence interval on the x-axis. The x-axis is plotted on a logarithmic scale.

For example, the peak flow data from 1991-92 to 2013-14 for the River Thames at Eynsham, Oxfordshire (obtained from the National River Flow Archive) can be plotted in this way.
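On a logarithmic x-axis, each point is positioned by the logarithm of its recurrence interval, so that 1, 10 and 100 years are equally spaced. The sketch below prepares the points for the Eynsham data under that assumption. The names `flows` and `curve_points` are illustrative only, and the drawing itself is left to whichever graphing tool you prefer.

```python
import math

# Peak flows (cumecs), River Thames at Eynsham, 1991-92 to 2013-14.
flows = [
    33.07, 81.635, 78.484, 79.532, 74.867, 40.358, 72.413, 83.066,
    77.624, 91.572, 62.028, 91.796, 55, 50.9, 49, 102.054, 87.587,
    75.795, 60.135, 51.896, 66.552, 97.989, 107.355,
]

n = len(flows)  # N = number of years of data
curve_points = []
for rank, flow in enumerate(sorted(flows, reverse=True), start=1):
    interval = (n + 1) / rank          # recurrence interval in years
    x_position = math.log10(interval)  # where the point sits on a log axis
    curve_points.append((x_position, flow))

curve_points.sort()  # left to right along the x-axis
for x_position, flow in curve_points:
    print(f"log10(interval) = {x_position:.3f}, peak flow = {flow} cumecs")
```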

## Chi-squared test

Chi squared is a statistical test that is used to test for a significant difference, goodness of fit or association between observed and expected values.

$$\chi^2 = \sum \frac{(O-E)^2}{E}$$

The chi squared test can only be used if:

• the data are in the form of frequencies in a number of categories (i.e. nominal data).
• there are more than 20 observations in total
• the observations are independent: one observation does not affect another

There are 3 steps to take when using the chi squared test.

### Step 1. State the null hypothesis

There is no significant association between _______ and _______

### Step 2. Calculate the chi squared statistic

$$\chi^2 = \sum \frac{(O-E)^2}{E}$$
• $$\chi^2$$ = chi squared statistic
• $$O$$ = Observed values
• $$E$$ = Expected values

### Step 3. Test the significance of the result

Compare your calculated value of $$\chi^2$$ against the critical value for $$\chi^2$$ at a confidence level of 95% (a significance level of $$p = 0.05$$) and the appropriate degrees of freedom.

$$\mathsf{Degrees\;of\;freedom = (number\;of\;rows - 1) \times (number\;of\;columns - 1)}$$

If chi squared is equal to or greater than the critical value, REJECT the null hypothesis. There is a SIGNIFICANT difference between the observed and expected values.

If chi squared is less than the critical value, ACCEPT the null hypothesis. There is NO SIGNIFICANT difference between the observed and expected values.
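The decision rule can be sketched in Python as follows. This is only an illustration: the function names are invented here, and the lookup table holds the standard critical values of chi squared at $$p = 0.05$$ for 1 to 6 degrees of freedom, so larger tables would need more entries.

```python
# Critical values of chi squared at p = 0.05, keyed by degrees of freedom.
CRITICAL_VALUES_P05 = {
    1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592,
}


def degrees_of_freedom(rows, columns):
    """(number of rows - 1) x (number of columns - 1)."""
    return (rows - 1) * (columns - 1)


def reject_null(chi_squared, rows, columns):
    """True if chi squared is equal to or greater than the critical value."""
    df = degrees_of_freedom(rows, columns)
    return chi_squared >= CRITICAL_VALUES_P05[df]


# Hypothetical values, chosen only to show both outcomes for a 3 x 2 table.
print(degrees_of_freedom(3, 2))  # 2
print(reject_null(6.5, 3, 2))    # True: 6.5 is above 5.991
print(reject_null(4.0, 3, 2))    # False: 4.0 is below 5.991
```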

### Worked example

A Geography student is investigating whether people’s previous experience of floods has affected their preparedness for future floods. She surveyed householders in a coastal town, taking a stratified sample from householders that were flooded in winter 2013/4 and householders that were not flooded. One of the questions she asked was

Here are the results. Geographers call them the Observed Values.

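| Response | Flooded | Not flooded | Total |
| --- | --- | --- | --- |
| Strongly agree | 5 | 8 | 13 |
| Agree | 7 | 16 | 23 |
| Neither | 5 | 5 | 10 |
| Disagree | 6 | 5 | 11 |
| Strongly disagree | 2 | 1 | 3 |
| SUM | 25 | 35 | 60 |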

### Step 1. State the null hypothesis

There is no significant association between past experience of flooding and preparedness for floods in the future.

### Step 2. Calculate the chi squared statistic

It is best to break this down into a number of smaller steps.

(a) Calculate the Expected Values using the formula

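$$\mathsf{Expected\;value = \frac{(row\;total\;\times \;column\;total)}{grand\;total}}$$

| Response | Flooded $$O$$ | Flooded $$E$$ | Not flooded $$O$$ | Not flooded $$E$$ |
| --- | --- | --- | --- | --- |
| Strongly agree | 5 | 5.4 | 8 | 7.6 |
| Agree | 7 | 9.6 | 16 | 13.4 |
| Neither | 5 | 4.2 | 5 | 5.8 |
| Disagree | 6 | 4.6 | 5 | 6.4 |
| Strongly disagree | 2 | 1.3 | 1 | 1.8 |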
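The expected values can be checked with a short Python sketch. The names `observed` and `expected_values` are illustrative, and each tuple is assumed to hold the counts in the order (Flooded, Not flooded). The output is shown to two decimal places, whereas the table rounds to one.

```python
# Observed counts per response: (Flooded, Not flooded).
observed = {
    "Strongly agree": (5, 8),
    "Agree": (7, 16),
    "Neither": (5, 5),
    "Disagree": (6, 5),
    "Strongly disagree": (2, 1),
}


def expected_values(table):
    """Expected value = (row total x column total) / grand total."""
    row_totals = {label: sum(counts) for label, counts in table.items()}
    column_totals = [sum(column) for column in zip(*table.values())]
    grand_total = sum(column_totals)
    return {
        label: tuple(
            row_totals[label] * column_total / grand_total
            for column_total in column_totals
        )
        for label in table
    }


expected = expected_values(observed)
for label, values in expected.items():
    print(label, [f"{value:.2f}" for value in values])
```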

(b) Calculate $$(O-E)$$, $$(O-E)^2$$ and $$(O-E)^2/E$$

Observed ($$O$$) and Expected ($$E$$) values have been copied from the table above.

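Flooded

| $$O$$ | $$E$$ | $$(O-E)$$ | $$(O-E)^2$$ | $$(O-E)^2/E$$ |
| --- | --- | --- | --- | --- |
| 5 | 5.4 | -0.4 | 0.16 | 0.03 |
| 7 | 9.6 | -2.6 | 6.76 | 0.70 |
| 5 | 4.2 | 0.8 | 0.64 | 0.15 |
| 6 | 4.6 | 1.4 | 1.96 | 0.43 |
| 2 | 1.3 | 0.7 | 0.49 | 0.38 |

Not flooded

| $$O$$ | $$E$$ | $$(O-E)$$ | $$(O-E)^2$$ | $$(O-E)^2/E$$ |
| --- | --- | --- | --- | --- |
| 8 | 7.6 | 0.4 | 0.16 | 0.02 |
| 16 | 13.4 | 2.6 | 6.76 | 0.50 |
| 5 | 5.8 | -0.8 | 0.64 | 0.11 |
| 5 | 6.4 | -1.4 | 1.96 | 0.31 |
| 1 | 1.8 | -0.8 | 0.64 | 0.36 |

(c) Find the sum of the $$(O-E)^2/E$$ column

$$\chi^2 = \sum \frac{(O-E)^2}{E}$$

$$\chi^2 = 0.03+0.70+0.15+0.43+0.38+0.02+0.50+0.11+0.31+0.36$$

$$\chi^2 = 2.99$$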
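Steps (b) and (c) can be reproduced with a few lines of Python. In this sketch the observed and expected values are typed in from the tables (Flooded first, then Not flooded), so it mirrors the hand calculation with expected values rounded to one decimal place. Using unrounded expected values would give a slightly different total. The function name `chi_squared` is an illustrative choice.

```python
# Observed (O) and expected (E) values: Flooded rows first, then Not flooded.
observed = [5, 7, 5, 6, 2, 8, 16, 5, 5, 1]
expected = [5.4, 9.6, 4.2, 4.6, 1.3, 7.6, 13.4, 5.8, 6.4, 1.8]


def chi_squared(observed, expected):
    """Return the chi squared statistic and each (O - E)^2 / E contribution."""
    contributions = [
        (o - e) ** 2 / e for o, e in zip(observed, expected)
    ]
    return sum(contributions), contributions


total, contributions = chi_squared(observed, expected)
print([round(c, 2) for c in contributions])  # the (O - E)^2 / E column
print(round(total, 2))                       # the chi squared statistic
```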

### Step 3. Test the significance of the result

Calculate degrees of freedom

$$\mathsf{Degrees\;of\;freedom = (number\;of\;rows - 1) \times (number\;of\;columns - 1)}$$

In this example $$\mathsf{Degrees\;of\;freedom} = (5-1) \times (2-1) = 4$$

Choose a significance level, e.g. $$p=0.05$$. This means there is no more than a 5% probability that a result this extreme would arise by chance alone if the null hypothesis were true.

Compare the result with the critical value from a table of chi squared critical values. If the calculated value is equal to or greater than the critical value, the null hypothesis must be rejected.

At 4 degrees of freedom and $$p=0.05$$, the critical value is $$9.49$$.

Since our calculated value of $$\chi^2$$ is less than the critical value of 9.49, the null hypothesis is accepted.

There is no significant association between past experience of flooding and preparedness for floods in the future.
