Glossary
- Absolute value
For all real numbers a, |a| (pronounced ‘absolute value of a’) is the non-negative magnitude of a.
So |6|=6, |100|=100, |-6|=6, |-100|=100
So |a|=a if a is positive, |a|=-a if a is negative, |a|=0 if a=0.
An intuitive way of looking at absolute value is to consider that the absolute value of a real number is its distance from zero on the number line. For example, 3 and -3 are each 3 units from zero, so |3|=3 and |-3|=3.
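The piecewise rule for |a| can be checked with a short computation. This is a minimal Python sketch; the function name `absolute_value` is made up for illustration, and Python's built-in `abs()` is used only as a cross-check.

```python
def absolute_value(a):
    """Return |a| using the piecewise rule: a if a is positive or zero, otherwise -a."""
    return a if a >= 0 else -a

for a in (6, 100, -6, -100, 0):
    # The built-in abs() should agree with the piecewise rule.
    assert absolute_value(a) == abs(a)
    print(f"|{a}| = {absolute_value(a)}")
```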
- Addition
- Whole numbers: Addition is an operation of composition. On whole numbers, addition may be described as the joining of disjoint sets. In a physical model it is represented by the bringing together of objects. Initially it involves counting the objects in the joined set to determine the sum, or result, of the operation. When the basic addition facts are known, more complex addition problems can be answered using additive strategies.
Addition is a binary operation, that is, it is an operation on two numbers.
Addition is commutative, that is, the order of the numbers does not change the answer. For example, 4+5 = 5+4.
Addition is associative, that is, the grouping of the numbers does not affect the answer. For example, (2+3)+5 = 2+(3+5).
Zero is the identity element for addition, because the addition of zero to a number does not change it.
- Fractions: Fractions may be added. If their denominators are the same then we can simply add the numerators to obtain the sum. For example, 2/9 + 4/9 = 6/9. If their denominators are not the same then we must choose equivalent fractions so that their denominators are the same. For example, to add 1/6 and 1/4 we must find equivalent fractions for 1/6 and 1/4 that have a common denominator. We could multiply the two denominators, 6 and 4, and that process would always give us a common denominator. However, we might observe that the least common multiple of 6 and 4 is actually 12.
1/6 = 2/12, 1/4 = 3/12, so 1/6 + 1/4 = 5/12.
- Decimals: Decimal fractions (commonly called decimals) may be added in the same way that whole numbers are, with care being taken to consider the position of the digits in the decimal. So, for example, tenths are added to tenths, hundredths to hundredths, etc. We can add the decimals because they represent fractions whose denominators are the powers of 10. For example, 2.4 = 2 + 4/10 and 3.56 = 3 + 5/10 + 6/100 so the 4/10 can be added to the 5/10 because they already have the common denominator of 10.
- Percentages: Percentages may be added as if they were whole numbers or decimals. For example, 8% + 23% = 31%, 2.6% + 3.1% + 120% = 125.7%. At an abstract level, percentages are numerals representing numbers and therefore, just like decimals, may be added. Care must be taken, however, because of the way that society uses percentages. One often refers to a percentage of something and that can lead to difficulties. For example, although it is true that 5% of 80 plus 10% of 80 is 15% of 80, it is not true that 5% of 80 plus 10% of 60 is 15% of either 80 or 60.
- Integers: Integers may be added by observing the following rule:
a + (-b) = a - b. For example, 4 + (-3) = 4 - 3 = 1.
This rule, and similar rules for subtraction, are best discovered using models, such as a black-and-white counters model, in which a white counter represents 1 and a black counter represents -1. The first thing to establish is that opposites cancel. So, for example, 1 + (-1) = 0.
The properties of addition outlined for whole numbers also apply to the addition of fractions, decimals, percentages and integers.
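The common-denominator method for adding fractions can be sketched in code. This is a minimal Python illustration, not a prescribed method: the function name `add_fractions` is made up, `math.lcm` assumes Python 3.9 or later, and the result is deliberately left unsimplified so that it matches the worked examples in this entry.

```python
from fractions import Fraction
from math import lcm  # assumes Python 3.9+

def add_fractions(n1, d1, n2, d2):
    """Add n1/d1 and n2/d2 by rewriting both over the least common denominator."""
    common = lcm(d1, d2)
    # Scale each numerator by the factor that turns its denominator into `common`.
    total = n1 * (common // d1) + n2 * (common // d2)
    return total, common  # numerator and denominator, not simplified

print(add_fractions(1, 6, 1, 4))  # 1/6 + 1/4 over the common denominator 12
print(add_fractions(2, 9, 4, 9))  # same denominators: just add the numerators

# Cross-check against the standard library's exact fraction arithmetic.
assert Fraction(*add_fractions(1, 6, 1, 4)) == Fraction(1, 6) + Fraction(1, 4)
```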
- Additive model (for time-series data)
A common approach to modelling time-series data (Y) in which it is assumed that the four components of a time series, namely the trend component (T), seasonal component (S), cyclical component (C) and irregular component (I), are added to form the values of the time series at each time period.
In an additive model the time series is expressed as: Y = T + S + C + I.
Curriculum achievement objectives reference
Statistical investigation: Level 8
- Additive strategies
Additive strategies are techniques used to solve addition problems from known facts. For example, we can change 9+6 into 10+5, so 9+6=15. Similarly, since most children learn the ‘doubles’ early on, 8+7 can be thought of as one more than 7+7. A more advanced additive strategy is the following: to find 47+38, shift 2 from the 47 to the 38 (i.e. partition 47 as 45+2). The problem then becomes 45+40, which can more easily be solved.
So the term ‘additive strategies’ involves the partitioning of numbers, that is the understanding that numbers can be ‘broken up’ and recombined as in the calculation of 47+38 above. It also involves methods of finding answers to subtractions such as 63-29. One strategy would be to subtract 30 from 63 to obtain 33 and then to add 1 because we have subtracted 1 too much. Another strategy would be to add 1 to each number and make the subtraction 64-30. That is effectively shifting both numbers along the number line one position, and hence the difference between them remains the same.
- Angle
An angle is the figure formed by two rays (or line segments) meeting at a point. The rays are the sides of the angle, while the point is its vertex. The size (or measure) of the angle is usually measured in degrees and is determined by the amount of rotation (or turn) about the vertex that would be required to move one side of the angle onto the other side. The size of the angle is often loosely referred to as the angle itself, e.g. "an angle of 60°."
- Angle properties of parallel lines
In Euclidean geometry, parallel lines are lines that lie in the same plane and do not intersect no matter how far they are extended. A transversal of two or more lines is a line that cuts across those lines. In the above diagram we have a transversal intersecting a pair of parallel lines. Properties and terminology of the angles created are:
- 1, 2, 7 and 8 are exterior angles. They are the angles outside the two parallel lines.
- 3, 4, 5 and 6 are interior angles. They are the angles between the two parallel lines.
- Corresponding angles are angles on the same side of the transversal and on the same side of the parallel lines. So 1 and 5 are corresponding angles, 2 and 6 are corresponding angles, 3 and 7 are corresponding angles, and 4 and 8 are corresponding angles. Corresponding angles are equal.
- Alternate interior angles are interior angles that are on opposite sides of the transversal. Thus 4 and 5 are alternate interior angles, and 3 and 6 are alternate interior angles. Alternate interior angles are equal.
- Alternate exterior angles are exterior angles that are on opposite sides of the transversal. Thus 2 and 7 are alternate exterior angles, and 1 and 8 are alternate exterior angles. Alternate exterior angles are equal.
- Opposite angles (or vertically opposite angles) are equal. 1 and 4 are opposite angles, 2 and 3 are opposite angles, 5 and 8 are opposite angles, and 6 and 7 are opposite angles.
- Angle properties of polygons
A polygon is a portion of a plane bounded by straight lines. If n is the number of sides of the polygon then the smallest value for n is 3, which is the triangle. The interior angles of a triangle add to 180°, a fact that can be easily shown by drawing a triangle on paper, tearing off the corners and putting the vertex angles together. For n = 4 we have a quadrilateral, which can be divided into two triangles, and so we see that the sum of the interior angles of a quadrilateral is 2 × 180°, or 360°. For n = 5 we have a pentagon, which can be divided into three triangles, so the sum of the interior angles of a pentagon is 3 × 180°, or 540°. Continuing in this manner we see that the sum of the interior angles of any n-sided polygon is (n-2) × 180°. This can also be written as (180n-360)° or (2n-4) right angles.
A regular polygon is a polygon whose sides are all congruent and whose angles are all equal. Hence for a regular polygon with n sides we can find the size of each interior angle by dividing the sum of the interior angles by n. A list of the regular polygons for n = 3, 4, 5, …, 12 with their names and interior angle size is given below.

| No. of sides | Name of regular polygon | Size of each angle (in degrees) |
| --- | --- | --- |
| 3 | Equilateral triangle | 60 (180÷3) |
| 4 | Square | 90 (360÷4) |
| 5 | Regular pentagon | 108 (540÷5) |
| 6 | Regular hexagon | 120 (720÷6) |
| 7 | Regular heptagon | 128.57 (900÷7) |
| 8 | Regular octagon | 135 (1080÷8) |
| 9 | Regular nonagon (or enneagon) | 140 (1260÷9) |
| 10 | Regular decagon | 144 (1440÷10) |
| 11 | Regular hendecagon | 147.27 (1620÷11) |
| 12 | Regular dodecagon | 150 (1800÷12) |

Note: If students have difficulty with the names of the polygons they could refer to them by their number of sides. For example, a pentagon could be referred to as a 5-gon, a hexagon as a 6-gon etc.
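The interior angle sizes listed for the regular polygons follow directly from the angle-sum rule. A minimal Python sketch that recomputes them is given here; the function names are made up for illustration.

```python
def interior_angle_sum(n):
    """Sum of the interior angles, in degrees, of an n-sided polygon."""
    if n < 3:
        raise ValueError("a polygon needs at least 3 sides")
    return (n - 2) * 180

def regular_interior_angle(n):
    """Size of each interior angle, in degrees, of a regular n-sided polygon."""
    return interior_angle_sum(n) / n

# Recompute the angle sum and angle size for n = 3, 4, ..., 12.
for n in range(3, 13):
    print(n, interior_angle_sum(n), round(regular_interior_angle(n), 2))
```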
- Angle properties relating to circles
The following are angle properties relating to circles:
- The angles on the same arc (or chord) of a circle are equal.
Angle A equals angle B because they are subtended by the same arc.
- The angle at the centre is twice the angle at the circumference subtended by the same arc.
α is twice the size of β.
- The angle subtended by a diameter at the circumference is a right angle.
- Opposite angles of a quadrilateral inscribed in a circle are supplementary, that is, they add to 180°.
So angles A and C add to 180° and angles B and D add to 180°.
- Antidifferentiation
Antidifferentiation is the reverse process of differentiation. Given a function, say f(x), it is the process of finding a function that, when differentiated, is equal to f(x); that is, it is the task of finding a function whose derivative is known. For example, if f(x) = 3x² then an antiderivative of f is x³. Another antiderivative of 3x² is x³+5.
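That two different functions can both be antiderivatives of the same function can be checked numerically. The sketch below is an illustration only: it approximates derivatives with a central difference (a numerical technique, not symbolic differentiation), and the names `derivative`, `F1` and `F2` are made up for the example.

```python
def derivative(func, x, h=1e-5):
    """Central-difference approximation to the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

def f(x):
    return 3 * x**2      # the function whose antiderivatives we want

def F1(x):
    return x**3          # one antiderivative of f

def F2(x):
    return x**3 + 5      # another antiderivative: the constant disappears on differentiating

for x in (-2.0, 0.5, 3.0):
    # Both candidates should have a derivative (approximately) equal to f(x).
    assert abs(derivative(F1, x) - f(x)) < 1e-6
    assert abs(derivative(F2, x) - f(x)) < 1e-6
    print(x, f(x), round(derivative(F1, x), 6), round(derivative(F2, x), 6))
```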
Antidifferentiation is also referred to as integration. The function to be integrated is referred to as the integrand, and the result of an integration is referred to as an integral. The indefinite integral of the function f is represented by ∫f(x)dx and is the set of all antiderivatives of f. So, for f(x) = 3x², the integral of f(x) with respect to x is x³+C, where C is a real number. That is, ∫f(x)dx = x³+C.
- Appropriate statistical variables
In statistics, variables are measurable or countable attributes such as height or the number of children in a family. The variables being measured in a statistical survey need to be clearly defined and measurable. Furthermore, the nature of the variable needs to be considered. For example, a variable (such as the heights of children) that leads to the collection of measurement data would not be suitable for curriculum levels 1 to 3.
- Area
Area is a measure of the size of a surface. It is a measure of a two-dimensional surface, measuring the size of a portion of a plane. It can also be used to measure the size of a curved surface. The basic SI unit of measurement of area is the square metre (m²), with square millimetre (mm²), square centimetre (cm²), and hectare (ha) also being used. (See SI measurement units)
- Areas of polygons
Areas of polygons can be explored using squared paper. A good sequence is to start with a square (diagram 1 below), then move to a rectangle (diagram 2 below), observing that the areas of those are simply the product of two non-parallel sides. The area of the non-rectangular parallelogram (diagram 3 below) is easily discovered by cutting a triangle from one side and joining it to the opposite side to create a rectangle. This shows that the area of a parallelogram is the product of the base and the vertical height. Cutting a parallelogram on one diagonal creates two congruent shapes and shows that the area of a triangle (diagram 4 below) is a half of the area of the associated parallelogram, that is, a half of the product of the base and the vertical height. Next, the area of a trapezium (diagram 5 below) can be found by cutting the non-parallel sides through the midpoints and rotating them to make a rectangle. This shows that the area of the trapezium is the product of the average length of the two parallel sides and the distance between them.
Areas of other polygons might be found by seeing them as combinations of the polygons mentioned above, or might require a trigonometric approach.
- Areas of rectangles, triangles, parallelograms etc.
See rectangles, triangles, parallelograms etc.
- Arithmetic sequence
An arithmetic sequence is a sequence in which there is a common difference between successive terms. For example, 1, 4, 7, 10, 13 is an arithmetic sequence because there is a common difference of 3 between successive terms.
If the first term is a and the common difference is d then the terms of the sequence can be written as a, a+d, a+2d, a+3d, … We can see that the nth term will be a + (n-1)d. The sum of the first n terms of the arithmetic series is:
a + (a+d) + (a+2d) + (a+3d) + … + [a + (n-1)d] = (n/2)[2a + (n-1)d]
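The nth-term and sum formulas can be checked against a direct term-by-term calculation. A minimal Python sketch follows, using the sequence 1, 4, 7, 10, 13 from this entry; the function names are made up for illustration.

```python
def nth_term(a, d, n):
    """nth term of the arithmetic sequence with first term a and common difference d."""
    return a + (n - 1) * d

def sum_first_n(a, d, n):
    """Sum of the first n terms, using (n/2)[2a + (n-1)d]."""
    # For whole-number a and d the product n * (2a + (n-1)d) is always even,
    # so integer division by 2 is exact.
    return n * (2 * a + (n - 1) * d) // 2

terms = [nth_term(1, 3, n) for n in range(1, 6)]
print(terms)                   # the first five terms
print(sum_first_n(1, 3, 5))    # sum from the formula
assert sum_first_n(1, 3, 5) == sum(terms)  # agrees with adding the terms directly
```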
This may be observed by realising that the sum of the coefficients of d is 1 + 2 + 3 + ... + (n-2) + (n-1). This sum may be found by pairing the first term with the last term, which gives a sum of n, the second term with the second-to-last term, which also gives a sum of n, and so on, and by realising that there are (n-1)/2 such pairs. The sum of the coefficients is therefore n(n-1)/2.
Hence the sum of the first n terms is na + (1/2)(n-1)nd, which can be written as (n/2)[2a + (n-1)d].
- Association
A connection between two variables. Such a connection may not be evident until the data are displayed. An association between two variables is said to exist if the connection evident in a data display is so strong that it could not be explained as only due to chance.
In particular, two numerical variables are said to have positive association if the values of one variable tend to increase as the values of the other variable increase. Also two numerical variables are said to have negative association if the values of one variable tend to decrease as the values of the other variable increase.
See: relationship
Curriculum achievement objectives references
Statistical investigation: Levels 4, 5, 6, (7), (8)
- Average
A term used in two different ways.
When used generally, an average is a number that is representative or typical of the centre of a set of numerical values. In this sense the number used could be the mean or the median. Sometimes the mode is used. This use of average has the same meaning as measure of centre.
When used precisely, the average is the number obtained by adding all values in a set of numerical values and then dividing this total by the number of values. This use of average has the same meaning as mean.
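The two uses of ‘average’ can be compared on a small data set. This is a minimal sketch using Python's standard `statistics` module; the data values are made up for illustration.

```python
import statistics

values = [2, 3, 3, 5, 7, 10]  # hypothetical data set

# The precise use of 'average': add all the values and divide by how many there are.
mean_by_hand = sum(values) / len(values)

print("mean  :", statistics.mean(values), "(by hand:", mean_by_hand, ")")
print("median:", statistics.median(values))  # middle of the ordered values
print("mode  :", statistics.mode(values))    # most frequently occurring value
```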
See: measure of centre, mean, median, mode
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
b
- Backwards counting sequence
A counting sequence is an ordering of the counting numbers such that the difference between any two successive numbers is constant. The basic backwards counting sequence is …, 5, 4, 3, 2, 1. An example of a backwards skip counting sequence is …, 50, 40, 30, 20, 10, as is …, 10, 8, 6, 4, 2.
- Bar Graph
There are two uses of bar graphs.
First, a graph for displaying the distribution of a category variable or whole-number variable in which equal-width bars represent each category or value. The length of each bar represents the frequency (or relative frequency) of each category or value. See Example 1 below.
Second, a graph for displaying bivariate data; one category variable and one numerical variable. Equal-width bars represent each category, with the length of each bar representing the value of the numerical variable for each category. See Example 2 below.
The bars may be drawn horizontally or vertically.
Bar graphs of the first type are useful for showing differences in frequency (or relative frequency) among categories and bar graphs of the second type are useful for showing differences in the values of the numerical variable among categories.
For category data in which the categories do not have a natural ordering it may be desirable to order the categories from most to least frequent or greatest to least value of the numerical variable.
Example 1
The number of days in a week that rain fell in Grey Lynn, Auckland, from Monday 2 January 2006 to Sunday 31 December 2006 is displayed on the bar graph below.
Example 2
World gold mine production for 2003 by country, based on official exports, is displayed on the bar graph below.

Alternatives: bar chart, bar plot, column graph (if the bars are vertical)
Curriculum achievement objectives references
Statistical investigation: Levels (2), (3), (4), (5), (6), (7), (8)
- Base ten numeration system
Our numeration system (the Hindu-Arabic system) is a code in which the value of a digit is determined not only by its face value but also by its place value, or in other words the position that it is in. The base of this number system is 10. That means that the value of a digit in each place in a numeral is ten times greater than the value the same digit would have were it in the place to the right of it. For example, in the numeral 333, the left-hand 3 is worth 300, the middle 3 is worth 30, and the right hand 3 is worth just 3. It is an additive numeration system so the whole numeral is worth 300+30+3.
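The way face value and place value combine can be shown by splitting a numeral into the value contributed by each digit. A minimal Python sketch follows, assuming a non-negative whole number; the function name `place_values` is made up for illustration.

```python
def place_values(numeral):
    """Split a non-negative whole number into the value contributed by each digit."""
    digits = str(numeral)
    # A digit that is k places from the right-hand end is worth digit * 10**k.
    return [int(d) * 10 ** (len(digits) - 1 - i) for i, d in enumerate(digits)]

parts = place_values(333)
print(parts)        # value of each 3, from left to right
print(sum(parts))   # the system is additive, so the parts sum to the numeral
```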
- Basic addition and subtraction facts
The basic facts of addition are those equations in which two single-digit numbers are combined by addition to give a sum. Hence they range from 0+0=0 to 9+9=18. For each basic addition fact there is a related basic subtraction fact, for example, 18-9=9. An understanding of the commutative property of addition almost halves the number of facts that need to be learned since 3+7 = 7+3 etc.
Addition and subtraction facts can be grouped into ‘families’ of facts, e.g., 5+4=9 so 4+5=9, 9-5=4 and 9-4=5.
- Basic multiplication and division facts
The basic facts of multiplication are those equations in which two single-digit numbers are combined by multiplication to give a product. Hence they range from 0×0=0 to 9×9=81. For each basic multiplication fact there is a related basic division fact, for example, 81÷9=9. An understanding of the commutative property of multiplication almost halves the number of facts that need to be learned since 3×7 = 7×3 etc.
Multiplication and division facts can be grouped into ‘families’ of facts, e.g., 5×4=20 so 4×5=20, 20÷5=4 and 20÷4=5.
- Bias
An influence that leads to results which are systematically less than (or greater than) the true value. For example, a biased sample is one in which the method used to create the sample would produce samples that are systematically unrepresentative of the population.
Note that random sampling can also produce an unrepresentative sample. This is not an example of bias because the random sampling process does not systematically produce unrepresentative samples and, if the process were repeated many times, the samples would balance out on average.
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
- Binomial distribution
A family of theoretical distributions that is useful as a model for some discrete random variables. Each distribution in this family gives the probability of obtaining a specified number of successes in a specified number of trials, under the following conditions:
• The number of trials, n, is fixed
• The trials are independent of each other
• Each trial has two outcomes; ‘success’ and ‘failure’
• The probability of success in a trial, π, is the same in each trial.
Each member of this family of distributions is uniquely identified by specifying n and π. As such, n and π are the parameters of the binomial distribution and the distribution is sometimes written as binomial(n, π).
Let random variable X represent the number of successes in n trials that satisfy the conditions stated above. The probability of x successes in n trials is calculated by:
P(X = x) = C(n, x) π^x (1 – π)^(n – x)
for x = 0, 1, 2, ..., n
where C(n, x) = n! / [x! (n – x)!] is the number of combinations of n objects taken x at a time.
Example
A graph of the probability function for the binomial distribution with n = 6 and π = 0.4 is shown below.
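The probabilities for this example can be computed directly from the binomial probability function, P(X = x) = (number of combinations of n objects taken x at a time) × π^x × (1 – π)^(n – x). A minimal Python sketch follows; it assumes Python 3.8 or later for `math.comb`, and the variable `p` stands for the success probability written as π in this entry (the name is chosen only to avoid confusion with the circle constant).

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial(n, p) random variable."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 6, 0.4  # the parameters used in the example
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(probabilities):
    print(x, round(prob, 4))

# The probabilities over x = 0, 1, ..., n should add to 1 (up to rounding error).
assert abs(sum(probabilities) - 1) < 1e-12
```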
Curriculum achievement objectives reference
Probability: Level 8
- Bivariate data
A pair of variables from a data set with at least two variables.
Example
Consider a data set consisting of the heights, ages, genders and eye colours of a class of Year 9 students. The two variables from the data set could be:
both numerical (height and age),
both category (gender and eye colour), or
one numerical and one category (height and gender, respectively).
Note: Part of a Level Eight achievement objective states “including linear regression for bivariate data”. This use of bivariate data implies that both variables are numerical (i.e., quantitative variables).
Curriculum achievement objectives references
Statistical investigation: Levels (3), (4), (5), (6), (7), 8
- Box and whisker plot
A graph for displaying the distribution of a numerical variable, usually a measurement variable.
Box and whisker plots are drawn in several different forms. All of them have a ‘box’ that extends from the lower quartile to the upper quartile, with a line or other marker drawn at the median. In the simplest form, one whisker is drawn from the upper quartile to the maximum value and the other whisker is drawn from the lower quartile to the minimum value.
Box and whisker plots are particularly useful for comparing the distribution of a numerical variable for two or more categories of a category variable by displaying side-by-side box and whisker plots on the same scale. Box and whisker plots are particularly useful when the number of values to be plotted is reasonably large.
Box and whisker plots may be drawn horizontally or vertically.
Example
The actual weights of random samples of 50 male and 50 female students enrolled in an introductory Statistics course at the University of Auckland are displayed on the box and whisker plot below.
Alternatives: box and whisker diagram, box and whisker graph, box plot
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
c
- Capacity
Capacity is a measure of the interior volume of a container. Hence it is a measure of how much a container can hold. It is measured in units of volume. (See SI measurement units)
- Cartesian plane
See Coordinate systems.
- Category data
Data in which the values can be organised into distinct groups. These distinct groups (or categories) must be chosen so they do not overlap and so that every value belongs to one and only one group, and there should be no doubt as to which one.
The term category data is used with two different meanings. The Curriculum uses a meaning that puts no restriction on whether or not the categories have a natural ordering. This use of category data has the same meaning as qualitative data. The other meaning restricts category data to categories which do not have a natural ordering.
Example
The eye colours of a class of Year 9 students.
Alternative: categorical data
See: qualitative data
Curriculum achievement objectives references
Statistical investigation: Levels 1, 2, 3, 4, (5), (6), (7), (8)
- Category variable
A property that may have different values for different individuals and for which these values can be organised into distinct groups. These distinct groups (or categories) must be chosen so they do not overlap and so that every value belongs to one and only one group, and there should be no doubt as to which one.
The term category variable is used with two different meanings. The Curriculum uses a meaning that puts no restriction on whether or not the categories have a natural ordering. This use of category variable has the same meaning as qualitative variable. The other meaning of category variable is restricted to categories which do not have a natural ordering.
Example
The eye colours of a class of Year 9 students.
Alternative: categorical variable
See: qualitative variable
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Causal-relationship claim
A statement that asserts that changes in a phenomenon (the response) are caused by differences in a received treatment or by differences in the value of another variable (an explanatory variable).
Such claims can be justified only if the observed phenomenon is a response from a well-designed and well-conducted experiment.
Curriculum achievement objectives reference
Statistical literacy: Level 8
- Central Limit theorem
The fact that the sampling distribution of the sample mean of a numerical variable becomes closer to the normal distribution as the sample size increases. The sample means are from random samples from some population.
This result applies regardless of the shape of the population distribution of the numerical variable.
The use of ‘central’ in this term is because there is a tendency for values of the sample mean to be closer to the ‘centre’ of the population distribution than individual values are. This tendency strengthens as the sample size increases.
The use of ‘limit’ in this term is because the closeness or approximation to the normal distribution improves as the sample size increases.
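Both tendencies can be seen in a small simulation. The sketch below is an illustration under stated assumptions, not part of the definition: it draws random samples from a right-skewed (exponential) population, which is an arbitrary choice, and reports how the spread and the skewness of the sample means change with the sample size. The function names, the number of replicates and the fixed seed are all made up for the example.

```python
import random
import statistics

def sample_means(sample_size, replicates=2000, seed=0):
    """Means of `replicates` random samples from a right-skewed (exponential) population."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(replicates)
    ]

def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution such as the normal."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

for n in (1, 5, 50):
    means = sample_means(n)
    print(f"n={n:2d}  spread={statistics.stdev(means):.3f}  skewness={skewness(means):.2f}")
```

As the sample size grows, the sample means cluster more tightly around the population mean and their distribution becomes more symmetric, which is the behaviour the theorem describes.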
Curriculum achievement objectives reference
Statistical investigation: Level 8
- Centred moving average
See: moving mean
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Chance
A concept that applies to situations that have a number of possible outcomes, none of which is certain to occur when a trial of the situation is performed.
Two examples of situations that involve elements of chance follow.
Example 1
A person will be selected and their eye colour recorded.
Example 2
Two dice will be rolled and the numbers on each die recorded.
Curriculum achievement objectives references
Probability: All levels
- Charts
A chart is a table containing data. The ability to read charts is a social requirement. Many charts, such as tide charts or weather charts, contain information that can be transferred to other forms of representation such as a time series graph.
- Circle
A circle of radius r units and centre P is the set of points in a plane whose distance from P is r units. The length of the circumference of a circle is π × d, where d is the diameter of the circle and π is the ratio of the circumference to the diameter. (See Perimeters of circles) The area of a circle of radius r is πr² (i.e. π × r × r). For example, suppose a circle has a radius of 3 cm; then its diameter is 6 cm, its area is 9π cm² and the length of its circumference is 6π cm. The area of a circle of radius r can be approximated by cutting the circle into (say) 16 congruent sectors and rearranging them to approximate a rectangle with sides of length πr and r, and hence area of πr².
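The worked example for a radius of 3 cm can be reproduced in a few lines. A minimal Python sketch follows; the function name `circle_measures` is made up for illustration.

```python
import math

def circle_measures(r):
    """Return (diameter, circumference, area) for a circle of radius r."""
    diameter = 2 * r
    circumference = math.pi * diameter
    area = math.pi * r**2
    return diameter, circumference, area

d, c, a = circle_measures(3)  # radius of 3 cm
print(d, c, a)

# The rearranged sectors approximate a rectangle with sides πr (half the circumference)
# and r, so the rectangle's area should equal the circle's area.
assert math.isclose((c / 2) * 3, a)
```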
- Circumference
The perimeter (or boundary) of a circle or an ellipse is called its circumference. The length of the circumference of a circle is πd where d is the diameter of the circle. See also Perimeters of circles.
- Class interval
One of the non-overlapping intervals into which the range of values of measurement data, and occasionally whole-number data, is divided. Each value in the distribution must be able to be classified into exactly one of these intervals.
Example 1 (Measurement data)
The number of hours of sunshine per week in Grey Lynn, Auckland, from Monday 2 January 2006 to Sunday 31 December 2006 is recorded in the frequency table below. The class intervals used to group the values of weekly hours of sunshine are listed in the first column of the table.

| Hours of sunshine | Number of weeks |
| --- | --- |
| 5 to less than 10 | 2 |
| 10 to less than 15 | 2 |
| 15 to less than 20 | 5 |
| 20 to less than 25 | 9 |
| 25 to less than 30 | 12 |
| 30 to less than 35 | 10 |
| 35 to less than 40 | 5 |
| 40 to less than 45 | 6 |
| 45 to less than 50 | 1 |
| Total | 52 |

Example 2 (Whole-number data)
Students enrolled in an introductory Statistics course at the University of Auckland were asked to complete an online questionnaire. One of the questions asked them to enter the number of countries they had visited, other than New Zealand. The class intervals used to group the values are listed in the first column of the table.

| Number of countries visited | Frequency |
| --- | --- |
| 0 – 4 | 446 |
| 5 – 9 | 172 |
| 10 – 14 | 69 |
| 15 – 19 | 19 |
| 20 – 24 | 14 |
| 25 – 29 | 4 |
| 30 – 34 | 3 |
| Total | 727 |

Alternatives: bin, class
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Cleaning data
The process of finding and correcting (or removing) errors in a data set in order to improve its quality.
Mistakes in data can arise in many ways such as:
• A respondent may interpret a question in a different way from that intended by the writer of the question.
• An experimenter may misread a measuring instrument.
• A data entry person may mistype a value.
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), (7), (8)
- Cluster (in a distribution of a numerical variable)
A distinct grouping of neighbouring values in a distribution of a numerical variable that occur noticeably more often than values on each side of these neighbouring values. If a distribution has two or more clusters then they will be separated by places where values are spread thinly or are absent.
In distributions with a small number of values or with values that are spread thinly, some values may appear to form small clusters. Such groupings may be due to natural variation (see sources of variation) and these groupings may not be apparent if the distribution had more values. Be cautious about commenting on small groupings in such distributions.
For the use of ‘cluster’ in cluster sampling see the description of cluster sampling.
Example 1
The number of hours of sunshine per week in Grey Lynn, Auckland, from Monday 2 January 2006 to Sunday 31 December 2006 is displayed in the dot plot below.
From the greater density of the dots in the plot we can see that the values have one cluster from about 23 to 37 hours per week of sunshine.
Example 2
A sample of 40 parents was asked about the time they spent in paid work in the previous week. Their responses are displayed in the dot plot below.
There are three clusters in the distribution: a group who did a very small amount or no paid work, a group who did part-time work (about 20 hours) and a group who did full-time work (about 35 to 40 hours).
Curriculum achievement objectives references
Statistical investigation: Levels (2), (3), (4), (5), (6)
Statistical literacy: Levels (2), (3), (4), (5), (6)
- Cluster sampling
A method of sampling in which the population is split into naturally forming groups (the clusters), with the groups having similar characteristics that are known for the whole population. A simple random sample of clusters is selected. Either the individuals in these clusters form the sample or simple random samples chosen from each selected cluster form the sample.
Example
Consider obtaining a sample of secondary school students from Wellington. The secondary schools in Wellington are suitable clusters. A simple random sample of these schools is selected. Either all students from the selected schools form the sample or simple random samples chosen from each selected school form the sample.
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
- Coefficient of determination (in linear regression)
The proportion of the variation in the response variable that is explained by the regression model.
If there is a perfect linear relationship between the explanatory variable and the response variable there will be some variation in the values of the response variable because of the variation that exists in the values of the explanatory variable. In any real data there will be more variation in the values of the response variable than the variation that would be explained by a perfect linear relationship. The total variation in the values of the response variable can be regarded as being made up of variation explained by the linear regression model and unexplained variation. The coefficient of determination is the proportion of the explained variation relative to the total variation.
If the points are close to a straight line then the unexplained variation will be a small proportion of the total variation in the values of the response variable. This means that the closer the coefficient of determination is to 1 the stronger the linear relationship.
The coefficient of determination is also used in more advanced forms of regression, and is usually represented by $R^2$. In linear regression, the coefficient of determination, $R^2$, is equal to the square of the correlation coefficient, i.e., $R^2 = r^2$.
Example
The actual weights and self-perceived ideal weights of a random sample of 40 female students enrolled in an introductory Statistics course at the University of Auckland are displayed on the scatter plot below. A regression line has been drawn. The equation of the regression line is
predicted y = 0.6089x + 18.661 or predicted ideal weight = 0.6089 × actual weight + 18.661
The coefficient of determination, $R^2 = 0.822$
This means that 82.2% of the variation in the ideal weights is explained by the regression model (i.e., by the equation of the regression line).
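To make the idea of "explained variation divided by total variation" concrete, here is a small sketch that fits a least-squares line and computes the coefficient of determination. The data values are invented for illustration (they are not the University of Auckland sample) and the function name `fit_and_r_squared` is an assumption.

```python
def fit_and_r_squared(xs, ys):
    """Fit a least-squares line and return (slope, intercept, r_squared, r)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in the response
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Unexplained variation: squared vertical distances from the fitted line.
    unexplained = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    r_squared = (syy - unexplained) / syy  # explained variation / total variation
    r = sxy / (sxx * syy) ** 0.5  # Pearson correlation coefficient
    return slope, intercept, r_squared, r

# Hypothetical actual and ideal weights (kg), for illustration only.
actual = [52, 58, 63, 70, 77, 85]
ideal = [50, 54, 57, 61, 66, 69]

slope, intercept, r_squared, r = fit_and_r_squared(actual, ideal)
print(f"R^2 = {r_squared:.3f}, r^2 = {r * r:.3f}")
```

For a straight-line fit the two printed values agree, which illustrates the identity between the coefficient of determination and the squared correlation coefficient.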
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Combinations
A combination of n different objects taken r at a time is a selection of r out of the n objects with no attention given to the order of arrangement. The number of combinations of n objects taken r at a time is denoted by $^nC_r$ or $\binom{n}{r}$ and is equal to $\dfrac{n!}{r!(n - r)!}$, where n! (pronounced n factorial) is equal to the product of the natural numbers from 1 to n. So, for example, 6! = 1 x 2 x 3 x 4 x 5 x 6 = 720.
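As a quick check of the formula, the sketch below computes the number of combinations directly from factorials and compares it with Python's built-in `math.comb`. The helper name `n_choose_r` is chosen here for illustration.

```python
from math import comb, factorial

def n_choose_r(n, r):
    """Number of combinations of n objects taken r at a time: n! / (r! (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

print(factorial(6))                   # 720
print(n_choose_r(6, 2), comb(6, 2))   # 15 15
```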
- Combined event
An event that consists of the occurrence of two or more events.
Two different ways of combining two events A and B are: A or B, A and B.
A or B is the event consisting of outcomes that are either in A or B or both.
A and B is the event consisting of outcomes that are common to both A and B.
Example
Suppose we have a group of men and women, and each person is a possible outcome of a probability activity. A is the event that a person is a woman and B is the event that a person is taller than 170cm.
Consider A and B. The outcomes in the combined event A and B will consist of the women who are taller than 170cm.
Consider A or B. The outcomes in the combined event A or B will consist of all of the women as well as the men taller than 170cm. An alternative description is that the combined event A or B will consist of all people taller than 170cm as well as the women who are not taller than 170cm.
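Treating each event as a set of outcomes, "A and B" corresponds to set intersection and "A or B" to set union. The sketch below uses a small, made-up group of people purely for illustration.

```python
# Hypothetical people: (name, gender, height in cm).
people = [
    ("Ana", "F", 165),
    ("Ben", "M", 182),
    ("Cat", "F", 174),
    ("Dan", "M", 168),
    ("Eve", "F", 171),
]

A = {name for name, gender, _ in people if gender == "F"}   # person is a woman
B = {name for name, _, height in people if height > 170}    # taller than 170cm

a_and_b = A & B   # outcomes common to both A and B
a_or_b = A | B    # outcomes in A or B or both

print(sorted(a_and_b))  # ['Cat', 'Eve']
print(sorted(a_or_b))   # ['Ana', 'Ben', 'Cat', 'Eve']
```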
Alternative: compound event, joint event
Curriculum achievement objectives reference
Probability: Level 8
- Common factor
An integer a is a common factor of two integers b and c if it is a factor of both b and c. Take, for example, the factors of 18 and the factors of 24. The factors of 18 are 1, 2, 3, 6, 9, and 18. The factors of 24 are 1, 2, 3, 4, 6, 8, 12, and 24. The common factors of 18 and 24 are 1, 2, 3, and 6 since those are the numbers that are factors of both 18 and 24. The greatest common factor of 18 and 24 is 6 since that is the greatest number that is a factor of both 18 and 24. The greatest common factor (also called the greatest common divisor) is often abbreviated to g.c.f. (or g.c.d.).
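The worked example above can be reproduced with a short sketch. The helper names `factors` and `common_factors` are illustrative; `math.gcd` is the standard-library function for the greatest common divisor.

```python
from math import gcd

def factors(n):
    """All positive factors of a positive integer n."""
    return [d for d in range(1, n + 1) if n % d == 0]

def common_factors(b, c):
    """Positive integers that are factors of both b and c."""
    return [d for d in factors(b) if c % d == 0]

print(factors(18))             # [1, 2, 3, 6, 9, 18]
print(factors(24))             # [1, 2, 3, 4, 6, 8, 12, 24]
print(common_factors(18, 24))  # [1, 2, 3, 6]
print(gcd(18, 24))             # 6
```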
- Common multiple
An integer a is a common multiple of two integers b and c if it is a multiple of both b and c. For example, the multiples of 4 are 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, … The multiples of 6 are 6, 12, 18, 24, 30, 36, 42, 48, 54, 60, … The common multiples of 4 and 6 are therefore the numbers 12, 24, 36, 48, … Unlike a set of common factors, this is an infinite set, the set of multiples of 12. The prime number factorisations of 4 and 6 are 2 x 2 and 2 x 3 respectively. So the common multiples of 4 and 6 are multiples of 2 x 2 x 3, which equals 12.
So the least common multiple (often abbreviated to l.c.m.) of 4 and 6 is 12. The least common multiple of any two integers can also be found as the product of the numbers divided by their greatest common factor, that is, for integers a and b, l.c.m. of a and b = (a x b) ÷ (g.c.f. of a and b). In the example above, the l.c.m. of 6 and 4 = (6 x 4) ÷ 2 = 12.
- Comparing
The process of determining relative size or number. For example, if sticks were being compared by their length that could be done by measuring them against a standard (such as a metre rule) and comparing the measurement obtained, or by direct comparison by putting them alongside each other.
- Comparing experimental results with expectations from models of outcomes
We can compare actual results of an experiment with the expectations of a probability model to test the model or to determine the significance of the outcome of the experiment. For example, suppose our experiment is to toss a fair coin three times and count the number of heads obtained. We could repeat this experiment a large number of times and record the number of times that we obtained no heads, one head, two heads, or three heads. We could then compare the number of times that each of these events occurred with the expected number based on our probability model.
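The comparison described above can be sketched as a simulation. The code below assumes a fair coin modelled by Python's pseudo-random generator, and uses the binomial model (probabilities 1/8, 3/8, 3/8, 1/8 for 0, 1, 2, 3 heads) for the expected counts. The function names and the choice of 800 repetitions are illustrative.

```python
import random
from math import comb

def simulate(trials, seed=0):
    """Toss a fair coin three times, `trials` times over; count the number of heads."""
    rng = random.Random(seed)
    counts = [0, 0, 0, 0]  # counts[k] = number of experiments with k heads
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(3))
        counts[heads] += 1
    return counts

def expected(trials):
    """Expected counts under the model: P(k heads) = C(3, k) / 8."""
    return [trials * comb(3, k) / 8 for k in range(4)]

observed = simulate(800)
model = expected(800)
for k in range(4):
    print(f"{k} heads: observed {observed[k]}, expected {model[k]:.0f}")
```

The observed counts will usually differ a little from the expected counts; how large a difference is surprising is the question a significance test addresses.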
- Compass directions
Directions on planet Earth can be given in terms of a compass (or magnetic compass) bearing. The magnetised needle of a magnetic compass is attracted to Earth’s so-called magnetic pole, a position of strong magnetic attraction in the northern hemisphere. True north is the direction from any point on the Earth to the North Pole. The North Pole is one of the two points of intersection of Earth’s surface with its axis of rotation, the other point being the South Pole. Magnetic north differs considerably from true north and changes every year as Earth’s magnetic pole changes position. In New Zealand, the magnetic deviation in 2006 was approximately 18° east of north at Kaitaia and 26° east of north at Stewart Island. The variation is roughly proportional to these figures for the length of the country. For example, the figure for Wellington is approximately 22°. So in Wellington the needle of a magnetic compass would point to a true bearing of 22°. These figures are increasing by approximately 0.5° every six years.
- Complementary event
With reference to a given event, the event that the given event does not occur. In other words, the complementary event to an event A is the event consisting of all of the possible outcomes that are not in event A.
There are several symbols for the complement of event A. The most common are A' and Ā.
Example
Suppose we have a group of men and women, and each person is a possible outcome of a probability activity. If A is the event that a person is aged 30 years or more, then the complement of event A, A', consists of the people aged less than 30 years.
Curriculum achievement objectives reference
Probability: (Level 8)
- Complex numbers
The solution of some equations, such as $x^2 + 1 = 0$, cannot be found within the set of real numbers, since it requires that we find a number $x$ such that $x^2 = -1$. The equation requires a value of $x$ which, when multiplied by itself, gives -1. No such number exists within the set of real numbers, because the product of a real number with itself is always positive or zero. Hence a new number $i$ is defined for which $i^2 = -1$.
The set of numbers of the form $a + bi$, where $a$ and $b$ are real numbers and $i^2 = -1$, is called the set of complex numbers. If $z = a + bi$ then we have the following possibilities:
- $b = 0$, in which case $z$ is a real number
- $a = 0$ and $b \neq 0$, in which case $z$ is a pure imaginary number, such as $6i$ or $-5i$
- Neither $a$ nor $b$ is zero, in which case $z$ is a complex number that is neither real nor pure imaginary.
The set of complex numbers gives completeness to the number system since the roots of all polynomials can be found within the set of complex numbers and the nth root of any complex number is a complex number. This was obviously not true for the set of real numbers since, as above, the square root of -1 cannot be found within the set of real numbers.
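Python has a built-in complex type (written with `j` rather than `i`), which makes it easy to illustrate the points above. This is a small sketch; the variable names are illustrative.

```python
import cmath

i = 1j                      # Python writes the imaginary unit as 1j
print(i * i)                # (-1+0j), i.e. i squared is -1

# The two roots of x^2 + 1 = 0, which do not exist among the real numbers.
roots = [cmath.sqrt(-1), -cmath.sqrt(-1)]
print(roots)                # [1j, (-0-1j)]
print([x * x + 1 for x in roots])

z = complex(3, 4)           # z = 3 + 4i
print(z.real, z.imag)       # 3.0 4.0  (coordinates on the complex plane)
```

The real and imaginary parts printed on the last line are the horizontal and vertical coordinates of z on the complex plane.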
Complex numbers can be represented graphically on the complex plane, a modified Cartesian plane in which the horizontal axis is called the real axis and represents the real part of the complex number, and the vertical axis is called the imaginary axis and represents the imaginary part of the complex number. The complex plane is also referred to as the Argand diagram.
- Compounding
Quantities may change over time in applications involving growth or decay. In particular the concept of compound interest is one in which the quantity changes over time and the change is proportional to the size of the quantity. Suppose capital (the principal, P) is invested at a growth rate of r (r could be 10% say). If the amount of the investment plus the interest in year n is $A_n$ then,
$A_0 = P$
$A_1 = P + rP = P(1 + r)$
$A_2 = P(1 + r) + rP(1 + r) = P(1 + r)(1 + r) = P(1 + r)^2$
$A_3 = P(1 + r)^2 + rP(1 + r)^2 = P(1 + r)^2(1 + r) = P(1 + r)^3$
Hence the amount $A_n$ after n compoundings is given by
$A_n = PR^n$ where $R = 1 + r$.
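The step-by-step pattern and the closed formula can be compared numerically. In this sketch the principal of 1000 and the rate of 10% are example values only, and the function names are illustrative.

```python
def compound_step_by_step(principal, rate, periods):
    """Add interest one period at a time: each step adds rate * current amount."""
    amount = principal
    for _ in range(periods):
        amount += rate * amount
    return amount

def compound_closed_form(principal, rate, periods):
    """Closed form: principal * (1 + rate) ** periods."""
    return principal * (1 + rate) ** periods

# Example values: P = 1000, r = 10%, n = 3 compoundings.
print(compound_step_by_step(1000, 0.10, 3))
print(compound_closed_form(1000, 0.10, 3))
```

Both calls give 1331 up to floating-point rounding, matching 1000 x 1.1 x 1.1 x 1.1.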
A function of the form $A(n) = PR^n$, or $y = 3^x$, is called an exponential function.
- Conditional event
An event that consists of the occurrence of one event based on the knowledge that another event has already occurred.
The conditional event consisting of event A occurring, knowing that event B has already occurred, is written as A | B, and is expressed as ‘event A given event B’. Event B is considered to be the ‘condition’ in the conditional event A | B.
The probability of the conditional event A | B is
$P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)}$
For a justification of the above formula see the example below.
Example
Suppose we have a group of men and women, and each person is a possible outcome of the probability activity of selecting a person. A is the event that a person is a woman and B is the event that a person is taller than 170cm.
Consider A | B.
Given that B has occurred, the outcomes of interest are now restricted to those taller than 170cm.
A | B will then be the women of those taller than 170cm.
Suppose that the genders and heights of the people were as displayed in the two-way table below.

                        Height
Gender    Taller than 170cm    Not taller than 170cm    Total
Male              68                    15                 83
Female            28                    89                117
Total             96                   104                200

Given that B has occurred, the outcomes of interest are the 96 people taller than 170cm. If a person is randomly selected from these 96 people then the probability that the person is female is
$P(A \mid B) = \dfrac{28}{96}$
If both parts of the fraction are divided by 200 this becomes
$P(A \mid B) = \dfrac{28/200}{96/200} = \dfrac{P(A \text{ and } B)}{P(B)}$
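The calculation can be checked with exact fractions, using the counts from the two-way table in this example. The dictionary layout and variable names are illustrative.

```python
from fractions import Fraction

# Counts from the two-way table: (gender, height group) -> number of people.
counts = {
    ("male", "taller"): 68,
    ("male", "not taller"): 15,
    ("female", "taller"): 28,
    ("female", "not taller"): 89,
}
total = sum(counts.values())  # 200

p_a_and_b = Fraction(counts[("female", "taller")], total)            # P(A and B)
p_b = Fraction(counts[("male", "taller")] + counts[("female", "taller")], total)  # P(B)

p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)           # 7/24, the same as 28/96
print(float(p_a_given_b))
```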
Curriculum achievement objectives reference
Probability: Level 8
- Cone
A circular cone is a solid whose base is a circle and whose lateral surface comes to a point. A line from the vertex of the cone to the centre of its base is called the axis. In a right cone the base is perpendicular to the axis. If the base is not perpendicular to the axis it is an oblique cone. A cone is effectively a pyramid with an infinite number of lateral faces and therefore it should not be surprising that, like a pyramid, its volume is one-third of the product of the base area and the vertical height. So for a cone whose base is a circle of radius r and whose vertical height is h, the volume (V) is given by: $V = \frac{1}{3}\pi r^2 h$.
- Confidence interval
An interval estimate of a population parameter. A confidence interval is therefore an interval of values, calculated from a random sample taken from the population, in which any number is a possible value for the population parameter.
The word ‘confidence’ is used in the term because the method that produces the confidence interval has a specified success rate (confidence level) for the percentage of times such intervals contain the true value of the population parameter in the long run. 95% is commonly used as the confidence level.
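The "long-run success rate" idea can be illustrated by simulation. The sketch below makes several simplifying assumptions that are not part of the glossary entry: the population is normal with a known standard deviation, the parameter is the population mean, and the interval is the textbook "sample mean ± 1.96 × σ/√n" form. All names and values are illustrative.

```python
import random

def coverage(true_mean=50.0, sigma=10.0, n=30, repeats=2000, seed=0):
    """Proportion of simulated 95% intervals that contain the true mean."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5   # assumes sigma is known
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / repeats

print(coverage())  # close to 0.95 in the long run
```

Any single interval either contains the true mean or does not; the 95% describes how often the method succeeds over many repetitions.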
Curriculum achievement objectives reference
Statistical investigation: Level 8
- Confidence level
A specified percentage success rate for a method that produces a confidence interval, meaning that the method has this rate for the percentage of times such intervals contain the true value of the population parameter in the long run.
The most commonly used confidence level is 95%.
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Confidence limits
The lower and upper boundaries of a confidence interval.
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Congruent
Two figures are congruent if they are related so that for every point on one there is a corresponding point on the other, and the distance between any two points on one is equal to the distance between the corresponding points on the other. So two figures (shapes, lines, angles etc.) are congruent if they are identical in shape and size and could be made to fit exactly on to each other. Fitting one figure on to the other may require turning it over. (See Direct and indirect transformations)
- Conic sections
The conic sections are so called because they can all be obtained as the outline of the intersection of a plane with a cone. Their equations are described by the general quadratic polynomial in two variables, x and y, which is of the form: $f(x, y) = ax^2 + bxy + cy^2 + dx + ey + f$.
The conic sections and their standard forms are:
- The circle of radius $r$ and centre (0, 0), which has equation $x^2 + y^2 = r^2$. A circle is the locus of a point P (x, y) that is a fixed distance from a given point. The fixed distance is the radius and the given point is the centre. A circle may be obtained as the outline of the intersection of a right circular cone with a plane that is parallel to the base of the cone.
- The ellipse with centre at (0, 0), which has equation $x^2/a^2 + y^2/b^2 = 1$. An ellipse is the locus of a point P (x, y), the sum of whose distances from two fixed points (called the foci, the plural of focus) is constant. An ellipse may be obtained as the outline of the intersection of a right circular cone with a plane that does not cut the base of the cone. In the case where the plane is parallel to the base we obtain a circle.
- The hyperbola with centre (0, 0), which has equation $x^2/a^2 - y^2/b^2 = 1$. A hyperbola is the locus of a point P (x, y), the absolute value of the difference of whose distances from two fixed points (the foci) is constant. A branch of the hyperbola may be obtained as the outline of the intersection of a right circular cone with a plane that cuts the base of the cone but is not parallel to a side (or generator) of the cone.
- The parabola with vertex (0, 0), focus $(c, 0)$ on the x-axis and directrix the line $x = -c$, which has equation $y^2 = 4cx$. A parabola is the locus of a point P (x, y), whose distance from a fixed point (the focus) is equal to its distance from a fixed line (the directrix). The parabola may be obtained as the outline of the intersection of a right circular cone with a plane that cuts the base of the cone and is parallel to a side (or generator) of the cone.
- Continuity of functions
A function f (x) is continuous at x = a if the following conditions hold:
- f (a) is defined
- $\lim_{x \to a} f(x)$ exists, and
- $\lim_{x \to a} f(x) = f(a)$
A function is discontinuous at a if one or more of the conditions for continuity fails there.
- Continuous distribution
The variation in the values of a variable that can take any value in an (appropriately-sized) interval of numbers.
A continuous distribution may be an experimental distribution, a sample distribution or a theoretical distribution of a measurement variable. Although the recorded values in an experimental or sample distribution may be rounded, the distribution is usually still regarded as being continuous.
Example
The normal distribution is an example of a theoretical continuous distribution.
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
Probability: Levels (5), (6), 7, (8)
- Continuous random variable
A random variable that can take any value in an (appropriately-sized) interval of numbers.
Example
The height of a randomly selected individual from a population.
Curriculum achievement objectives references
Probability: Levels (7), 8
- Conversions between fractions, decimals and percentages
- Coordinate geometry techniques applied to graphs and lines
- Distance: The distance between two points in the plane is found by the use of Pythagoras’ theorem. If the points are $P_1 (x_1, y_1)$ and $P_2 (x_2, y_2)$ then the distance between them is:
$|P_1 P_2| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
So the distance between (2, 3) and (5, 7) is $\sqrt{3^2 + 4^2} = 5$.
- Gradient: The gradient (or slope) of a line containing two points $P_1 (x_1, y_1)$ and $P_2 (x_2, y_2)$ is:
change in y / change in x $= \dfrac{y_2 - y_1}{x_2 - x_1}$
So the slope of the line containing (-2, 3) and (5, -1) is $\dfrac{-1 - 3}{5 - (-2)} = -\dfrac{4}{7}$.
- Equation of a line: The equation of the line through two points $P_1 (x_1, y_1)$ and $P_2 (x_2, y_2)$ is given by:
$\dfrac{y - y_1}{x - x_1} = \dfrac{y_2 - y_1}{x_2 - x_1}$
So the line through (2, 3) and (4, 7) has equation $\dfrac{y - 3}{x - 2} = \dfrac{7 - 3}{4 - 2}$
This can be rearranged to: $y = 2x - 1$
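The three worked examples above can be checked with a short sketch. The function names are illustrative, points are given as (x, y) tuples with integer coordinates, and exact fractions are used so that the gradient -4/7 is not rounded. A vertical line (equal x-coordinates) has no gradient and would raise `ZeroDivisionError` here.

```python
from fractions import Fraction
from math import hypot

def distance(p1, p2):
    """Distance between two points, by Pythagoras' theorem."""
    return hypot(p2[0] - p1[0], p2[1] - p1[1])

def gradient(p1, p2):
    """Change in y divided by change in x (integer coordinates assumed)."""
    return Fraction(p2[1] - p1[1], p2[0] - p1[0])

def line_through(p1, p2):
    """Return (m, c) such that the line through p1 and p2 is y = m*x + c."""
    m = gradient(p1, p2)
    c = p1[1] - m * p1[0]
    return m, c

print(distance((2, 3), (5, 7)))       # 5.0
print(gradient((-2, 3), (5, -1)))     # -4/7
print(line_through((2, 3), (4, 7)))   # (Fraction(2, 1), Fraction(-1, 1)), i.e. y = 2x - 1
```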
- Coordinate systems
There are many different possible types of coordinate system. They are designed to define the position of a point on the plane or in space. In the Cartesian coordinate system, a point in the plane can be uniquely represented by an ordered pair of numbers, each of which represents a distance along an axis, measured from the origin. An illustration of the Cartesian coordinate system, in which the axes are perpendicular, is shown below. The coordinates of point P are the ordered pair (2,3).
A simple coordinate system for younger students could be a system in which the spaces are labelled rather than points in the plane.
- Correlation
The strength and direction of the relationship between two numerical variables.
In assessing the correlation between two numerical variables one variable does not need to be regarded as the explanatory variable and the other as the response variable, as is necessary in linear regression.
Two numerical variables have positive correlation if the values of one variable tend to increase as the values of the other variable increase.
Two numerical variables have negative correlation if the values of one variable tend to decrease as the values of the other variable increase.
Correlation is often measured by a correlation coefficient, the most common of which measures the strength and direction of the linear relationship between two numerical variables. In this linear case, correlation describes how close points on a scatter plot are to lying on a straight line.
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Correlation coefficient
A number between -1 and 1 calculated so that the number represents the strength and direction of the linear relationship between two numerical variables.
A correlation coefficient of 1 indicates a perfect linear relationship with positive slope. A correlation coefficient of -1 indicates a perfect linear relationship with negative slope.
The most widely used correlation coefficient is called Pearson’s (product-moment) correlation coefficient and it is usually represented by r.
Some other properties of the correlation coefficient, r:
1. The closer the value of r is to 1 or -1, the stronger the linear relationship.
2. r has no units.
3. r is unchanged if the axes on which the variables are plotted are reversed.
4. If the units of one, or both, of the variables are changed then r is unchanged.
Example
The actual weights and self-perceived ideal weights of a random sample of 40 female students enrolled in an introductory Statistics course at the University of Auckland are displayed on the scatter plot below.
The correlation coefficient, r = 0.906
See: coefficient of determination (in linear regression), correlation
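Properties 3 and 4 listed above can be checked numerically. The sketch below computes Pearson's correlation coefficient from its usual sums-of-products form; the data are invented for illustration (not the Auckland sample) and the function name `pearson_r` is an assumption.

```python
def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical actual and ideal weights in kilograms.
actual_kg = [52, 58, 63, 70, 77, 85]
ideal_kg = [50, 54, 57, 61, 66, 69]

r = pearson_r(actual_kg, ideal_kg)
r_swapped = pearson_r(ideal_kg, actual_kg)             # axes reversed
actual_lb = [2.2046 * w for w in actual_kg]            # change of units
r_rescaled = pearson_r(actual_lb, ideal_kg)

print(round(r, 6), round(r_swapped, 6), round(r_rescaled, 6))
```

All three printed values agree, because r depends only on how the points are arranged relative to a line, not on the units or on which variable is on which axis.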
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Cosine rule
For any triangle ABC the following law of cosines holds:
$a^2 = b^2 + c^2 - 2bc \cos A$
$b^2 = a^2 + c^2 - 2ac \cos B$
$c^2 = a^2 + b^2 - 2ab \cos C$
where a is the length of the side opposite angle A, b is the length of the side opposite angle B and c is the length of the side opposite angle C.
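As an illustration, the first identity can be rearranged either to find a side from two sides and the angle between them, or to find an angle from three sides. The function names are illustrative and angles are taken in degrees.

```python
from math import acos, cos, degrees, radians, sqrt

def third_side(b, c, angle_a_deg):
    """Side a opposite angle A, given sides b and c and the included angle A."""
    return sqrt(b * b + c * c - 2 * b * c * cos(radians(angle_a_deg)))

def angle_from_sides(a, b, c):
    """Angle A (in degrees) opposite side a, given all three sides."""
    return degrees(acos((b * b + c * c - a * a) / (2 * b * c)))

print(third_side(3, 4, 90))       # approximately 5 (the rule reduces to Pythagoras' theorem)
print(angle_from_sides(5, 3, 4))  # approximately 90
print(angle_from_sides(1, 1, 1))  # approximately 60 (equilateral triangle)
```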
This rule is necessary for solving a triangle when given either two sides and the included angle, or three sides.
- Counting
Counting is the process of establishing a one-to-one correspondence between the set of objects being counted and the set of natural numbers in order. The number of objects in the set is the last number named.
- Counting Numbers
The set of counting numbers is the same as the set of natural numbers, i.e. 1, 2, 3, 4,… (See Base ten numeration system)
- Critical path
An activity digraph is a directed network showing the time taken to complete a certain sequence of activities. A longest path (in units of time) in an activity digraph D is called a critical path of D. It shows the minimum time necessary to complete the project.
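A critical path can be found by searching for the longest path from the start to the finish of the digraph. The sketch below assumes the digraph has no directed cycles and that durations are attached to the arcs; the small network, its node names and its durations are all hypothetical.

```python
from functools import lru_cache

# Hypothetical activity digraph: EDGES[u] lists (v, duration) arcs leaving u.
EDGES = {
    "start": [("A", 3), ("B", 2)],
    "A": [("C", 4)],
    "B": [("C", 6)],
    "C": [("finish", 1)],
    "finish": [],
}

def critical_path(edges, source, sink):
    """Return (total_time, path) for a longest path from source to sink."""

    @lru_cache(maxsize=None)
    def longest_from(node):
        if node == sink:
            return 0, (sink,)
        best = None
        for nxt, duration in edges[node]:
            sub = longest_from(nxt)
            if sub is None:          # nxt cannot reach the sink
                continue
            candidate = (duration + sub[0], (node,) + sub[1])
            if best is None or candidate[0] > best[0]:
                best = candidate
        return best

    return longest_from(source)

print(critical_path(EDGES, "start", "finish"))
# (9, ('start', 'B', 'C', 'finish'))
```

Here the route through B takes 2 + 6 + 1 = 9 time units, longer than the route through A (3 + 4 + 1 = 8), so the project cannot finish in under 9 units.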
- Cube root
See Roots.
- Cuboid
A cuboid is a solid figure bounded by six rectangular faces. Hence it is like a box that has all sides rectangular. A special case of the cuboid is the square cuboid, which has two (or more) opposite faces that are squares. A special case of the square cuboid is the cube, in which all faces are squares.
The square cuboid could also be classified as a right square prism.
The volume of a cuboid is the product of the lengths of three of its edges, no two of which are parallel to each other, expressed in appropriate units. (This can easily be discovered by making cuboids with blocks.) For example, if a cuboid has edge lengths of 6 cm, 8 cm and 10 cm then its volume is 480 cm³.
- Curve fitting
Very often a relationship is found to exist between two or more variables. For example, weight depends to some degree on height. We may wish to express this relationship in mathematical form by determining an equation connecting the variables. ‘The method of least squares’ is a method that finds a function which best fits the data points.
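For the simplest case, fitting a straight line, the method of least squares chooses the slope and intercept that make the sum of squared vertical distances from the points to the line as small as possible. The sketch below uses the standard closed-form solution; the height and weight values are made up for illustration and the function names are assumptions.

```python
def fit_line(xs, ys):
    """Least-squares straight line: return (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def sum_of_squares(xs, ys, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical heights (cm) and weights (kg).
heights = [150, 160, 165, 172, 180, 188]
weights = [52, 58, 63, 68, 77, 84]

slope, intercept = fit_line(heights, weights)
best = sum_of_squares(heights, weights, slope, intercept)
nudged = sum_of_squares(heights, weights, slope + 0.05, intercept)
print(f"weight = {slope:.3f} x height + {intercept:.2f}")
print(best <= nudged)  # True: changing the slope makes the fit worse
```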
- Cyclical component (for time-series data)
Long-term variations in time-series data that repeat in a reasonably systematic way over time. The cyclical component can often be represented by a wave-shaped curve, which represents alternating periods of expansion and contraction. The successive waves of the curve may have different periods.
Cyclical components are difficult to analyse and at Level Eight cyclical components can be described along with the trend.
See: time-series data
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Cylinder
A right circular cylinder (usually just called a cylinder) is a solid with three faces, whose bases are parallel circles that are perpendicular to the third face. Further, its cross-sections parallel to the bases are also circles. In common terms it is the shape of a spaghetti tin. As with the volume of a cuboid, or in fact any prism, the volume of a cylinder is the product of the area of its circular base and its height. So if a cylinder has base circles of radius r and a height h then its volume is $\pi r^2 h$. For example, if a cylinder had a height of 9 cm and a radius of 4 cm then its volume would be π x 16 x 9, which is 452 cm³ (to 3 significant figures).
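The worked example can be checked in a couple of lines. The function name `cylinder_volume` is illustrative.

```python
from math import pi

def cylinder_volume(radius, height):
    """Volume of a right circular cylinder: base area (pi r^2) times height."""
    return pi * radius ** 2 * height

volume = cylinder_volume(4, 9)   # radius 4 cm, height 9 cm
print(volume)          # about 452.389
print(round(volume))   # 452
```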
d
- Data
A term with several meanings.
Data can mean a collection of facts, numbers or information; the individual values of which are often the results of an experiment or observations.
If the data are in the form of a table with the columns consisting of variables and the rows consisting of values of each variable for different individuals or values of each variable at different times, then data has the same meaning as data set.
Data can also mean the values of one or more variables from a data set.
Data can also mean a variable or some variables from a data set.
Properly, data is the plural of datum, where a datum is any result. In everyday usage the term ‘data’ is often used in the singular.
See: data set
Curriculum achievement objectives references
Statistical investigation: All levels
Statistical literacy: Levels 2, (3), (4), 5, (6), (7), (8)
- Data collection methods
Data collection methods need to be appropriate to the variable being considered. For example, if the variable being considered by children in Room 5 at Kiwi School is the amount of pocket money received by children at Kiwi School then the data must be representative of the whole school, not just of Room 5. Sampling must be random and children can consider how that could be achieved.
- Data display
A representation, usually as a table or graph, used to explore, summarise and communicate features of data.
Data displays listed in this glossary are: bar graph, box and whisker plot, dot plot, frequency table, histogram, line graph, one-way table, picture graph, pie graph, scatter plot, stem-and-leaf plot, strip graph, tally chart, two-way table.
Curriculum achievement objectives references
Statistical investigation: Levels 1, 2, 3, 4, 5, 6, (7), (8)
Statistical literacy: Levels 2, 3, (4), (5), 6
- Data set
A table of numbers, words or symbols; the values of which are often the results of an experiment or observations. Data sets almost always have several variables.
Usually the columns of the table consist of variables and the rows consist of values of each variable for individuals or values of each variable at different times.
Example 1 (Values for individuals)
The table below shows part of a data set resulting from answers to an online questionnaire from 727 students enrolled in an introductory Statistics course at the University of Auckland.
Individual  Gender  Birth month  Birth year  Ethnicity       Number of years  Number of countries  Actual weight  Ideal weight
                                                             living in NZ     visited              (kg)           (kg)
1           female  Jan          1984        Other European  2                3                    55             50
2           female  Nov          1990        Chinese         15               11                   53             49
3           male    Jan          1990        NZ European     18               2                    68             60
...         ...     ...          ...         ...             ...              ...                  ...            ...

Example 2 (Values at different times)
The table below shows part of a data set resulting from observations at a weather station in Rolleston, Canterbury, for each day in November 2008.
Day  Max temp (°C)  Rainfall (mm)  Max pressure (hPa)  Max wind gust (km/h)
1    26.8           0.5            1015.1              70.3
2    19.7           0.0            1015.6              38.9
3    19.5           0.0            1011.1              29.6
...  ...            ...            ...                 ...

Alternative: dataset
Curriculum achievement objectives references
Statistical investigation: Levels 3, (4), 5, (6), 7, 8
- Decimal place value
See rounding
- Decimals
A number may have many numerals and one commonly used form of numeral is the ‘decimal’, or more correctly, ‘decimal fraction’. This is a system which extends the base ten numeration system to have place values less than 1. For example, whereas 324 is a compact numeral which can be written in expanded form as (3x100) + (2x10) + (4x1), we can extend this system to include fractional parts. For example, 324 15/100 could be written as 324.15 which means (3x100) + (2x10) + (4x1) + (1x1/10) + (5x1/100). So decimals are another way of recording fractional parts and are an extension of the base 10 numeration system. Obviously there is no restriction on the length of the decimal part, the part to the right of the decimal point. Other ways of recording parts of a whole are common fractions and percentages.
The full benefit of having a positional notation numeration system such as the system of decimal fractions is in having a system of units of measurement that is in harmony with it. Hence we see the importance of the metric system to industry and to society in general. (See SI measurement units)
- Denominator
When a rational number is written as a fraction, that is, in the form a/b, then b is called the denominator of the fraction.
- Dependent variable
A common alternative term for the response variable in bivariate data.
Alternatives: outcome variable, output variable, response variable
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Descriptive statistics
Numbers calculated from a data set to summarise the data set and to aid comparisons within and among variables in the data set.
Alternatives: numerical summary, summary statistics
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
- Differential equations
A differential equation is an equation that involves derivatives. For example: $y' = x + 5$, $y'' + 3y' + 2y = 0$, etc.
If there is a single independent variable such as above, the equation is called an ordinary differential equation.
The order of a differential equation is the order of the highest derivative that occurs.
- Differentiation
Differentiation is the process of finding the derivative of a function. The derivative of a function is a new function whose domain consists of those points where the former function is differentiable, and whose values are the slopes of tangent lines at the corresponding points. Thus the value of the derivative of a function at any given point in the domain is a measure of the rate of change of the function at that point.
The derivative of a function f(x) is denoted by $f'(x)$; its value at the point x is the slope of the line tangent to f at the point x. The term ‘derivative’ is based on the fact that the new function is derived from the original one.
The derivative of a function f(x) with respect to x is defined as follows:
$f'(x) = \lim_{h \to 0} \dfrac{f(x + h) - f(x)}{h}$, provided the limit exists.
It can be shown that if $f(x) = ax^n$ then $f'(x) = nax^{n-1}$ and that $(f(x) + g(x))' = f'(x) + g'(x)$. These results enable the differentiation of polynomials.
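The two results above give a simple recipe for differentiating a polynomial term by term. The sketch below applies it to a polynomial stored as a list of coefficients (an illustrative representation chosen here, with `coeffs[k]` the coefficient of x to the power k) and compares the answer with the difference quotient from the definition, using a small but non-zero h.

```python
def poly_derivative(coeffs):
    """Differentiate term by term: a*x**k becomes k*a*x**(k - 1)."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))

def difference_quotient(coeffs, x, h=1e-6):
    """[f(x + h) - f(x)] / h, which approaches f'(x) as h approaches 0."""
    return (evaluate(coeffs, x + h) - evaluate(coeffs, x)) / h

f = [5, 0, 3, 2]                 # f(x) = 2x^3 + 3x^2 + 5
f_prime = poly_derivative(f)     # [0, 6, 6], i.e. f'(x) = 6x^2 + 6x

print(f_prime)
print(evaluate(f_prime, 2))        # 36
print(difference_quotient(f, 2))   # close to 36
```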
Note: An alternative notation for the derivative of y with respect to x is dy/dx.
Approximate values of derivatives of functions may be found by using numerical differentiation techniques and formulas such as
$f'(x) \approx \dfrac{f(x + h) - f(x)}{h}$ and $f'(x) \approx \dfrac{f(x + h) - f(x - h)}{2h}$
- Direct and indirect relationships with linear proportions
A quantity a is said to vary directly as another quantity b if a = kb where k is a constant. (This is often said as "a is directly proportional to b"). k is called the constant of proportionality.
For example, the length of the circumference of a circle (c) varies directly as the radius (r) and is expressed by the relationship c = 2πr. So if the radius doubles then so does the circumference.
A quantity a is said to vary inversely as another quantity b if a = k × 1/b (that is, a = k/b) where k is a constant of proportionality.
(This is often said as "a is inversely proportional to b"). For example, suppose the distance between two towns A and B is 30 km. The time (t) taken to travel from A to B is inversely proportional to the speed (s) of travel.
So t = 30/s. If s = 30 km/h then t = 1 (hour). If s doubles then t is halved.
- Direct and indirect transformations
Transformations can be classified as direct or indirect. The direct isometries are rotation and translation. They are called direct because they do not flip (or turn over) the shape being transformed. Reflection and glide reflection are indirect isometries because they do flip the shape being transformed.
- Direct comparison
The process of comparing directly rather than through an independent standard. For example, we can compare the number of elements in two given sets by counting the elements of each set and determining which is the greater number. In direct comparison we would match the elements against each other physically. Similarly, to determine which of two sticks is the longer we could use direct comparison by laying the sticks side-by-side and determining the longer one by observation, rather than measuring the two sticks against a standard measure.
- Direction
The direction between two points A and B is the description of the path an object (or person) travelling from A to B would take. This could be in simple social terms such as forwards or backwards, left or right; in terms of compass directions such as north, south, south-east etc.; in terms of an angle of turn from an origin; or as a direction vector.
- Discontinuities of functions
- Discrete distribution
The variation in the values of a variable that can only take on distinct values, usually whole numbers.
A discrete distribution could be an experimental distribution, a sample distribution or a theoretical distribution.
Example
The binomial distribution is an example of a theoretical discrete distribution.
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
Probability: Levels 5, 6, 7, (8)
- Discrete random variable
A random variable that can take only distinct values, usually whole numbers.
Example
The number of left-handed people in a random selection of 10 individuals from a population is a discrete random variable. The distinct values of the random variable are 0, 1, 2, … , 10.
Curriculum achievement objectives reference
Probability: Level 8
- Discrete situations
Situations involving elements of chance in which the outcomes can take only distinct values.
If the outcomes are categories then this is a discrete situation. If the outcomes are numerical then the distinct values are often whole numbers.
Curriculum achievement objectives reference
Probability: Level 6
- Disjoint events
Alternative: mutually exclusive events
Curriculum achievement objectives reference
Probability: (Level 8)
- Distance
The distance between any two points on a line, on a plane or in space is the length of the straight line between them. This could be expressed in non-standard units such as steps, handspans etc., or in standard units such as metres, kilometres etc. (See length)
- Distribution
The variation in the values of a variable.
The type of distribution can be described by the type of variable (e.g., continuous distribution, discrete distribution) or by the way the values were obtained (e.g., experimental distribution, population distribution, sample distribution). Other types of distribution described in this glossary are frequency distribution, sampling distribution, theoretical distribution.
See: continuous distribution, discrete distribution, experimental distribution, frequency distribution, population distribution, sample distribution, sampling distribution, theoretical distribution
Curriculum achievement objectives references
Statistical investigation: Levels 4, 5, 6, (7), (8)
Probability: Levels 4, 5, 6, 7, 8
- Division of
- Whole numbers: Division is the inverse operation of multiplication. It arises from contexts that involve sharing or measurement. Both sharing and measurement can be viewed as repeated subtraction. For each basic fact of multiplication there is a family of facts that includes two basic facts of division. For example, 5 x 7 = 35 so 7 x 5 = 35, 35÷7 = 5 and 35÷5 = 7.
Division is a binary operation, that is, it is an operation on two numbers.
Division is not commutative, that is, the order of the numbers does matter. For example, 12÷4 ≠ 4÷12.
Division is not associative, that is, the grouping of the numbers does affect the answer. For example, (12÷6)÷2 = 1, whereas 12÷(6÷2) = 4.
- Fractions: The operation of division may be performed on fractions. Division by a number is essentially multiplication by the reciprocal of the number. For example, 12÷4 = 12 x 1/4. Similarly, 2/3÷4/5 = 2/3 x 5/4 = 10/12 = 5/6.
- Decimals: Decimal fractions (commonly called decimals) may be divided in the same way that whole numbers are, with care being taken to consider the position of the decimal point. It is often best to leave the decimal point out while operating on the numbers and put the decimal point in the answer by estimation.
- Percentages: Percentages may be divided but the answer needs to be interpreted carefully. For example, 50%÷40% = 50/100÷40/100 = 50/100 x 100/40 = 5/4. The answer is a fraction; if it needs to be expressed as a percentage it must be converted, giving 125%.
- Integers: Integers may be divided in the following way:
a÷-b = -(a÷b), e.g. 6÷-3 = -2
-a÷b = -(a÷b), e.g. -6÷3 = -2
-a÷-b = a÷b, e.g. -6÷-3 = 2
These results are probably best discovered from their associated multiplication results. (See Multiplication of integers). The properties of division outlined for whole numbers also apply to the division of fractions, decimals, percentages and integers.
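The worked examples in this entry can be checked with exact arithmetic. The following is a minimal sketch using Python's standard `fractions` module; the variable names are illustrative only.

```python
from fractions import Fraction

# Dividing by a number is the same as multiplying by its reciprocal.
quotient = Fraction(2, 3) / Fraction(4, 5)
by_reciprocal = Fraction(2, 3) * Fraction(5, 4)

# 50% ÷ 40% is a ratio (5/4); multiplying by 100 restates it as a percentage.
ratio = Fraction(50, 100) / Fraction(40, 100)
as_percentage = ratio * 100

# Division is not associative: the grouping changes the answer.
left_grouping = (Fraction(12) / 6) / 2
right_grouping = Fraction(12) / (Fraction(6) / 2)

# Sign rules for integers.
signs = (Fraction(6) / -3, Fraction(-6) / 3, Fraction(-6) / -3)

print(quotient, by_reciprocal, ratio, as_percentage)
print(left_grouping, right_grouping, signs)
```

Using `Fraction` rather than floating-point numbers keeps every result exact, so 10/12 is automatically reduced to 5/6.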
- Domain
See Function.
- Dot plot
A graph for displaying the distribution of a numerical variable in which each dot represents one value of the variable.
For a whole-number variable, if a value occurs more than once, the dots are placed one above the other so that the height of the column of dots represents the frequency for that value.
Dot plots are particularly useful for comparing the distribution of a numerical variable for two or more categories of a category variable by displaying side-by-side dot plots on the same scale. Dot plots are particularly useful when the number of values to be plotted is relatively small.
Dot plots are usually drawn horizontally, but may be drawn vertically.
Example
The actual weights of random samples of 50 male and 50 female students enrolled in an introductory Statistics course at the University of Auckland are displayed on the dot plot below.
Alternative: dot graph, dotplot
Curriculum achievement objectives references
Statistical investigation: Levels (3), (4), (5), (6), (7), (8)
- Drawings and models
Objects may be represented by drawings or models. The drawing or model and the object it represents may be similar in that the drawing or model may be a scale representation of the object. However drawings in particular can often be used to good effect in mathematics to represent relationships between elements, and the drawings might have no scale representation to the relationships they model.
Isometric plan views and nets may be used to good effect. Drawing a picture is a useful problem-solving strategy, and such a picture might be approximately to scale or might have no scale relationship.
e
- Elements of chance
See Chance.
- Enlargement
An enlargement in the plane or in space is a mapping of a set of points such that for each point the distance of its image from a fixed point (the centre of enlargement) is a given multiple of the distance from the point to the centre of enlargement. For example, if the fixed multiple were 2 then every point of the image of the transformation would be twice the distance from the centre of the enlargement that the original point was. If it were a figure being transformed then all the length measurements of the figure would be doubled under that specific transformation. If the multiple were 3 then all the length measurements of the image would be three times the length measurements of the original figure. The centre of enlargement is the only invariant point under the transformation of enlargement. Enlargement is not an isometric transformation: the image is similar to the original shape, but the shape is not preserved exactly because its dimensions have changed.
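A minimal sketch of an enlargement in the plane follows. It assumes the usual convention that each image point lies on the ray from the centre through the original point; the function name `enlarge` and the sample triangle are hypothetical.

```python
import math


def enlarge(point, centre, factor):
    """Map a point to its image under an enlargement about `centre`."""
    cx, cy = centre
    x, y = point
    return (cx + factor * (x - cx), cy + factor * (y - cy))


centre = (1.0, 1.0)
triangle = [(2.0, 1.0), (4.0, 1.0), (2.0, 5.0)]
image = [enlarge(p, centre, 2) for p in triangle]

# Every length in the figure is multiplied by the same factor.
side_before = math.dist(triangle[0], triangle[1])
side_after = math.dist(image[0], image[1])

print(image, side_before, side_after)
```

With a multiple of 2, the side of length 2 becomes a side of length 4, and the centre itself is mapped to itself.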
- Equal and different likelihoods
Events can have equal likelihoods or they can have different likelihoods. If a fair coin is tossed, the likelihood of getting a ‘head’ is equal to the likelihood of getting a ‘tail’. However if we draw a card from a standard pack of cards, the likelihood of getting a king is not equal to the likelihood of getting a red card.
- Equal-sharing
Equal sharing is a division concept based on the action of distributing the elements of a set evenly amongst a given number of subsets. For example: Grandma shares $24.00 evenly amongst her four grandchildren. How much will each grandchild receive? Strategies for solving this problem will depend on a student’s level of numeracy understanding and could include dealing (physically sharing out), finding four equal addends, and inverse multiplication (i.e. what do I multiply by 4 to get 24?).
- Equality
The equality relation is fundamental for numbers and is usually taken as understood. It is understood that things that are equal have all and only the same properties. Of importance are the symmetric and transitive properties of the relation ‘equals’, namely that if a=b then b=a (symmetric property), and if a=b and b=c then a=c (transitive property) for all numbers a, b, and c. The symbol "=" often evokes a meaning of "the answer follows" which, while being one interpretation of the symbol, is unhelpful in understanding the symmetric and transitive properties of the relation.
- Equivalent decimal and percentage forms for everyday fractions
Decimals, percentages, and fractions are the three main numeral systems used to represent parts of a whole. For example, the proportion that is one part out of two parts can be represented as 1/2, or 0.5, or 50%. Children can use two- or three-dimensional models such as 100s blocks and place value rods to explore the relationships between these three numeral systems. Older students can use the implied division operation of a fraction to convert fractions to decimals and percentages.
- Equivalent fractions
Two different fractions that represent the same number are referred to as equivalent fractions. For example, 1/2, 2/4, 3/6, and 4/8 are equivalent fractions because they represent the same number.
In the example given, 1/2 is said to be in irreducible form because the numerator and denominator have no common factor other than 1. The others are all reducible.
- Estimate
A number calculated from a sample, often a random sample, which is used as an approximate value for a population parameter.
Example
A sample mean, calculated from a random sample taken from a population, is an estimate of the population mean.
Alternative: point estimate
See: interval estimate
Curriculum achievement objectives references
Statistical investigation: Levels (6), 7, 8
- Event
A collection of outcomes from a probability activity or a situation involving an element of chance.
An event that consists of one outcome is called a simple event. An event that consists of more than one outcome is called a compound event.
Example 1
In a situation where a person will be selected and their eye colour recorded, ‘blue, grey or green’ is an event (consisting of the 3 outcomes: blue, grey, green).
Example 2
In a situation where two dice will be rolled and the numbers on each die recorded, a total of 5 is an event (consisting of the 4 outcomes: (1, 4), (2, 3), (3, 2), (4, 1), where (1, 4) means a 1 on the first die and a 4 on the second).
Example 3
In a situation where a person will be selected at random from a population and their weight recorded, heavier than 70kg is an event.
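Example 2 can be reproduced by listing every outcome for two dice and keeping those that belong to the event. This is an illustrative sketch only; the variable names are hypothetical.

```python
from itertools import product

# Each outcome is a pair (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

# The event "a total of 5" is the collection of outcomes whose numbers sum to 5.
total_of_five = [outcome for outcome in outcomes if sum(outcome) == 5]

print(len(outcomes), total_of_five)
```

Because the event contains more than one outcome, it is a compound event in the sense described above.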
Curriculum achievement objectives references
Probability: Levels (5), (6), (7), 8
- Expected value (of a discrete random variable)
The population mean for a random variable; it is therefore a measure of centre for the distribution of the random variable.
The expected value of random variable X is often written as E(X) or µ or µ_X.
The expected value is the ‘long-run mean’ in the sense that, as more and more values of the random variable are collected (by sampling or by repeated trials of a probability activity), the sample mean becomes closer to the expected value.
For a discrete random variable the expected value is calculated by summing the product of the value of the random variable and its associated probability, taken over all of the values of the random variable.
In symbols, E(X) = Σ x P(X = x)
Example
Random variable X has the following probability function:
| x | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| P(X = x) | 0.1 | 0.2 | 0.4 | 0.3 |

E(X) = 0 x 0.1 + 1 x 0.2 + 2 x 0.4 + 3 x 0.3
= 1.9
A bar graph of the probability function, with the expected value labelled, is shown below.
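The calculation in this example can be sketched in a few lines. The probabilities are stored as exact fractions so that the check that they sum to 1 is not affected by rounding; the dictionary name is illustrative.

```python
from fractions import Fraction

# Probability function from the example: value of X -> P(X = x)
probability_function = {
    0: Fraction(1, 10),
    1: Fraction(2, 10),
    2: Fraction(4, 10),
    3: Fraction(3, 10),
}

# A probability function must sum to 1.
assert sum(probability_function.values()) == 1

# E(X) = sum of x * P(X = x) over all values x
expected_value = sum(x * p for x, p in probability_function.items())

print(float(expected_value))
```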
See: population mean
Curriculum achievement objectives reference
Probability: Level 8
- Experiment
In its simplest meaning, a process or study that results in the collection of data, the outcome of which is unknown.
In the statistical literacy thread at Level Eight, experiment has a more specific meaning. Here an experiment is a study in which a researcher attempts to understand the effect that a variable (an explanatory variable) may have on some phenomenon (the response) by controlling the conditions of the study.
In an experiment the researcher controls the conditions by allocating individuals to groups and allocating the value of the explanatory variable to be received by each group. A value of the explanatory variable is called a treatment.
In a well-designed experiment the allocation of subjects to groups is done using randomisation. Randomisation attempts to make the characteristics of each group very similar to each other so that if each group was given the same treatment the groups should respond in a similar way, on average.
Experiments usually have a control group, a group that receives no treatment or receives an existing or established treatment. This allows any differences in the response, on average, between the control group and the other group(s) to be visible.
When the groups are similar in all ways apart from the treatment received, then any observed differences in the response (if large enough) among the groups, on average, are said to be caused by the treatment.
Example
In the 1980s the Physicians’ Health Study investigated whether a low dose of aspirin had an effect on the risk of a first heart attack for males. The study participants, about 22,000 healthy male physicians from the United States, were randomly allocated to receive aspirin or a placebo. About 11,000 were allocated to each group.
This is an experiment because the researchers allocated individuals to two groups and decided that one group would receive a low dose of aspirin and the other group would receive a placebo. The treatments are aspirin and placebo. The response was whether or not the individual had a heart attack during the study period of about five years.
See: causal-relationship claim, placebo, randomisation
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), 7, 8
Statistical literacy: Level 8
- Experimental design principles
Issues that need to be considered when planning an experiment.
The following issues are the most important:
Comparison and control: Most experiments are carried out to see whether a treatment causes an effect on a phenomenon (response). In order to see the effect of a treatment, the treatment group needs to be able to be compared fairly to a group that receives no treatment (control group). If an experiment is designed to test a new treatment then a control group can be a group that receives an existing or established treatment.
Randomisation: A randomising method should be used to allocate individuals to groups to try to ensure that all groups are similar in all characteristics apart from the treatment received. The larger the group sizes, the better the balancing of the characteristics, through randomisation, is likely to be.
Variability: A well-designed experiment attempts to minimise unnecessary variability. The use of random allocation of individuals to groups reduces variability among the groups, as do larger group sizes. Keeping experimental conditions as constant as possible also restricts variability.
Replication: For some experiments it may be appropriate to carry out repeated measurements. Taking repeated measurements of the response variable for each selected value of the explanatory variable is good experimental practice because it provides insight into the variability of the response variable.
Curriculum achievement objectives reference
Statistical investigation: Level 8
- Experimental distribution
The variation in the values of a variable obtained from the results of carrying out trials of a situation that involves elements of chance, a probability activity, or a statistical experiment.
For whole-number data, an experimental distribution is often displayed, in a table, as a set of values and their corresponding frequencies, or on an appropriate graph.
For measurement data, an experimental distribution is often displayed, in a table, as a set of intervals of values (class intervals) and their corresponding frequencies, or on an appropriate graph.
For category data, an experimental distribution is often displayed, in a table, as a set of categories and their corresponding frequencies, or on an appropriate graph.
Alternative: empirical distribution
See: sample distribution
Curriculum achievement objectives references
Probability: Levels 4, 5, 6, 7
- Explanatory variable
The variable, of the two variables in bivariate data, knowledge of which may provide information about the other variable, the response variable. Knowledge of the explanatory variable may be used to predict values of the response variable, or changes in the explanatory variable may be used to predict how the response variable will change.
If the bivariate data result from an experiment then the explanatory variable is the one whose values can be manipulated or selected by the experimenter.
In a scatter plot, as part of a linear regression analysis, the explanatory variable is placed on the x-axis (horizontal axis).
Alternatives: independent variable, input variable, predictor variable
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Exploratory data analysis
The process of identifying patterns and features within a data set by using a wide range of graphs and summary statistics. Exploratory data analysis usually starts with graphs and summary statistics of single variables and then extends to pairs of variables and further combinations of variables.
Exploratory data analysis is an essential part of the statistical enquiry cycle. It is important at the cleaning data stage because graphs may reveal data that need checking with regard to quality of the data set.
For data sets about populations exploratory data analysis will reveal important features of the population, and for data sets from samples it will reveal features of the sample which may suggest features in the population from which the sample was taken.
For bivariate numerical data exploratory data analysis will indicate whether it is appropriate to fit a linear regression model to the data.
For time-series data exploratory data analysis will indicate whether it is appropriate to fit an additive model to the time-series data.
Curriculum achievement objectives references
Statistical investigation: Levels (1), (2), (3), (4), (5), (6), 7, 8
- Exponent
See Powers.
- Exponential equations
An exponential equation is an equation in which the variable appears in an exponent (See Roots and Powers). So y = 3^x is an exponential equation because y is a function of x and the variable x appears as an exponent. Equations of the form a^x = a^n and a^x = b^x can be solved easily, while other equations may be solved by the use of logarithms. Example: 3^x = 9^(x-2). Since 9 = 3^2, this gives 3^x = 3^(2x-4), so x = 2x - 4 and therefore x = 4.
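The same example can also be solved with logarithms, which is the approach needed when the two sides cannot be written with a common base. This is a sketch only; `solve_power` is a hypothetical helper for equations of the form a^x = b with a > 0, a ≠ 1 and b > 0.

```python
import math


def solve_power(a, b):
    """Solve a^x = b for x using logarithms: x = log(b) / log(a)."""
    return math.log(b) / math.log(a)


# 3^x = 9^(x - 2): taking logarithms gives x*log(3) = (x - 2)*log(9),
# so x = 2*log(9) / (log(9) - log(3)).
x = 2 * math.log(9) / (math.log(9) - math.log(3))

# An equation with no convenient common base: 2^y = 10.
y = solve_power(2, 10)

print(x, y)
```

The first result agrees with x = 4 up to floating-point rounding.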
- Extrapolation
The process of estimating the value of one variable based on knowing the value of the other variable, where the known value is outside the range of values of that variable for the data on which the estimation is based.
Curriculum achievement objectives references
Statistical investigation: Levels 7, (8)
f
- Factor
An integer a is a factor of an integer b if a divides b. It is often helpful to state the number set that we are working in. For instance, if we operate within the set of natural numbers, then the factors of 6 are 1, 2, 3, and 6 since those are the natural numbers that divide 6. Similarly, the factors of 28 are 1, 2, 4, 7, 14, and 28. (See also Common factor)
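Working within the natural numbers, the factors of a number can be found by testing each candidate divisor in turn. The function name `factors` below is hypothetical, and the approach is a simple sketch rather than an efficient method.

```python
def factors(n):
    """Return the natural-number factors of a positive integer n, in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


print(factors(6))
print(factors(28))
```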
- Factorial
n! (pronounced n factorial) is equal to the product of the natural numbers from 1 to n. So for example, 6! = 1 x 2 x 3 x 4 x 5 x 6 = 720
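The definition translates directly into a running product. The helper `factorial_product` is a hypothetical name used for illustration; Python's standard library already provides `math.factorial`, which is used here as a cross-check.

```python
import math


def factorial_product(n):
    """Multiply the natural numbers from 1 to n."""
    result = 1
    for k in range(1, n + 1):
        result *= k
    return result


print(factorial_product(6), math.factorial(6))
```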
- Features (of distributions)
Distinctive parts of distributions which usually become apparent when the distribution is presented in a data display. The parts worthy of comment will depend on the type of display and how clearly the part stands out.
Curriculum achievement objectives references
Statistical investigation: Levels (2), (3), (4), (5), 6, (7), (8)
Statistical literacy: Levels 2, (3), (4), (5), (6), (7), (8)
- Features of simple data displays
At an early stage children can observe and identify features of simple data displays – features such as greatest frequency, least frequency, how spread out the data is, mode, ‘middle’, and unusual values such as outliers.
- Five-number summary
Five numbers that form a summary for the distribution of a numerical variable. The five numbers are: minimum value, lower quartile, median, upper quartile, maximum value. Together they convey quite a lot of information about the features of the distribution.
Example
The five-number summary for the weights of the 302 male students who answered an online questionnaire given to students enrolled in an introductory Statistics course at the University of Auckland is 51kg, 65kg, 72kg, 81kg, 140kg.
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
- Forecast
An assessment of the value of a variable at some future point of time, often based on an analysis of time-series data.
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Forward Counting Sequences
A counting sequence is an ordering of the counting numbers such that the difference between any two successive numbers is constant. The basic forward counting sequence is 1, 2, 3, 4, 5,…
An example of a forward skip counting sequence is 10, 20, 30, 40, 50, …, as is 2, 4, 6, 8, 10, … etc.
- Fraction
A fraction is a numeral of the form a/b where a and b are both integers and b≠0. If the fraction lies between –1 and 1 then the fraction is called a proper fraction. (e.g. 1/2, 3/5, -2/7 etc), otherwise it is called an improper fraction (e.g. 11/5, 257/17, -3/2 etc). In the example 2/7, the two is called the numerator, and the 7 is called the denominator. If the fraction has arisen from the part-whole concept then the whole has been divided into seven equal parts and the 2 represents the number of the parts. Fractions are useful for representing parts of a whole, that is, they are numerals that can be used when whole numbers cannot describe a certain number.
- Fraction, decimal and percentage conversions
Fractions, decimals and percentages can be used to represent numbers that are not integers, that is, they include parts of a whole. Consequently for each number written as a numeral in one of these three forms there is a corresponding numeral written in each of the other two forms. Initially, connections between these numerals should be made by exploration with equipment such as a linear model for decimals and fractions, a closed abacus or a hundreds grid for percentages and sets and regional models for fractions.
After the concepts of the conversions have been fully explored with equipment, converting a fraction to a decimal may be done abstractly by using the division property of a fraction, namely that a/b = a÷b. So, for example, 3/4 = 3÷4=0.75 etc.
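Once the conversions are understood, they can be checked mechanically. The sketch below uses Python's standard `fractions` module and applies the division property a/b = a÷b; the variable names are illustrative.

```python
from fractions import Fraction

three_quarters = Fraction(3, 4)

# Fraction to decimal: a/b = a ÷ b
as_decimal = three_quarters.numerator / three_quarters.denominator

# Decimal to percentage: multiply by 100
as_percentage = as_decimal * 100

# Decimal to fraction: 0.125 is 125 thousandths, which reduces automatically
from_decimal = Fraction("0.125")

print(as_decimal, as_percentage, from_decimal)
```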
Since a percentage represents a proportion out of 100, the decimal for 3/4 may be multiplied by 100 to obtain the percentage for 3/4. (Students should have already observed this from the use of equipment.) So 3/4 = 75%. Converting decimals or percentages to fractions involves using the definitions of the numerals. For example, converting 0.125 to a fraction simply involves writing the number as 125/1000. This can be reduced to 1/8, as 125/1000 and 1/8 are equivalent fractions. Fractions that should be commonly known as decimals and percentages include halves, thirds, quarters, fifths, eighths, tenths, twentieths, twenty-fifths, and fiftieths.
- Fractions, decimals and percentages of numbers
- Whole numbers: Students can use multiplication properties to answer simple problems involving fractions of whole numbers. For example, 1/4 of 28 can be answered by realising that 4 x 7 = 28. Therefore, 1/4 of 28 is 7.
Decimal fractions of whole numbers involve problems such as finding 0.5 of 8. Students can do this by knowing that 0.5 = 1/2 and by taking 1/2 of 8, or by multiplying 5 by 8 and realising that the answer is 4, not 40. Similarly, 0.8 of 9 must be 7.2 because 8 x 9 = 72 and the answer must be a bit less than 9. It can be confirmed by realising that 0.8 of 9 equals 8/10 of 9 and that 8/10 x 9 = 72/10 = 7.2.
Percentages of whole numbers may be found in the same way as decimals of whole numbers, using appropriate multiplicative strategies. For example, 80% of 9 is 7.2 because 9 x 80 = 720 and the answer must be a bit less than 9. Knowing that 10% = 1/10 can be helpful. For example, 35% of 40 is 4+4+4+2 since 10% of 40 is 4 and 5% is half of 10%. So 35% of 40 is 14.
Again, results can be confirmed by converting to fractions and using fraction multiplication.
- Fractions: Students can find fractions of simple fractions by using models such as a grid. For example, to find 1/4 of 3/5, a 4 by 5 grid can be used with the quarter being shaded in one direction and the three-fifths in the other direction. It is then obvious that 1/4 of 3/5 is 3/20. Folding a rectangular sheet of paper and shading in the fractions in each direction is also a good approach.
Decimal fractions of simple fractions involve finding answers to questions such as: "What is 0.25 of 1/3?" Students can answer this by using their knowledge of decimals. Then this problem becomes: "What is 1/4 of 1/3?" and can be answered using a grid (See Fractions above).
Percentages of simple fractions involve finding answers to questions such as: "What is 25% of 1/3?" Again students can use their knowledge of percentages and convert the problem to a fraction of a fraction as above.
- Percentages: Finding percentages of decimals involves finding answers to questions such as: "What is 40% of 0.4?" Students might reason this as 10% of 0.4 is 0.04 so 40% of 0.4 must be 0.16. Or they might reason that 40% = 0.4 and since 4 x 4 = 16 the answer must be 0.16 because it is a little less than half of 0.4. Alternatively they might reason that 40% of 0.4 is equal to 4/10 x 4/10 and by fraction multiplication arrive at 16/100, which is 0.16.
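Each of the worked examples in this entry can be confirmed by converting to fractions and multiplying, as suggested above. The sketch below does this with Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

# "Of" is treated as multiplication throughout.
quarter_of_28 = Fraction(1, 4) * 28
point_eight_of_9 = Fraction("0.8") * 9
thirty_five_percent_of_40 = Fraction(35, 100) * 40
quarter_of_three_fifths = Fraction(1, 4) * Fraction(3, 5)
forty_percent_of_point_four = Fraction(40, 100) * Fraction("0.4")

print(quarter_of_28, float(point_eight_of_9), thirty_five_percent_of_40)
print(quarter_of_three_fifths, float(forty_percent_of_point_four))
```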
- Frequency
For a whole-number variable in a data set, the number of times a value occurs.
For a measurement variable in a data set, the number of occurrences in a class interval.
For a category variable in a data set, the number of occurrences in a category.
See: relative frequency, tally chart
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Frequency distribution
For whole-number data, a set of values and their corresponding frequencies displayed in a table, or the set of values displayed on an appropriate graph.
For measurement data, a set of intervals of values (class intervals) and their corresponding frequencies displayed in a table, or the set of values displayed on an appropriate graph.
For category data, a set of categories and their corresponding frequencies displayed in a table or graph.
See: experimental distribution, frequency table, sample distribution
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
Probability: Levels (4), (5), (6), (7), (8)
- Frequency table
Any table that displays the frequencies of values of one or more variables in a data set.
For a whole-number variable in a data set, a table showing each value of the variable and its corresponding frequency.
For a measurement variable in a data set, a table showing a set of class intervals for the variable and the corresponding frequency for each interval.
For a category variable in a data set, a table showing each category of the variable and its corresponding frequency.
A frequency table will often have an extra column that shows the percentages that fall in each value, class interval or category.
See: one-way table, two-way table
Example
The number of days in a week that rain fell in Grey Lynn, Auckland, from Monday 2 January 2006 to Sunday 31 December 2006 is recorded in the frequency table below.

| Number of days with rain | Number of weeks |
|---|---|
| 0 | 2 |
| 1 | 5 |
| 2 | 5 |
| 3 | 5 |
| 4 | 19 |
| 5 | 6 |
| 6 | 6 |
| 7 | 4 |
| Total | 52 |

A frequency table with more information is shown below.
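| Number of days with rain, x | Number of weeks | Percentage | Percentage of weeks with x or fewer days of rain |
|---|---|---|---|
| 0 | 2 | 3.8% | 3.8% |
| 1 | 5 | 9.6% | 13.5% |
| 2 | 5 | 9.6% | 23.1% |
| 3 | 5 | 9.6% | 32.7% |
| 4 | 19 | 36.5% | 69.2% |
| 5 | 6 | 11.5% | 80.8% |
| 6 | 6 | 11.5% | 92.3% |
| 7 | 4 | 7.7% | 100.0% |
| Total | 52 | | |
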
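The percentage columns can be derived from the frequencies alone. The following sketch assumes the counts are held in a list indexed by the number of days with rain; the variable names are illustrative.

```python
from itertools import accumulate

# Number of weeks with 0, 1, ..., 7 days of rain
weeks_by_rain_days = [2, 5, 5, 5, 19, 6, 6, 4]
total_weeks = sum(weeks_by_rain_days)

# Percentage of weeks for each value, to one decimal place
percentages = [round(100 * n / total_weeks, 1) for n in weeks_by_rain_days]

# Percentage of weeks with x or fewer days of rain (running totals)
cumulative = [round(100 * c / total_weeks, 1) for c in accumulate(weeks_by_rain_days)]

for days, (n, pct, cum) in enumerate(zip(weeks_by_rain_days, percentages, cumulative)):
    print(days, n, f"{pct}%", f"{cum}%")
```

The cumulative column is calculated from the running totals of the frequencies rather than by adding the rounded percentages, which avoids accumulating rounding error.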
Curriculum achievement objectives references
Statistical investigation: Levels (3), (4), (5), (6), (7), (8)
Probability: Levels (3), (4), (5), (6)
- Function
A function consists of two things:
- A set of elements called the domain, and another set of elements called the range.
- A rule for associating each element of the domain with exactly one element of the range.
The domain is often called the set of values of the independent variable, and the range is the set of values of the dependent variable. A typical value in the domain is usually denoted x, and is called the independent variable; and a typical value in the range is denoted y and is called the dependent variable.
g
- Generalising
Mathematics is considered to be a deductive science, working from proven general results (theorems etc.) to the specific. However the best practice in teaching new concepts to children is to work inductively from specific discoveries and observations of patterns to generalised conclusions. This is referred to as generalising, or using inductive reasoning.
For example, 1+3=4, 3+5=8, 7+9=16, 13+5=18. From these results we might generalise that the sum of two odd numbers is an even number.
- Geometric properties
Geometric properties by which children can sort geometric shapes and objects include number of vertices, number of edges, number of faces, types of face, symmetry, curvature, thickness, dimension, size of vertex angle etc.
- Geometric sequences
A geometric sequence is a sequence of the form: \(a, ar, ar^2, ar^3, ar^4, \ldots, ar^{n-1}, \ldots\)
For example, 1, 2, 4, 8, 16, 32, 64, … is a geometric sequence in which r, the common ratio, is 2, and a, the first term, is 1.
The sum (\(S_n\)) of the finite geometric series \(a + ar + ar^2 + ar^3 + ar^4 + \ldots + ar^{n-1}\) consisting of n terms is:
\(S_n = \dfrac{a(1 - r^n)}{1 - r}\), where a is the first term and r is the common ratio (r ≠ 1).
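As a quick check of the finite-sum formula, the sketch below (Python, with the illustrative function name `geometric_sum`, which is an assumption and not from the glossary) adds the first seven terms of the example sequence 1, 2, 4, 8, … directly and compares the result with the closed form.

```python
def geometric_sum(a, r, n):
    """Sum of the n terms a + ar + ... + ar^(n-1), using the closed form."""
    if r == 1:
        return a * n  # the closed form would divide by zero when r = 1
    return a * (1 - r**n) / (1 - r)


terms = [1 * 2**k for k in range(7)]  # 1, 2, 4, 8, 16, 32, 64
direct = sum(terms)                   # add the terms one by one
closed = geometric_sum(1, 2, 7)       # a = 1, r = 2, n = 7
print(direct, closed)
```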
If |r| < 1 then the sum to infinity of the geometric series \(a + ar + ar^2 + ar^3 + ar^4 + \ldots + ar^{n-1} + \ldots\) is \(S = \dfrac{a}{1 - r}\), where |r| is the absolute value of r.
For example, 1/2 + 1/4 + 1/8 + 1/16 + … is a geometric series with a = 1/2 and r = 1/2.
So \(S = \tfrac{1}{2} \div \left(1 - \tfrac{1}{2}\right) = \tfrac{1}{2} \div \tfrac{1}{2} = 1\)
- Glide
A glide (also called a glide reflection) is a transformation in the plane that is the composition of a reflection in a mirror line followed by a translation parallel to the mirror line. It is a necessary addition to the isometries if we wish to explain all isometric symmetry movements in terms of one isometry.
- Gradient
- Graphs, tables and rules
A graph is a visual representation of data or of a relationship of some kind. In its simplest form a graph is any set of points in a plane. Different types of graph best serve different purposes. For comments on statistical data graphs see Data displays. Linear (see Linear equations) and non-linear relationships found in number patterns are often displayed on the Cartesian plane (see Coordinate systems).
Some relationships might also be shown quite well by way of a table, and relationships can be described by a rule.
For example, the diagram above shows a tiling pattern of concrete slabs laid around central grassed areas. The number of slabs needed is given by the sequence 8, 12, 16,…
This could be shown by way of a table:
Side length of grass square | 1 | 2 | 3 | 4 | 5 | 6 | …
No. of slabs | 8 | 12 | 16 | 20 | 24 | 28 | …
Or a graph:
Or a rule:
The number of slabs needed is four times the number of the term plus 4. This could be expressed algebraically as s = 4n + 4, where s is the number of slabs and n is the term number (here, the side length of the grass square).
- Graphs, tables and rules for simple quadratic relationships
As shown above for linear relationships, graphs, tables and rules can be developed for simple quadratic equations. Consider the pattern of balls set up in the triangular arrangements shown below:
We could draw up the following table for the number of balls in each term:
n | 1 | 2 | 3 | 4 | 5 | … | n
Term n: T(n) | 1 | 3 | 6 | 10 | 15 | … |
Looking at the table, students will see that to get the nth term, you add n to the (n-1)th term. So T(n) = T(n-1)+n.
That is helpful if you already know the (n-1)th term. Closer observation should lead to the discovery that the number of balls in each term is half the product of the term number and the next term number. This leads to the relation:
\(T(n) = \dfrac{n(n+1)}{2} = \tfrac{1}{2}n^2 + \tfrac{1}{2}n\)
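The step-by-step rule T(n) = T(n-1) + n and the closed form n(n+1)/2 can be compared directly. The sketch below is illustrative only; the function names `t_by_adding` and `t_closed` are assumptions introduced for this example.

```python
def t_by_adding(n):
    """Build T(n) from T(1) = 1 by repeatedly applying T(k) = T(k-1) + k."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def t_closed(n):
    """Closed form: half the product of n and n + 1."""
    return n * (n + 1) // 2


first_five = [t_closed(n) for n in range(1, 6)]
print(first_five)
print(all(t_by_adding(n) == t_closed(n) for n in range(1, 101)))
```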
The relation T(n) = n(n+1)/2 is a polynomial equation of degree two, since the highest power to which the variable n is raised is two. Its graph will be a part of a parabola. Polynomial equations of degree two are also referred to as quadratic equations.
- Greatest common factor
See Common factor.
- Grid references
(See also Coordinate systems). Grid references on a map are similar to coordinates in the Cartesian plane. For the major New Zealand topographical map series (NZMS 260) the grid references are given as a six-digit number, the first three digits being the distance east (in kilometres to one decimal place) and the second three digits being the distance north (in kilometres to one decimal place) from a starting origin. Simple grid systems can be added to maps for children by labelling (with letters or numbers) the bottom horizontal border of the map and the left-hand vertical border. Children can then read the coordinates of the map by reading the horizontal reference first followed by the vertical reference, thus simulating the coordinate system of the Cartesian plane. Activities can then include things such as a coordinate journey through the map describing the view according to the features shown on the map.
- Grouping
Putting sets of objects into groups, usually according to some attribute, e.g. colour, size, shape etc.
h
- Half turn
A rotation through 180°, or half of a complete rotation.
- Hectare
See SI measurement units – Area.
- Histogram
A graph for displaying the distribution of a measurement variable consisting of vertical rectangles, drawn for each class interval, whose area represents the relative frequency for values in that class interval.
To aid interpretation, it is desirable to have equal-width class intervals so that the height of each rectangle represents the frequency (or relative frequency) for values in each class interval.
Histograms are particularly useful when the number of values to be plotted is large.
Example
The number of hours of sunshine per week in Grey Lynn, Auckland, from Monday 2 January 2006 to Sunday 31 December 2006 is displayed in the histogram below.
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
i
- Independence (in situations that involve elements of chance)
The property that an outcome of one trial of a situation involving elements of chance or a probability activity has no effect or influence on an outcome of any other trial of that situation or activity.
Curriculum achievement objectives references
Probability: Levels 4, (5), (6), (7), (8)
- Independent events
Events that have no influence on each other.
Two events are independent if the fact that one of the events has occurred has no influence on the probability of the other event occurring.
If events A and B are independent then:
P(A | B) = P(A)
P(B | A) = P(B)
P(A and B) = P(A) P(B), where P(E) represents the probability of event E occurring.
For two events A and B:
If P(A | B) = P(A) then A and B are independent events
If P(B | A) = P(B) then A and B are independent events
If P(A and B) = P(A) P(B) then A and B are independent events.
Curriculum achievement objectives reference
Probability: Level 8
- Independent variable
A common alternative term for the explanatory variable in bivariate data.
Alternatives: explanatory variable, input variable, predictor variable
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Index number
A number showing the size of a quantity relative to its size at a chosen period, called the base period.
The price index for a certain ‘basket’ of shares, goods or services aims to show how the price has changed while the quantities in the basket remain fixed. The index at the base period is a convenient number such as 100 (or 1000). An index greater than 100 (or 1000) at a later time period indicates that the basket has increased in value or price relative to that at the base period.
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Inference
Curriculum achievement objectives references
Statistical investigation: Levels 6, 7, 8
- Integers
The set of integers is an extension of the set of whole numbers to include the negatives of the whole numbers. Thus it is {…, -3, -2, -1, 0, 1, 2, 3, …}. The integers give us answers to questions such as: "What can I add to 5 to get 3?"
- Integration
Integration is the reverse process of differentiation. The task with integration of a function f(x) is to find a function F(x) such that the derivative of F(x) is f(x) for all x in the domain of f. That is, F(x) is an integral of f(x) if and only if F′(x) = f(x) for all x in the domain of f. Then we write \(F(x) = \int f(x)\,dx\), which reads as "F(x) is the integral of f(x) with respect to x". For example:
If \(f(x) = 3x^2 + 2x + 5\) then \(F(x) = \int f(x)\,dx = x^3 + x^2 + 5x + C\), where C is an arbitrary constant.
Such an integral is known as an indefinite integral, or an antiderivative. Differentiation of F(x) gives f(x).
An integral of f(x) that is evaluated between certain limit values a and b of x is a definite integral.
Geometrically, a definite integral can be thought of as the area contained between the graph of a function and the x-axis and between the lines x=a and x=b. Areas located above the x-axis count as positive in integration, areas below the x-axis as negative.
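The area interpretation can be illustrated numerically. The sketch below is an illustration only, not part of the glossary: it approximates the definite integral of the example function f(x) = 3x² + 2x + 5 between a = 0 and b = 2 by summing thin trapezoids, and compares the result with F(b) − F(a), the standard way of evaluating a definite integral from an antiderivative. The limits 0 and 2, the step count and the function names are assumptions chosen for the example.

```python
def f(x):
    return 3 * x**2 + 2 * x + 5


def F(x):
    # An antiderivative of f, taking the arbitrary constant C as 0.
    return x**3 + x**2 + 5 * x


def trapezoid(func, a, b, steps=1000):
    """Approximate the area under func between a and b using trapezoids."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


a, b = 0.0, 2.0
exact = F(b) - F(a)          # 8 + 4 + 10 = 22
approx = trapezoid(f, a, b)  # close to 22, slightly above for this curve
print(exact, approx)
```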
If the indefinite integral cannot be found, it may still be possible to find the numerical value of the definite integral using numerical integration methods such as the trapezoidal rule and Simpson’s rule. This is also referred to as approximate integration.
- Interpolation
The process of estimating the value of one variable based on knowing the value of the other variable, where the known value is within the range of values of that variable for the data on which the estimation is based.
Curriculum achievement objectives references
Statistical investigation: Levels 7, (8)
- Interquartile range
A measure of spread for a distribution of a numerical variable which is the width of an interval that contains the middle 50% (approximately) of the values in the distribution. It is calculated as the difference between the upper quartile and lower quartile of a distribution.
It is recommended that, for small data sets, this measure of spread is calculated by sorting the values into order or displaying them on a suitable plot and then counting values to find the quartiles, and to use software for large data sets.
The interquartile range is a stable measure of spread in that it is not influenced by unusually large or unusually small values. The interquartile range is more useful as a measure of spread than the range because of this stability. It is recommended that a graph of the distribution is used to check the appropriateness of the interquartile range as a measure of spread and to emphasise its meaning as a feature of the distribution.
Example
The maximum temperatures, in degrees Celsius (°C), in Rolleston for the first 10 days in November 2008 were: 18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6, 18.8
Ordered values: 17.8, 17.8, 18.1, 18.6, 18.7, 18.8, 19.4, 19.6, 19.9, 20.6
The median is the mean of the two central values, 18.7 and 18.8. Median = 18.75°C
The values in the ‘lower half’ are 17.8, 17.8, 18.1, 18.6, 18.7. Their median is 18.1. The lower quartile is 18.1°C.
The values in the ‘upper half’ are 18.8, 19.4, 19.6, 19.9, 20.6. Their median is 19.6. The upper quartile is 19.6°C.
The interquartile range is 19.6°C – 18.1°C = 1.5°C
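The counting method used in this example can be sketched in Python as follows. This is an illustration only: the function name `quartiles_by_halves` is an assumption, and for an odd number of values the sketch leaves the middle value out of both halves, which is one common convention that the glossary itself does not specify.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (lower quartile, median, upper quartile) using the medians of the halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper_half = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower_half), median(ordered), median(upper_half)


temps = [18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6, 18.8]
lq, med, uq = quartiles_by_halves(temps)
iqr = round(uq - lq, 1)  # rounded to remove floating-point noise
print(lq, uq, iqr)
```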
The data and the interquartile range are displayed on the dot plot below.
See: lower quartile, measure of spread, quartiles, upper quartile
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
- Interval estimate
A range of numbers, calculated from a random sample taken from the population, of which any number in the range is a possible value for a population parameter.
Example
A 95% confidence interval for a population mean is an interval estimate.
See: estimate
Curriculum achievement objectives reference
Statistical investigation: Level 8
- Invariant
Unaltered or unchanged.
- Invariant properties of transformations
A point is invariant under a transformation in the plane or in space if the transformation leaves it unaltered. A rotation in the plane has one invariant point (the centre of rotation) and a rotation in space has a line of invariant points (the axis of rotation). A reflection in the plane has a line of invariant points (the mirror line or line of reflective symmetry) and a reflection in space has a plane of invariant points (the plane of reflective symmetry). A translation in the plane or in space has no invariant points, since by the definition of translation all points move the same distance. An enlargement has one invariant point (the centre of enlargement).
- Inverse of a function
Suppose we have a function y = f(x) which maps values of x to values of y. If there is a function g(y) that maps the values of y back to values of x (i.e. g(y) = x) then g is said to be the inverse of f. The inverse of f is written as \(f^{-1}\).
So \(g = f^{-1}\)
So we have \(f^{-1}[f(x)] = x\)
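A small numerical illustration, using a hypothetical function chosen for this example and not taken from the glossary: if f(x) = 2x + 3, then undoing the steps in reverse order gives the inverse (y − 3)/2, and applying it after f returns the starting value.

```python
def f(x):
    # Hypothetical example function: double, then add 3.
    return 2 * x + 3


def f_inverse(y):
    # Undo the steps of f in reverse order: subtract 3, then halve.
    return (y - 3) / 2


round_trips = [(x, f_inverse(f(x))) for x in [-2, 0, 1.5, 10]]
for x, back in round_trips:
    print(x, f(x), back)
```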
To have an inverse, f must be one-to-one, that is, each value in the range of f must be associated with just one value in the domain of f. The domain of \(f^{-1}\) is the range of f, and the range of \(f^{-1}\) is the domain of f.
- Investigation
See: statistical investigation
Curriculum achievement objectives references
Statistical investigation: All levels
Statistical literacy: Levels 1, 2, 3, 4, 5
- Irrational numbers
Real numbers that have no fractional form (that is, cannot be written in the form a/b where a and b are integers) are called irrational numbers. For example, √2 is an irrational number and therefore, like all irrational numbers:
- Has no fractional representation, and
- Its decimal form neither terminates nor repeats.
To 50 decimal places:
√2 = 1.41421356237309504880168872420969807856967187537694
- Irregular component (for time-series data)
The other variations in time-series data that are not identified as part of the trend component, cyclical component or seasonal component. They mostly consist of variations that don’t have a clear pattern.
Alternative: random error component
See: time-series data
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Isometric transformation
An isometric transformation (or isometry) is a transformation (movement) in the plane or in space that preserves distances, and therefore both shape and size. The isometric transformations are reflection, rotation and translation and combinations of them such as the glide, which is the combination of a translation and a reflection.
l
- Language of direction and distance
- Least common multiple
See Common multiple.
- Least-squares regression line
The most common method of choosing the line that best summarises the linear relationship (or linear trend) between the two variables in a linear regression analysis, from the bivariate data collected.
Of the many lines that could usefully summarise the linear relationship, the least-squares regression line is the one line with the smallest sum of the squares of the residuals.
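To make the "smallest sum of the squares of the residuals" idea concrete, the sketch below fits a line to a small made-up data set and shows that nudging the slope or intercept increases the sum of squared residuals. The data values and function names are hypothetical, and the slope and intercept formulas used are the standard closed-form expressions for simple linear regression, which the glossary does not state. The sketch also checks that the residuals of the fitted line add to zero and that the line passes through the point of means.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the data and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical bivariate data, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)
nudged_slope = sum_squared_residuals(xs, ys, slope + 0.1, intercept)
nudged_intercept = sum_squared_residuals(xs, ys, slope, intercept + 0.1)
residual_total = sum(y - (intercept + slope * x) for x, y in zip(xs, ys))
mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)

print(slope, intercept)
print(best < nudged_slope, best < nudged_intercept)
```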
Two other properties of the least-squares regression line are:
1. The sum of the residuals is zero.
2. The point with x-coordinate equal to the mean of the x-coordinates of the observations and with y-coordinate equal to the mean of the y-coordinates of the observations is always on the least-squares regression line.
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Length
Length is the concept of distance in a straight line between two points on a line, in a plane, or in space. It is a measure of one dimension, measuring the size of a line. The basic SI unit of measurement of length is the metre, with millimetre, centimetre and kilometre also being commonly used units. (See SI measurement units)
- Likelihood
The notion of an outcome being probable. Likelihood is sometimes used as a simpler alternative to probability.
In a situation involving elements of chance, equal likelihoods mean that, for any trial, each outcome has the same chance of occurring.
Similarly, different likelihoods mean that, for any trial, not all of the outcomes have the same chance of occurring.
Curriculum achievement objectives reference
Probability: Level 2
- Limit of a function
Consider the function \(f(x) = x^2\). As x tends towards 2, \(x^2\) tends towards 4: closeness of x to 2 forces closeness of \(x^2\) to 4. So the limit of f(x) as x tends to 2 is 4. This is written as:
\(\lim_{x \to 2} f(x) = 4\)
As x tends to 2 from the left (that is, x is approaching 2 and is less than 2), f(x) tends to 4, and as x tends to 2 from the right (that is, x is approaching 2 and is greater than 2), f(x) tends to 4. The limit as x tends to a of f(x) exists if and only if the limit as x tends to a from the left exists and the limit as x tends to a from the right exists and both of these limits are equal.
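The idea that closeness of x to 2 forces closeness of x² to 4 can be seen numerically. The sketch below is a numerical illustration rather than a proof; the step sizes are arbitrary choices made for this example. It evaluates f just to the left and just to the right of 2 and records how far each value is from 4.

```python
def f(x):
    return x**2


gaps = []
for step in [0.1, 0.01, 0.001, 0.0001]:
    left = f(2 - step)    # approaching 2 from the left
    right = f(2 + step)   # approaching 2 from the right
    gaps.append((abs(left - 4), abs(right - 4)))
    print(step, left, right)
```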
The explanation of the limit as implying closeness is illustrated by the function \(f(x) = \dfrac{x^2 - 1}{x - 1}\), x ≠ 1. So f(x) = x + 1 for all values of x other than x = 1, and its graph is the graph of f(x) = x + 1 with a hole at x = 1. Although f(x) is undefined at x = 1, we can still say that the limit of f(x) as x tends to 1 is 2, since closeness of x to 1 forces closeness of f(x) to 2. Put another way, f(x) can be made as close to 2 as we wish simply by choosing a value of x sufficiently close to 1.
- Line graph
A graph, often used for displaying time-series data, in which a series of points representing individual observations are connected by line segments.
Line graphs are useful for showing changes in a variable over time.
Example
Daily sales, in thousands of dollars, for a hardware store were recorded for 28 days. These data are displayed on the line graph below.
Curriculum achievement objectives references
Statistical investigation: Levels (3), (4), (5), (6), (7), (8)
- Linear equations
The simplest linear equation is one which expresses a direct linear relationship between two variables. For example, if sticky bars cost $3.00 each then the cost (c) in dollars of buying n of them is given by the equation c = 3n. This is called a first degree or linear equation. If the associated values of c and n are graphed on a coordinate plane, the points that satisfy this equation will lie on a straight line. Suppose there is a packaging fee of $2.00. Then c = 3n + 2. We could now ask questions such as: "If the cost was $23.00, how many sticky bars did we buy?" Note that in a linear equation the variables (such as c and n above) are not raised to higher powers such as squares or cubes.
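The question about a $23.00 purchase can be answered by undoing the equation c = 3n + 2 one step at a time. A minimal sketch follows; the function names `cost` and `bars_bought` are assumptions introduced for this example.

```python
def cost(n, price=3, fee=2):
    """Cost in dollars of n sticky bars plus the packaging fee: c = 3n + 2."""
    return price * n + fee


def bars_bought(total_cost, price=3, fee=2):
    """Solve c = 3n + 2 for n: subtract the fee, then divide by the price."""
    return (total_cost - fee) / price


n = bars_bought(23)
print(n, cost(7))
```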
In general, a linear equation is an equation of the form y = ax + b where a and b are real numbers. The solution, or root, of the equation ax + b = 0 (with a ≠ 0) is x = -b/a.
- Linear inequalities
A linear inequality is an expression of the form ax + b > c. The relations >, <, ≥ and ≤ give rise to inequalities.
2x + 3 > 5 is a linear inequality because x is raised only to the power of 1. To solve this inequality we can use the rules of algebra, with a little caution.
2x + 3 > 5
Adding -3 to both sides we obtain 2x > 2
Dividing both sides by 2 we obtain x > 1
So the solution is the set of all real numbers greater than 1.
Caution has to be exercised in handling inequalities. Two things in particular are worth mentioning:
- Changing the unknown to the other side of the inequality changes the sign. For example: If 2 < x then x > 2.
- Multiplying both sides of the inequality by a negative number also changes the sign. For example: If -x > 3 then x < -3.
An examination of the number line helps with an understanding of these two properties.
- Linear programming
A problem of linear programming is that of finding nonnegative values of a number of variables for which a certain linear function of those variables assumes the greatest (or the least) possible value while subject to certain linear constraints.
Graphing techniques can often be used to find the maximum or minimum values.
- Linear proportion
Any situation that can be modelled using the equation a/b = c/d involves a linear proportion. Reasoning with linear proportions involves part-whole relationships (equivalent fractions), operating on fractions, measurement, rates and ratios, and division with remainders. It also involves competence with connecting fractions, decimals and percentages and using graphs to solve problems. Linear proportions apply in a wide range of contexts including trigonometry, probability, metric measurement conversions, calculating best deals, and physical rates such as speed. (See also rate and ratio)
- Linear regression of bivariate data
A form of statistical analysis that uses bivariate data (where both are numerical variables) to examine how knowledge of one of the variables (the explanatory variable) provides information about the values of the other variable (the response variable). The roles of the explanatory and response variables are therefore different.
When the bivariate numerical data are displayed on a scatter plot, the relationship between the two variables becomes visible. Linear regression fits a straight line to the data that is added to the scatter plot. The fitted line helps to show whether or not a linear regression model is a good fit to the data.
If a linear regression model is appropriate then the fitted line (regression line) is used to predict a value of the response variable for a given value of the explanatory variable and to describe how the values of the response variable change, on average, as the values of the explanatory variable change.
An appropriately fitted linear regression model estimates the true, but unknown, linear relationship between the two variables and the underlying system the data was taken from is regarded as having two components: trend (the general linear tendency) and scatter (variation from the trend).
Note: Linear regression can be used when there is more than one explanatory variable, but at Level Eight only one explanatory variable is used. When there is one explanatory variable the method is called simple linear regression.
Curriculum achievement objectives reference
Statistical investigation: Level 8
- Linear scales
Drawings, maps and models are scale representations and that scale is a linear (adjective of line) scale if the linear dimensions of the scale representation are in direct proportion to the linear measurements of the region being represented.
For example, a map of a playground might be drawn to a scale of one centimetre to one metre (that is, 1cm on the map represents 1m on the ground). As a ratio, this is a scale of 1:100, so every distance on the map is one one-hundredth of the distance it represents on the ground.
- Location
Location refers to the position of an object (or a point) on a line, on a plane or in space. The location of a point on a line can be defined by its distance from a fixed origin. The location of a point in a plane can be described by an ordered pair, and the location of a point in space can be described by an ordered triple. (See Coordinate system) Location can also be described by direction and distance, such as a compass bearing and a distance from a fixed point.
- Locus (plural loci)
A locus is a geometric figure for which all points satisfy a given condition. It is the set of points and only those points that satisfy the condition. So the locus of a point that is three centimetres from a given point P is a circle of radius 3cm whose centre is at P. The locus of a point that is equidistant from two given intersecting lines is the bisector of the angles formed by the lines. The locus of a point equidistant from two given parallel lines is a line parallel to the two lines and midway between them.
- Log modelling
In curve fitting processes it is helpful to obtain scatter diagrams of transformed variables in order to decide which type of curve should be used. For example, if a scatter diagram of log y against x shows a linear relationship then the equation has the form \(y = ab^x\), or \(\log y = \log a + (\log b)x\), while if log y versus log x shows a linear relationship the equation has the form \(y = ax^b\), or \(\log y = \log a + b \log x\). In order to see these relationships, special graph paper for which one or both scales are calibrated logarithmically can be used. These are referred to as semi-log or log-log graph paper respectively.
- Logarithmic algebraic expressions
The logarithm function is the inverse of the exponential function. The logarithm function is defined, when b is positive and b ≠ 1, as \(y = \log_b x\). y is called the logarithm of x to the base b. If y is the logarithm of x to the base b, then \(b^y = x\). For example:
\(10^2 = 100\), so \(\log_{10} 100 = 2\)
\(10^3 = 1{,}000\), so \(\log_{10} 1{,}000 = 3\)
\(10^5 = 100{,}000\), so \(\log_{10} 100{,}000 = 5\)
From this we can see that the two fundamental properties of the logarithm can be derived from the corresponding laws of exponents (See Powers). They are:
(i) The logarithm of a product is equal to the sum of the logarithms of the factors.
So \(\log_b xy = \log_b x + \log_b y\)
(ii) The logarithm of a quotient is equal to the logarithm of the numerator minus the logarithm of the denominator.
So \(\log_b \frac{x}{y} = \log_b x - \log_b y\)
From (i) above it follows that the logarithm of the power of a number is equal to the power times the logarithm of the number. That is:
\(\log_b x^p = p \log_b x\)
The logarithm of 1 is 0 since \(b^0 = 1\)
There is a number e called the base of the natural logarithm. Equations involving growth and decay are best written in terms of e, and logarithms to the base e are called natural logarithms. e is an irrational number, its decimal expansion to 10 places of decimals being 2.7182818285.
It can be seen how this number occurs naturally in situations of growth and decay. For example, the formula for compounding interest is \(A_n = P(1 + r)^n\). Suppose we set interest at 100%. Then r = 1. After one year, \(A_1 = P(1 + 1) = 2P\)
If we decide to modify the system of accumulation to 50% paid twice a year we have \(A_1 = P\left(1 + \tfrac{1}{2}\right)^2\)
If we decide to modify the system of accumulation to 25% paid four times a year we have \(A_1 = P\left(1 + \tfrac{1}{4}\right)^4\)
Continuing in this way we can see that as we move towards continuous growth, that is as n tends to infinity with r = 1/n, the formula for the amount after one year becomes:
\(A_1 = \lim_{n \to \infty} P\left(1 + \tfrac{1}{n}\right)^n\)
The limit as n tends to infinity of \(\left(1 + \tfrac{1}{n}\right)^n\) is called e.
To ten decimal places e = 2.7182818285. Putting some values for n into the expression \(\left(1 + \tfrac{1}{n}\right)^n\) shows this to be a ‘reasonable’ value.
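Substituting values of n can be sketched as follows. The particular values of n and the function name `compound_factor` are choices made for this illustration; the sketch simply evaluates (1 + 1/n)^n for increasingly frequent compounding and compares the results with Python's built-in value of e.

```python
import math


def compound_factor(n):
    """Growth factor after one year at 100% interest, compounded n times."""
    return (1 + 1 / n) ** n


for n in [1, 2, 4, 12, 365, 1_000_000]:
    print(n, compound_factor(n))

print(math.e)
```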
Growth (or decay) functions where the growth (or decay) is continuous can be written in the form \(y = Ae^{kt}\) where A is the initial size, t is the time and k is the nominal rate of growth. Compare this with the compound interest formula: Suppose we invest 100 dollars for two years at a rate of 10% paid annually. Then \(A_2 = 100(1 + 0.1)^2 = 121\).
If we invested 100 dollars for two years at a rate of 10% per annum paid continuously we would have: \(A_2 = 100e^{0.2} = 122.14\).
- Lower quartile
See: quartiles
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
m
- Maps
A map is a scale drawing, a projection of a portion of our three-dimensional world onto a plane surface such as a piece of paper. Maps can be simple, such as scale drawings of the classroom, or more complex, involving compass directions and coordinate grid systems.
- Margins of error
A number calculated from a random sample that estimates the likely size of the sampling error in an estimate of a population parameter.
The margin of error is added to and subtracted from a point estimate of a population parameter to form a confidence interval for a population parameter, usually with a confidence level of 95%. The margin of error is therefore half of the width of a confidence interval.
Generally a larger sample size will give a smaller margin of error.
The higher the confidence level, the greater the margin of error.
Curriculum achievement objectives references
Statistical investigation: Level 8
Statistical literacy: Level 8
- Mass
The mass of an object is a measure of the amount of matter in it. It represents in a quantitative way that property of matter that is described qualitatively as inertia. While an object on the Moon would have about one-sixth of the weight that it would have on Earth, its mass would be unchanged. The common SI units of mass are the kilogram (kg), milligram (mg), gram (g) and tonne.
- Mean
A measure of centre for a distribution of a numerical variable. The mean is the centre of mass of the values in a distribution and is calculated by adding the values and then dividing this total by the number of values.
For large data sets it is recommended that a calculator or software is used to calculate the mean.
The mean can be influenced by unusually large or unusually small values. It is recommended that a graph of the distribution is used to check the appropriateness of the mean as a measure of centre and to emphasise its meaning as a feature of the distribution.
Example
The maximum temperatures, in degrees Celsius (°C), in Rolleston for the first 10 days in November 2008 were: 18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6, 18.8
The mean maximum temperature over these 10 days is 18.93°C.
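The calculation behind this figure is the total of the ten values divided by 10, that is 189.3 ÷ 10. A short sketch using Python's standard `statistics` module, with variable names chosen for illustration:

```python
from statistics import mean

temps = [18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6, 18.8]

total = round(sum(temps), 1)       # rounded to remove floating-point noise
mean_temp = round(mean(temps), 2)  # add the values, divide by how many there are
print(total, len(temps), mean_temp)
```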
The data and the mean are displayed on the dot plot below.

Alternative: arithmetic mean
See: measure of centre, population mean, sample mean
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), (7), 8
- Measure
An amount or quantity that is determined by measurement or calculation. The term ‘measure’ is used in two different ways in the Curriculum.
One use is in the terms measure of centre, measure of spread and measure of proportion, where these measures are calculated quantities that represent characteristics of a distribution. The use of “using displays and measures” in the Level Six (statistical investigation thread) achievement objective is a reference to measures of centre, spread and proportion.
The other use applies to a statistical investigation. The investigator decides on a subject of interest and then decides the aspects of it that can be observed. These aspects are the ‘measures’.
Example
An investigator decides that ‘well-being’ is a subject of interest and chooses ‘happiness’ to be one aspect of well-being. Happiness could be measured by the variable ‘number of times a person laughs in a day, on average’.
Curriculum achievement objectives references
Statistical investigation: Levels 5, 6, 7, (8)
Statistical literacy: Levels 5, (6), (7), (8)
- Measure of centre
A number that is representative or typical of the middle of a distribution of a numerical variable. The measures of centre that are used most often are the mean and the median. The mode is sometimes used.
Alternatives: measure of centrality, measure of central tendency, measure of location
See: average
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), (7), (8)
- Measure of proportion
A sample proportion used to make comparisons among sample distributions.
Example
An online questionnaire was completed by 727 students enrolled in an introductory Statistics course at the University of Auckland. It included questions on their actual weight, gender and ethnicity.
The measurement variable ‘actual weight’ was recategorised with one category for actual weights less than 60kg. It was concluded that 56.7% of the females weighed less than 60kg compared to 7.6% of the males. This is an example of bivariate data with one measurement variable (actual weight) and one category variable (gender).
As part of a comparison between the ethnicity sample distributions for females and males it was concluded that 5.4% of the females were Korean compared to 10.9% of the males. This is an example of bivariate data with two category variables.
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), (7), (8)
- Measurement data
Data in which the values result from measuring, meaning that the values may take on any value within an interval of numbers.
Example
The heights of a class of Year 9 students.
See: numerical data, quantitative data
Curriculum achievement objectives references
Statistical investigation: Level 4, (5), (6), (7), (8)
- Measurement variable
A property that may have different values for different individuals and for which these values result from measuring, meaning that the values may take on any value within an interval of numbers.
Example
The heights of a class of Year 9 students.
See: numerical variable, quantitative variable
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Measures of Central Tendency
Also referred to as measures of central location, the measures of central tendency are those statistics that describe the centre or the most typical value of a set of data. They might often be loosely described as averages in the sense that they are indicative of the middle of a set of data. Most common amongst these measures is the arithmetic mean, usually referred to as the mean. The arithmetic mean of a set of n numbers is found by taking the sum of the numbers and dividing that sum by n. Other measures of central tendency are the median (the middle value when the numbers are ordered), and the mode (the most commonly occurring value). The mode may not exist, and if it does it may not be unique.
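As a minimal sketch of these definitions, the Python below computes the three measures for a small made-up data set. The data values and the helper name `modes` are illustrative assumptions, not part of the glossary; `mean` and `median` come from Python's standard `statistics` module.

```python
from collections import Counter
from statistics import mean, median


def modes(values):
    """Return every most-frequent value, or [] when no value is repeated."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # No value occurs more often than any other, so there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


data = [2, 3, 3, 5, 7, 10]  # hypothetical values for illustration

print(mean(data))    # sum of the values divided by n
print(median(data))  # middle of the ordered values
print(modes(data))   # most commonly occurring value(s)
```

A list such as [1, 2, 3] gives an empty list of modes, and [1, 1, 2, 2] gives two modes, which illustrates that the mode may not exist and may not be unique.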
- Measures of spread
A number that conveys the degree to which values in a distribution of a numerical variable differ from each other. The measures of spread that are used most often are: interquartile range, range, standard deviation, variance.
Alternatives: measure of variability, measure of dispersion
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), (7), (8)
- Median
A measure of centre that marks the middle of a distribution of a numerical variable.
It is recommended that, for small data sets, this measure of centre is calculated by sorting the values into order and then counting the values, and that software is used for large data sets.
The median is a stable measure of centre in that it is not influenced by unusually large or unusually small values. It is recommended that a graph of the distribution is used to emphasise its meaning as a feature of the distribution.
Example 1 (Odd number of values)
The maximum temperatures, in degrees Celsius (°C), in Rolleston for the first 9 days in November 2008 were: 18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6
Ordered values: 17.8, 17.8, 18.1, 18.6, 18.7, 19.4, 19.6, 19.9, 20.6
The data and the median are displayed on the dot plot below.

The median maximum temperature over these 9 days is 18.7°C. There are 4 values below 18.7°C and 4 values above it.
Example 2 (Even number of values)
The maximum temperatures, in degrees Celsius (°C), in Rolleston for the first 10 days in November 2008 were: 18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6, 18.8
Ordered values: 17.8, 17.8, 18.1, 18.6, 18.7, 18.8, 19.4, 19.6, 19.9, 20.6
The mean of the two central values, 18.7 and 18.8, is 18.75.
The data and the median are displayed on the dot plot below.
The median maximum temperature over these 10 days is 18.75°C. There are 5 values below 18.75°C and 5 values above it.
Note: The median can be calculated directly from the dot plot or from the ordered values.
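The sort-and-count method used in the two examples can be sketched in Python as follows. The function name `median_by_counting` is a hypothetical helper chosen for illustration; the temperature values are those given in the examples.

```python
def median_by_counting(values):
    """Sort the values, then take the middle one (or the mean of the middle two)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: a single middle value.
        return ordered[mid]
    # Even number of values: mean of the two central values.
    return (ordered[mid - 1] + ordered[mid]) / 2


nine_days = [18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6]
ten_days = nine_days + [18.8]

print(median_by_counting(nine_days))
print(round(median_by_counting(ten_days), 2))
```

The first call corresponds to Example 1 (9 values) and the second to Example 2 (10 values).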
See: measure of centre
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
- Metric System
See SI.
- Modal interval
An interval of neighbouring values for a measurement variable that occur noticeably more often than the values on each side of this interval.
Example
The number of hours of sunshine per week in Grey Lynn, Auckland, from Monday 2 January 2006 to Sunday 31 December 2006 is displayed in the histogram below.
The distribution has a modal interval of 25 to 30 hours of sunshine per week.
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Modality
A measure of the number of modes in a distribution of a numerical variable.
A unimodal distribution has one mode, meaning that the distribution has one value (or interval of neighbouring values) that occurs noticeably more often than any other value (or values on each side of the modal interval).
A bimodal distribution has two modes, meaning that the distribution has two values (or intervals of neighbouring values) that occur noticeably more often than the values on each side of the modes (or modal intervals).
In frequency distributions of a numerical variable the word ‘cluster’ is often used to describe groups of neighbouring values that form modes or modal intervals.
Example 1 (Frequency distribution, whole-number variable)
The number of days in a week that rain fell in Grey Lynn, Auckland, from Monday 2 January 2006 to Sunday 31 December 2006 is recorded in the frequency table and displayed in the bar graph below.
Number of days with rain   Number of weeks
0                           2
1                           5
2                           5
3                           5
4                          19
5                           6
6                           6
7                           4
Total                      52
This distribution is unimodal with a mode at 4 days of rain per week.
Example 2 (Theoretical distribution, continuous random variable)
The graph displays the probability density function of a theoretical distribution. It has modes at 40 and 70.
Curriculum achievement objectives references
Statistical investigation: Levels (6), (7), (8)
- Mode
A value in a distribution of a numerical variable that occurs more frequently than other values.
As a measure of centre the mode is less useful than the mean or median because some distributions have more than one mode and other distributions, where no values are repeated, have no mode.
It is recommended that a graph of the distribution is used to check the appropriateness of the mode as a measure of centre and to emphasise its meaning as a feature of the distribution.
Example
The number of days in a week that rain fell in Grey Lynn, Auckland, from Monday 2 January 2006 to Sunday 31 December 2006 is recorded in the frequency table and displayed on the bar graph below.
Number of days with rain   Number of weeks
0                           2
1                           5
2                           5
3                           5
4                          19
5                           6
6                           6
7                           4
Total                      52
The mode is 4 days with rain per week.
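As a rough sketch, the mode can be read from a frequency table in Python by finding the value with the greatest frequency. The dictionary name `weeks_by_rain_days` is an illustrative assumption; the frequencies are those in the table above.

```python
# Number of days with rain in a week -> number of weeks (from the table).
weeks_by_rain_days = {0: 2, 1: 5, 2: 5, 3: 5, 4: 19, 5: 6, 6: 6, 7: 4}

# The frequencies should account for all 52 weeks of the year.
total_weeks = sum(weeks_by_rain_days.values())

# The mode is the value whose frequency is greatest.
mode = max(weeks_by_rain_days, key=weeks_by_rain_days.get)

print(total_weeks, mode)
```

Note that `max` returns only one value, so this sketch would not reveal a second mode if two values shared the greatest frequency.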
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
- Model
A simplified or idealised description of a situation. The term model is used in two different ways in the Curriculum.
In the probability thread the use of “models of all the outcomes” refers to a list of all possible outcomes of a situation involving elements of chance and, at more advanced levels, a list of all possible outcomes and the corresponding probabilities for each outcome.
At Level Eight in the statistical investigation thread, an achievement objective refers to “appropriate models (including linear regression for bivariate data and additive models for time-series data)”. Used in this way, a model is an idealised description of the underlying system the data was taken from and the model is intended to match the data closely.
See: probability function (for a discrete random variable)
Curriculum achievement objectives references
Statistical investigation: Level 8
Probability: Levels 3, 4, (5), (6), (7), (8)
- Moving average
A method used to smooth time-series data. It forms a new smoothed series in which the irregular component is reduced.
If the time series has a seasonal component a moving average is used to eliminate the seasonal component.
Each value in the time series is replaced by an average of the value and a number of neighbouring values. The number of values used to calculate a moving average depends on the type of time-series data. For weekly data, seven values are used; for monthly data, 12 values are used; and for quarterly data, four values are used. If the number of values used is even, the moving average must be centred by taking a two-term moving average of the new series.
In terms of an additive model for time-series data, Y = T + S + C + I, where
T represents the trend component,
S represents the seasonal component,
C represents the cyclical component, and
I represents the irregular component;
the smoothed series = T + C.
See: centred moving average, moving mean
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Moving mean
A specified moving average method used to smooth time-series data. It forms a new smoothed series in which the irregular component is reduced.
If the time series has a seasonal component a moving mean may be used to eliminate the seasonal component.
Each value in the time series is replaced by the mean of the value and a number of neighbouring values. The number of values used to calculate a moving mean depends on the type of time-series data. For weekly data, seven values are used; for monthly data, 12 values are used; and for quarterly data, four values are used. If the number of values used is even, the moving mean must be centred by taking two-term moving means of each pair of consecutive moving means, forming a series of centred moving means. See Example 2 for an illustration of this technique.
In terms of an additive model for time-series data, Y = T + S + C + I, where
T represents the trend component,
S represents the seasonal component,
C represents the cyclical component, and
I represents the irregular component;
the smoothed series = T + C.
Example 1 (Weekly data)
Daily sales, in thousands of dollars, for a hardware store were recorded for 21 days. There is reasonably systematic variation over each 7-day period and so moving means of order 7 have been calculated to attempt to eliminate this seasonal component. The moving mean for the first Thursday is calculated by
(86 + 125 + 115 + 150 + 168 + 291 + 102) / 7 = 148.14
Day   Sales ($000)   Moving mean ($000)
Mon    86
Tue   125
Wed   115
Thu   150            148.14
Fri   168            147.71
Sat   291            146.71
Sun   102            146.29
Mon    83            145.00
Tue   118            145.43
Wed   112            144.14
Thu   141            143.71
Fri   171            143.57
Sat   282            143.43
Sun    99            142.86
Mon    82            144.86
Tue   117            144.00
Wed   108            142.43
Thu   155            140.86
Fri   165
Sat   271
Sun    88
The raw data and the moving means are displayed below.
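The calculation in Example 1 can be sketched in Python as below. The function name `moving_means` is a hypothetical helper for illustration; the sales figures are those recorded in the example.

```python
def moving_means(values, order):
    """Return the mean of each run of `order` consecutive values."""
    return [sum(values[i:i + order]) / order
            for i in range(len(values) - order + 1)]


# Daily sales ($000) for 21 days, Monday to Sunday for three weeks.
sales = [86, 125, 115, 150, 168, 291, 102,
         83, 118, 112, 141, 171, 282, 99,
         82, 117, 108, 155, 165, 271, 88]

# With order 7, the first moving mean belongs to the first Thursday,
# the middle day of the first 7-day run.
smoothed = [round(m, 2) for m in moving_means(sales, 7)]
print(smoothed)
```

Because 7 is odd, each moving mean lines up with the middle day of its run, so no centring is needed here.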
Example 2 (Quarterly data)
Statistics New Zealand’s Economic Survey of Manufacturing provided the following data on actual operating income for the manufacturing sector in New Zealand. There is reasonably systematic variation over each 4-quarter period and so moving means of order 4 have been calculated to attempt to eliminate this seasonal component. However these moving means do not align with the quarters; the moving means are not centred. To align the moving means with the quarters, each pair of moving means is averaged to form centred moving means.
The first moving mean (between Mar-05 and Dec-05) is calculated by
(17322 + 17696 + 17060 + 18046) / 4 = 17531
The centred moving mean for Sep-05 is calculated by
(17531 + 17565.5) / 2 = 17548.25
Quarter   Operating income   Moving mean   Centred moving mean
          ($millions)        ($millions)   ($millions)
Mar-05    17322
Jun-05    17696
                             17531.00
Sep-05    17060                            17548.250
                             17565.50
Dec-05    18046                            17732.750
                             17900.00
Mar-06    17460                            18048.125
                             18196.25
Jun-06    19034                            18298.750
                             18401.25
Sep-06    18245                            18490.500
                             18579.75
Dec-06    18866                            18633.500
                             18687.25
Mar-07    18174                            18735.750
                             18784.25
Jun-07    19464                            19003.000
                             19221.75
Sep-07    18633
Dec-07    20616
The raw data and the centred moving means are displayed below. Note that M, J, S and D indicate quarter years ending in March, June, September and December respectively.
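The centring step in Example 2 can be sketched in Python as below. The function names `moving_means` and `centred_moving_means` are hypothetical helpers for illustration; the operating income figures are those given in the example.

```python
def moving_means(values, order):
    """Return the mean of each run of `order` consecutive values."""
    return [sum(values[i:i + order]) / order
            for i in range(len(values) - order + 1)]


def centred_moving_means(values, order):
    """For an even order, average each pair of consecutive moving means."""
    return moving_means(moving_means(values, order), 2)


# Quarterly operating income ($millions), Mar-05 to Dec-07.
income = [17322, 17696, 17060, 18046, 17460, 19034,
          18245, 18866, 18174, 19464, 18633, 20616]

uncentred = moving_means(income, 4)          # these fall between quarters
centred = centred_moving_means(income, 4)    # first value aligns with Sep-05

print(uncentred[0], centred[0])
```

The 12 quarterly values give 9 moving means of order 4 and 8 centred moving means, the first aligned with Sep-05 and the last with Jun-07.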
See: moving average
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Multiple
An integer a is a multiple of an integer b if and only if a = m x b for some integer m. So, for example, the positive multiples of 5 are 5, 10, 15, 20, 25, … The set of multiples of any number other than zero is infinite. (See also Common multiple)
- Multiple transformations
The term multiple transformations refers to the composition of two or more transformations. Suppose R is a rotation in the plane about the origin through 90° (anticlockwise), and M is a reflection in the plane in the x-axis. What is the effect of the multiple transformation R followed by M? The multiple transformation R followed by M is written as M∘R. We can find the effect of M∘R by considering its effect on the two points (1,0) and (0,1). That is because those two points ‘represent’ the x-axis and the y-axis respectively and so every point in the plane can be written as a combination of those two points. M∘R maps (1,0) to (0, -1) and (0,1) to (-1, 0). So M∘R is the same as a reflection in the line y = -x. Note that transformation composition is not commutative since M∘R is not the same as R∘M.
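One way to check this result is to represent each transformation as a 2 x 2 matrix and multiply the matrices. The matrix representation and the helper names `compose` and `apply` are assumptions made for this sketch and are not part of the glossary entry.

```python
def compose(outer, inner):
    """Matrix product outer x inner: apply `inner` first, then `outer`."""
    return [[sum(outer[i][k] * inner[k][j] for k in range(2))
             for j in range(2)]
            for i in range(2)]


def apply(matrix, point):
    """Apply a 2 x 2 matrix to a point (x, y)."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)


R = [[0, -1], [1, 0]]   # rotation through 90 degrees anticlockwise about the origin
M = [[1, 0], [0, -1]]   # reflection in the x-axis

R_then_M = compose(M, R)   # R followed by M
M_then_R = compose(R, M)   # M followed by R

print(apply(R_then_M, (1, 0)), apply(R_then_M, (0, 1)))
print(apply(M_then_R, (1, 0)), apply(M_then_R, (0, 1)))
```

R followed by M sends (x, y) to (-y, -x), a reflection in the line y = -x, while M followed by R sends (x, y) to (y, x), a reflection in the line y = x, so the order of composition matters.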
It is helpful in considering the effect of multiple transformations to classify transformations as direct or indirect. The direct isometries are rotation and translation. They are called direct because they do not flip (or turn over) the shape being transformed. Reflection and glide reflection are indirect isometries because they do flip the shape being transformed. So, for example, the product of two reflections is a direct isometry and is therefore either a rotation or a translation. In fact the Fundamental Isometry Theorem assures us that every isometry in the plane can be expressed as a product of at most three reflections.
- Multiplication of
- Whole numbers: Multiplication may be described as repeated addition of the same number. For example, 5x7 means 7+7+7+7+7 or alternatively, 5+5+5+5+5+5+5. When the basic multiplication facts are known, more complex multiplication problems can be answered using partitioning strategies. For example, 8 x 15 can be seen as 8x10 plus a half of 8x10 since 5 is a half of 10. So 8x15=120. Similarly, to find 7x19 take 7x20, which is 140, and subtract 7 to get 133.
Multiplication is a binary operation, that is, it is an operation on two numbers.
Multiplication is commutative, that is, the order of the numbers does not change the answer. For example, 4x5 = 5x4.
Multiplication is associative, that is, the grouping of the numbers does not affect the answer. For example, (2x3) x 5 = 2 x (3x5).
1 is the identity element for multiplication, because multiplication by one does not change a number.
Multiplication is distributive over addition, that is,
a x (b+c) = (a x b) + (a x c)
For example, 3x(4+7) = (3x4)+(3x7). This property of multiplication is most useful when operating beyond the basic facts of multiplication. For example, 8x56 can be calculated as (8x50) + (8x6) which equals 400+48 or 448.
- Fractions: Fractions may be multiplied as follows: a/b x c/d = (a x c)/(b x d). For example, 2/3 x 4/7 = 8/21. This rule is best discovered by students, using grids or folded paper, and then observing the pattern of results.
- Decimals: Decimal fractions (commonly called decimals) may be multiplied in the same way that whole numbers are, with care being taken to consider the position of the digits in the decimal. So, for example, to find the result of 1.5 x 0.8, multiply 15 by 8 to obtain 120 and realise by estimation that the answer must be 1.2. Alternatively, one could multiply 15/10 by 8/10 to obtain 120/100, which is 1.2.
- Percentages: There is a danger in looking for the answer to problems such as 20% x 40% in that it would be tempting to give the answer of 800% since 20 x 40 = 800. The easiest way to multiply the percentages is to change them to decimals. Then 20% x 40% = 0.2 x 0.4 = 0.08 = 8%. Alternatively, one could multiply 2/10 by 4/10 to obtain 8/100, which is 8%.
- Integers: Integers may be multiplied in the following way:
a x -b = -(a x b), e.g. 2 x -3 = -6
-a x b = -(a x b), e.g. -2 x 3 = -6
These results are best discovered using models, such as a black-and-white counters model, in which a white counter represents one, and a black counter represents -1.
-a x -b = a x b e.g. -2 x -3 = 6
This result is difficult to model but can be demonstrated as follows:
(3 + -3) = 0 so -2 x (3 + -3) = -2 x 0 = 0
Also, -2 x (3 + -3) = (-2 x 3) + (-2 x -3)
So (-2 x 3) + (-2 x -3) = 0
But (-2 x 3) = -6
So -2 x -3 = 6
Alternatively, one can take an inductive approach and look for a pattern. For example, consider some of the multiples of negative three.
4 x -3 = -12
3 x -3 = -9
2 x -3 = -6
1 x -3 = -3
0 x -3 = 0
-1 x -3 = 3
-2 x -3 = 6
-3 x -3 = 9 etc.
From this, and other similar examples, it is apparent that the product of two negative numbers is a positive number.
The properties of multiplication outlined for whole numbers also apply to the multiplication of fractions, decimals, percentages and integers.
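As a small sketch, the worked examples in this entry can be checked with exact arithmetic using Python's standard `fractions` module. The helper name `percent` is an illustrative assumption.

```python
from fractions import Fraction


def percent(p):
    """Write p% as an exact fraction out of 100."""
    return Fraction(p, 100)


# Distributive property: 8 x 56 = (8 x 50) + (8 x 6)
distributive = (8 * 50) + (8 * 6)

# Fractions: a/b x c/d = (a x c)/(b x d)
fraction_product = Fraction(2, 3) * Fraction(4, 7)

# Decimals: 1.5 x 0.8 treated as 15/10 x 8/10
decimal_product = Fraction(15, 10) * Fraction(8, 10)

# Percentages: 20% x 40% is 8%, not 800%
percent_product = percent(20) * percent(40)

print(distributive, fraction_product, float(decimal_product), percent_product * 100)
```

Working with exact fractions avoids rounding and makes it clear why 20% x 40% is 8/100 rather than 800%.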
- Multiplicative strategies
Multiplicative strategies are techniques used to solve multiplication problems from known facts. A multiplication strategy involves one or more of the properties of multiplication, specifically, commutativity, associativity, distributivity and inverse properties (See Multiplication). For example, 3x20 could be seen as 3x10x2 or 30x2, which is 60. At a harder level, 19x3 could be seen as (20x3)-3 which is 60-3 or 57.
Similarly, 315÷45 = 630÷90 = 63÷9 = 7.
- Multivariate data
A data set that has several variables.
Example
A data set consisting of the heights, ages, genders and eye colours of a class of Year 9 students.
Curriculum achievement objectives references
Statistical investigation: Levels 3, 4, 5, (6), (7), (8)
- Mutually exclusive events
Events that cannot occur together.
If events A and B are mutually exclusive then the combined event A and B contains no outcomes.
Example 1
Suppose we have a group of men and women, each of whom is a possible outcome of a probability activity. Let A be the event that a person is aged less than 30 years and B be the event that a person is aged over 50.
The event A and B contains no outcomes because none of the people can be aged less than 30 years and over 50. Events A and B are therefore mutually exclusive.
Example 2
Consider rolling two dice. Suppose that event C consists of outcomes which have a total of 8 and that event D consists of outcomes which have the first die showing a 1.
First explanation: If the first die shows a 1 (event D has occurred) then the greatest total for the two dice is 1 + 6 = 7, meaning that a total of 8 cannot occur. In other words, event C cannot occur together with event D.
Second explanation: C consists of the outcomes (2, 6), (3, 5), (4, 4), (5, 3), (6, 2), where (2, 6) means a 2 on the first die and a 6 on the second. D consists of the outcomes (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6). No outcomes are common to both event C and event D.
Events C and D are therefore mutually exclusive.
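The second explanation in Example 2 can be sketched in Python by listing all 36 outcomes and checking that the two events share none. The variable names are illustrative assumptions.

```python
from itertools import product

# Every outcome (first die, second die) when two dice are rolled.
outcomes = set(product(range(1, 7), repeat=2))

# Event C: the total is 8. Event D: the first die shows a 1.
C = {o for o in outcomes if sum(o) == 8}
D = {o for o in outcomes if o[0] == 1}

# Mutually exclusive events have no outcomes in common.
common = C & D
print(sorted(C))
print(sorted(D))
print(common)
```

An empty intersection confirms that the combined event C and D contains no outcomes.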
Alternative: disjoint events
Curriculum achievement objectives reference
Probability: (Level 8)
n
- Natural numbers
The set of natural numbers is {1, 2, 3, 4, 5, …}. They are often also referred to as the Counting numbers.
- Nets
A net consists of a set of connected polygons that can be folded to form a polyhedron.
- Network
In graph theory a graph consists of a set of points, called vertices, along with edges connecting the vertices. The edges express a relationship between the vertices. A graph may have no edges (the null graph), one or more edges, an edge connecting every vertex to every other vertex (the complete graph) or more than one edge between two vertices (a multigraph). A graph may have numerical values attached to each edge (a network), an arrowed direction on each edge (a directed graph, or digraph) or both numerical values and direction arrows (a directed network). In directed graphs the edges are called directed edges or arcs.
The positioning of the vertices in the plane is not usually significant as it is the edges that define the relationship between the vertices.
The following directed network shows the cost (in $m) of stormwater piping in Watersville.
- Non-linear function
A linear function is a function of the kind f(x) = ax + b, where a and b are real numbers. The graph of such a function on the Cartesian plane is a straight line. Other functions, such as polynomials of higher degree, are non-linear functions. Their graphs are not straight lines. For example, a quadratic function is of the form f(x) = ax² + bx + c, where a, b and c are real numbers. The graph of a quadratic function is a parabola.
- Non-sampling error
One of the two reasons for the difference between an estimate (from a sample) and the true value of a population parameter; the other reason being the error caused because data are collected from a sample rather than the whole population (sampling error). Non-sampling errors have the potential to cause bias in surveys or samples.
There are many different types of non-sampling errors and the names used for each of them are not consistent.
Some examples of non-sampling errors are:
• The sampling process is such that a specific group is excluded or under-represented in the sample, deliberately or inadvertently. If the excluded or under-represented group is different, with respect to survey issues, then bias will occur.
• The sampling process allows individuals to select themselves. Individuals with strong opinions about the survey issues or those with substantial knowledge will tend to be over-represented, creating bias.
• If people who refuse to answer are different, with respect to survey issues, from those who respond then bias will occur. This can also happen with people who are never contacted and people who have yet to make up their mind.
• If the response rate (the proportion of the sample that takes part in a survey) is low, bias can occur because respondents may tend consistently to have views that are more extreme than those of the population in general.
• The wording of questions, the order in which they are asked and the number and type of options offered can influence survey results.
• Answers given by respondents do not always reflect their true beliefs because they may feel under social pressure not to give an unpopular or socially undesirable answer.
• Answers given by respondents may be influenced by the desire to impress an interviewer.
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
Statistical literacy: Levels 7, (8)
- Normal distribution
A family of theoretical distributions that is useful as a model for some continuous random variables.
Each member of this family of distributions is uniquely identified by specifying the mean µ and standard deviation σ (or variance σ²). As such, µ and σ (or σ²) are the parameters of the normal distribution, and the distribution is sometimes written as Normal(µ, σ) or Normal(µ, σ²).
The probability density function of a normal distribution is a symmetrical, bell-shaped curve, centred at its mean µ. The graphs of the probability density functions of two normal distributions are shown below, one with µ = 50 and σ = 15 and the other with µ = 50 and σ = 10.
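As a minimal sketch, the two distributions described here can be compared numerically with `NormalDist` from Python's standard `statistics` module (available from Python 3.8). The variable names `wide` and `narrow` and the chosen x values are illustrative assumptions.

```python
from statistics import NormalDist

wide = NormalDist(mu=50, sigma=15)     # µ = 50, σ = 15
narrow = NormalDist(mu=50, sigma=10)   # µ = 50, σ = 10

# Evaluate each probability density function at points either side of the mean.
for x in (30, 40, 50, 60, 70):
    print(x, round(wide.pdf(x), 4), round(narrow.pdf(x), 4))
```

Both curves are symmetrical about 50, and the distribution with the smaller standard deviation has the taller, narrower peak.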
Curriculum achievement objectives references
Probability: Levels 7, 8
- Numeral
A name or symbol (or combination of symbols) that describes a number. So two, 2, 1+1, rua and 4/2 are all numerals for the same number.
- Numeration system
- Numerator
When a rational number is written as a fraction, that is in the form a/b, then a is called the numerator of the fraction. (See also Fraction)
- Numerical data
Data in which the values result from counting or measuring. Measurement data are numerical, as are whole-number data.
Alternative: quantitative data
See: measurement data, whole-number data
Curriculum achievement objectives references
Statistical investigation: Levels 2, 3, (4), (5), (6), (7), (8)
- Numerical variable
A property that may have different values for different individuals and for which these values result from counting or measuring. Measurement variables are numerical, as are whole-number variables.
Alternative: quantitative variable
See: measurement variable, whole-number variable
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
o
- Observational study
A study in which a researcher attempts to understand the effect that a variable (an explanatory variable) may have on some phenomenon (the response), but the researcher is not able to control some important conditions of the study.
In an observational study the researcher has no control over the value of the explanatory variable; the researcher can only observe the value of the explanatory variable for each individual and, if necessary, allocate individuals to groups based on the observed values.
Because the groups in the study are formed by values of an explanatory variable that individuals happened to receive, and not by randomisation, the groups may not be similar in all ways apart from the value of the explanatory variable.
Any observed differences in the response (if large enough) between the groups, on average, cannot be said to be caused by any differences in the values of the explanatory variable. The differences in the response could be due to the differences in the groups that are not related to the explanatory variable.
Example
A study by researchers at Harvard School of Public Health, published in 2009, investigated the relationship between low childhood IQ and adult mental health disorders. The study participants were a group of children born in 1972 and 1973 in Dunedin. Their IQs were assessed at ages 7, 9 and 11 and mental health disorders were assessed at ages 18 through to 32 in interviews by health professionals who had no knowledge of the individuals’ IQ or mental health history.
This is an observational study because the researchers had no control over the explanatory variable, childhood IQ. The researchers could only record the assessed childhood IQ. The response was whether or not the individual had suffered from a mental disorder during adulthood.
Curriculum achievement objectives reference
Statistical literacy: Level 8
- One-way table
A table for displaying category data for one variable in a data set that displays each category and its associated frequency or relative frequency.
Example
Students enrolled in an introductory Statistics course at the University of Auckland were asked to complete an online questionnaire. One of the questions asked them to enter their ethnicity. They chose from the following list: Chinese, Indian, Korean, Maori, New Zealand European, Other European, Pacific, Other. The 727 responses are displayed on the one-way table below.
Ethnicity          Frequency
Chinese            169
Indian              58
Korean              56
Maori               18
NZ European        253
Other European      45
Pacific             38
Other               90
Total              727
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
Probability: Levels (3), (4), (5), (6)
- Operations
Addition, subtraction, multiplication, and division are examples of binary operations on numbers. They are called binary operations because they use a rule to map two numbers to one number (the answer). For example, if the two numbers are 4 and 5 and the binary operation is addition then the answer is 9. Note that the one restriction on these operations is that division by zero is not possible. For example, 7÷0 has no answer, since it asks the question: "What do I multiply zero by to get 7?" Other types of operations are possible, such as finding the square root of a number, which is a unary operation since it requires only one number to be operated on. (See also Addition, Subtraction, Multiplication, Division)
- Optimal solutions using numerical approaches
Students who are not at the stage of using calculus techniques to find maxima and minima may still be able to solve simple practical problems using graphs, charts, estimation techniques etc. For example, to find the maximum area of a rectangle that has a perimeter of length 20 metres, students could set up a table as follows to find the answer. They could then draw a graph of that.
Height (m)   0   1   2   3   4   5   6   7   8   9  10
Width (m)   10   9   8   7   6   5   4   3   2   1   0
Area (m²)    0   9  16  21  24  25  24  21  16   9   0
- Order (of a moving average)
The number of values used to calculate a moving average.
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Ordering
Arranging according to some chosen attribute. Ordering involves the relations ‘less than’ and ‘greater than’ as defined on numbers. For example, objects could be ordered by area where the measures of the areas, given as numbers, are used to order the areas.
- Ordering fractions
To order two fractions, we can write them in equivalent fraction forms so that they have the same denominator and then compare the numerators. For example, to order the fractions 2/3 and 5/8 we could write them in equivalent fraction forms with the same denominator. The least common multiple of 3 and 8 is 24. 2/3 = 16/24 and 5/8 = 15/24. So 5/8 < 2/3.
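The common-denominator method described above can be sketched in Python as follows. The function name `common_forms` is a hypothetical helper for illustration; `gcd` is from Python's standard `math` module and is used to find the least common multiple of the denominators.

```python
from fractions import Fraction
from math import gcd


def common_forms(a_num, a_den, b_num, b_den):
    """Rewrite a_num/a_den and b_num/b_den over their least common denominator.

    Returns (common denominator, new numerator of a, new numerator of b).
    """
    common = a_den * b_den // gcd(a_den, b_den)  # least common multiple
    return (common,
            a_num * (common // a_den),
            b_num * (common // b_den))


# Order 2/3 and 5/8 by comparing numerators over the common denominator.
denominator, first, second = common_forms(2, 3, 5, 8)
print(denominator, first, second)

# The same comparison using exact fractions directly.
print(Fraction(5, 8) < Fraction(2, 3))
```

Once both fractions share a denominator, ordering them reduces to ordering their numerators.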
We can use mental strategies to order some pairs of fractions. For example, 7/12 > 1/2 because we know that 1/2 = 6/12 and 7>6. However, this apparently intuitive approach to ordering fractions is underpinned by mentally considering equivalent fraction forms.
- Ordinal position of members of sequential patterns
See Sequential patterns.
- Outcome
A possible result of a trial of a probability activity or a situation involving an element of chance.
Example 1
In a situation where a person will be selected and their eye colour recorded, blue is an outcome.
Example 2
In a situation where two dice will be rolled and the numbers on each die recorded, a 1 on the first die and a 4 on the second is an outcome.
Example 3
In a situation where a person will be selected at random from a population and their weight recorded, 76kg (to the nearest kilogram) is an outcome.
Curriculum achievement objectives references
Probability: Levels 1, (2), 3, 4, (5), (6), (7), (8)
- Outlier
A member of a data set whose values for the variable in the data set are such that it lies well away from most of the other members of the data set.
Outliers may occur with one or more numerical variables and may only be apparent when variables in the data set are separated out into the categories for one or more category variables, as in Example 3 below.
An outlier may be genuine, indicating an individual of particular interest. Alternatively, an outlier may be the result of a mistake, indicating that it should be checked out and, if possible, corrected.
Example 1
The actual weights of a random sample of 20 male students enrolled in an introductory Statistics course at the University of Auckland are displayed on the dot plot below.
Example 2
The actual weights and self-perceived ideal weights of a random sample of 20 male students enrolled in an introductory Statistics course at the University of Auckland are displayed on the scatter plot below.
Example 3
A random sample of 50 students enrolled in an introductory Statistics course at the University of Auckland was asked about the number of pairs of shoes they have. The side-by-side dot plots below display the data for males and females.
The male with 18 pairs of shoes is not an outlier when the 50 values are displayed on a single dot plot but is an outlier when the data are displayed by the two categories of gender.
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
p
- Parallelogram
A parallelogram is a four-sided plane figure (quadrilateral) with two pairs of parallel sides. Thus a parallelogram is of the shape shown below.
A parallelogram has opposite sides equal.
The opposite angles of a parallelogram are equal.
Two angles that are not opposite add to 180°.
Note that rectangles, rhombuses and squares are all parallelograms.
A rectangle is a parallelogram with all angles equal.
A rhombus is a parallelogram with all sides equal.
A square is a parallelogram with all sides equal and all angles equal.
Therefore a square is both a rectangle and a rhombus.
The area of a parallelogram is found by multiplying the base length by the vertical height. For example, in the above parallelogram, the length of the base is 8 cm and the vertical height is 5 cm. So the area is 40 cm².
This result may be explored and discovered by students using paper parallelograms and cutting off a triangle from one side of the parallelogram and joining it to the other side to form a rectangle.
- Parameter
See: parameter(s) (of a theoretical distribution), population parameter
- Parameter(s) (of a theoretical distribution)
A number (or numbers) occurring in the expression for a probability function or probability density function representing a theoretical distribution which, when changed, produce different members of the family of distributions. A given value for a parameter (or set of values for the parameters) determines a unique member of the family of distributions.
Example
The parameters of the binomial distribution are n and π.
The graphs of the probability functions of two binomial distributions are shown below, one with n = 4 and π = 0.3 and the other with n = 4 and π = 0.6.
Curriculum achievement objectives references
Probability: Levels (7), (8)
- Partitioning
Partitioning is the process of ‘breaking up’ numbers, usually to simplify operations. For example, to compute 28 + 37 we might partition 28 into 25 + 3 and see the problem as 25 + 40.
- Partitioning and combining like measures
Measures of, for example, length can be added or subtracted provided they are in the same units. So a length of 13 metres can be broken into lengths of 8 metres and 5 metres.
- Path
Technically, a path is a sequence of connected vertices, none of which is repeated. Less specifically, the term ‘path’ may refer to a route given by bearing and direction or by a set of points such as a sequence of coordinates on a Cartesian plane.
- Patterns
Regularities among distributions or regularities within a distribution. For sample distributions a pattern should be a characteristic that would still be present if another sample were taken from the same population or if the current sample were enlarged by sampling further individuals from the same population.
Curriculum achievement objectives references
Statistical investigation: Levels 3, 4, 5, (6), (7), (8)
- Patterns and trends within and between data sets
With time series data, patterns and trends may be observable both within a time series data set and between two or more time series data sets. For example, there might be a clear relationship between average monthly temperature and monthly power consumption.
- Percentage
A percentage is another way (as well as fractions and decimals) of representing a part of a whole. The term literally means ‘for each 100’. The percentage is therefore the numerator when the denominator is 100. For example, 1/2 = 50/100. So one half equals 50%. Similarly, 3/4 = 75/100 so three quarters equals 75%. 29% = 29/100. Percentages can represent numbers greater than one. Just as 2/1 = 2, so 200% = 2.
- Perimeter
The boundary of a figure in a plane is called the perimeter. The length of the perimeter of a polygon is the sum of the lengths of its sides.
- Perimeters of circles
The perimeter of a circle is called its circumference. The length of the perimeter of a circle is π x d, where d is the length of the diameter of the circle and π is an irrational number whose value is a little less than 3 1/7. Because π is an irrational number it has no exact representation as a fraction or decimal fraction. To five decimal places, π = 3.14159.
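As a rough numerical illustration of the statements above, the following Python sketch computes circumferences for a few diameters and compares π with 3 1/7. The function name `circumference` and the diameters are hypothetical choices made only for this example.

```python
import math


def circumference(diameter: float) -> float:
    """Length of the perimeter of a circle: pi multiplied by the diameter."""
    return math.pi * diameter


# Hypothetical diameters (in cm) of jars, tins and saucepans.
for d in (5.0, 10.0, 20.0):
    c = circumference(d)
    # The ratio of circumference to diameter is the same for every circle.
    print(f"diameter {d} cm, circumference {c:.3f} cm, ratio {c / d:.5f}")

# pi is a little less than 3 1/7.
print(math.pi < 3 + 1 / 7)
```

Here `math.pi` is itself only a floating-point approximation of π, which fits the point that π has no exact decimal representation.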
Students may discover this relationship between the length of the diameter of a circle and the length of its circumference by winding string around circles with a variety of diameters and thereby measuring the lengths of the circumferences. This is most easily done using jars, tins, saucepans, etc. Dividing the length of the circumference by the length of the diameter shows that the ratio of circumference to diameter is constant and is independent of the size of the diameter.
- Permutation
A permutation of n different objects is an arrangement of the objects with attention given to the order of the arrangement. So, for example, the letters a, b, and c have six permutations. They are abc, acb, bac, bca, cab, and cba. The number of permutations of n objects taken n at a time is n factorial, which is written n! and is defined as follows:
$n! = 1 \times 2 \times 3 \times \dots \times (n - 2) \times (n - 1) \times n$
So $3! = 1 \times 2 \times 3 = 6$.
- Perpendicular
Two lines are perpendicular if the angle between them is 90°, that is, they meet at right angles. Two planes, P1 and P2, are perpendicular if a line in P1 perpendicular to the line of intersection of the planes P1 and P2 is perpendicular to every line in P2.
- Picture graph
A graph for displaying the distribution of a category variable or a whole-number variable that uses pictures or symbols to represent the frequency of each category or value.
Picture graphs are useful for showing differences in frequency.
Example
A student collected data on the colour of cars that drove past her house and displayed the results on the picture graph below.
Alternatives: pictogram, pictograph
Curriculum achievement objectives references
Statistical investigation: Levels (2), (3), (4), (5)
- Pie graph
There are two uses of pie graphs.
First, a graph for displaying the relative frequency distribution of a category variable in which a circle is divided into sectors, representing categories, so that the area of each sector represents the relative frequency of values in the category. See Example 1 below.
Second, a graph for displaying bivariate data; one category variable and one numerical variable. A circle is divided into sectors, representing categories, so that the area of each sector represents the value of the numerical variable for the sector as a percentage of the total value for all sectors. See Example 2 below.
For categories that do not have a natural ordering, it is desirable to order the categories from the largest sector area to the smallest.
Example 1
Students enrolled in an introductory Statistics course at the University of Auckland were asked to complete an online questionnaire. One of the questions asked them to enter their ethnicity. They chose from the following list: Chinese, Indian, Korean, Maori, New Zealand European, Other European, Pacific, Other. The 727 responses are displayed on the pie graph below.
Example 2
World gold mine production for 2003 by country, based on official exports, is displayed on the pie graph below.
Alternatives: circle graph, pie chart
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Placebo
A neutral treatment in an experiment that, to a person participating in the study, appears the same as the actual treatment.
Placebos are given to people in the control group so that reliable comparisons can be made between the treatment and control groups. People can experience positive outcomes from the psychological effect of believing they will improve because they have been given a treatment; the placebo effect. As part of determining whether a treatment is successful, researchers need to be able to see whether the effect due to a treatment is greater than the placebo effect.
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
Statistical literacy: (Level 8)
- Plane
A plane is a flat surface that is considered to extend indefinitely. It is a two-dimensional figure that has area but not volume. Hence a flat tabletop is a portion of a plane. The whole plane is the set of points defined by extending the tabletop indefinitely.
- Plane shape (or plane figure)
A shape that can lie wholly in a plane. A plane shape is therefore a flat, two-dimensional shape and is imagined as having no volume.
- Point estimate
A number calculated from a random sample that is used as an approximate value for a population parameter.
Example
A sample proportion, calculated from a random sample taken from a population, is a point estimate of the population proportion.
Alternative: estimate
See: interval estimate
Curriculum achievement objectives references
Statistical investigation: Levels (6), 7, (8)
- Poisson distribution
A family of theoretical distributions that is useful as a model for some discrete random variables. Each distribution in this family gives the probability of obtaining a specified number of occurrences of a phenomenon in a specified interval in time or space, under the following conditions:
• On average, the phenomenon occurs at a constant rate, λ.
• Occurrences of the phenomenon are independent of each other.
• Two occurrences of the phenomenon cannot happen at exactly the same time or in exactly the same place.
Each member of this family of distributions is uniquely identified by specifying λ. As such, λ is the parameter of the Poisson distribution and the distribution is sometimes written as Poisson(λ).
Let random variable X represent the number of occurrences of a phenomenon that satisfies the conditions stated above. The probability of x occurrences is calculated by:
$P(X = x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!}$ for x = 0, 1, 2, ...
Example
A graph of the probability function for the Poisson distribution with λ = 3 is shown below.
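In place of the graph, the probabilities for λ = 3 can be tabulated with a short Python sketch. It assumes the standard Poisson probability function, P(X = x) = e^(-λ) λ^x / x!; the function name `poisson_pmf` and the cut-off at x = 10 are choices made only for this illustration.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x occurrences when the average rate is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam = 3
# The distribution has no upper limit on x; the table stops at 10 for brevity.
for x in range(11):
    print(x, round(poisson_pmf(x, lam), 4))
```

Because x has no upper limit, the printed probabilities sum to slightly less than 1; the remainder belongs to values of x above the cut-off.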
Curriculum achievement objectives reference
Probability: Level 8
- Poll
A systematic collection of data about opinions on issues taken by questioning a sample of people taken from a population in order to determine the opinion distribution of the population.
Alternative: opinion poll
See: survey
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
Statistical literacy: Levels 7, 8
- Polygon
A polygon is a portion of a plane bounded by straight lines. (See also Areas of polygons)
- Polyhedron
A polyhedron (plural polyhedra) is a solid whose faces are all plane polygons. The faces need not be regular polygons. (A regular polygon is a polygon whose edges are all congruent and whose angles are all equal.) If the faces of a polyhedron are all identical regular polygons then the polyhedron is referred to as a regular polyhedron, or a Platonic solid.
If the faces of a polyhedron are all regular polygons but are not identical, then the polyhedron is referred to as a semi-regular polyhedron, or an Archimedean solid.
- Polyhedron nets
See Nets.
- Polynomial
A polynomial equation of degree n is an equation of the form $a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 = 0$, where $a_0, a_1, \dots, a_n$ are all real numbers, $a_n \neq 0$, and all the powers of the variable x have nonnegative integer exponents.
For example, $3x + 2 = 0$ is a polynomial equation, as is $2x^5 + \sqrt{3}x^2 = 0$, but $5x^3 + 6x + 3x^{-1} = 0$ is not a polynomial equation because one of the exponents is negative.
- Population
A collection of all objects or individuals of interest which have properties that someone wishes to record.
Sometimes a population is a collection of potential objects or individuals and, as such, does not physically exist but can be imagined to exist.
Example 1 (A real population)
All people aged 18 and over who were living in New Zealand on 8 November 2008.
Example 2 (An imagined population)
All possible 15-watt fluorescent light bulbs that could be produced by a manufacturing plant.
Curriculum achievement objectives references
Statistical investigation: Levels 6, 7, (8)
- Population distribution
The variation in the values of a variable if data has (or had) been obtained for every individual in the population.
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
Probability: Levels (5), (6), (7), (8)
- Population mean
A measure of centre for a population distribution of a numerical variable. The population mean is the mean of all values of a numerical variable based on the collection of all objects or individuals of interest. It is the centre of mass of the values in a population distribution.
If the collection is finite then the population mean is obtained by adding all values in a set of values and then dividing this total by the number of values.
In many real situations the entire collection of values from a population is not available, for a variety of reasons. For example, the collection may be infinite or some objects or individuals may not be accessible. In such cases the value of the population mean is not known. The population mean may be estimated by taking a random sample of values from the population, calculating the sample mean and using this value as an estimate of the population mean.
The population mean is a number representing the centre of the population distribution and is therefore an example of a population parameter.
The Greek letter µ (mu) is the most common symbol for the population mean.
See: expected value (of a discrete random variable), mean, measure of centre
Curriculum achievement objectives references
Statistical investigation: Levels (6), (7), (8)
- Population parameters
A number representing a property of a population.
Common examples are the population mean, µ, the population proportion, π, and the population standard deviation, σ.
Population parameters, although fixed, are usually not known and are estimated by a statistic calculated from a random sample taken from the population. For example, a sample mean is an estimate of the population mean.
Curriculum achievement objectives references
Statistical investigation: Levels 7, (8)
- Population proportion
A part of a population with a particular attribute, expressed as a fraction, decimal or percentage of the whole population.
For a finite population, the population proportion is the number of members in the population with a particular attribute divided by the number of members in the population.
In many real situations the whole population is not available to be checked for the presence of an attribute. In such cases the value of the population proportion is not known. The population proportion may be estimated by taking a random sample from the population, calculating the sample proportion and using this value as an estimate of the population proportion.
The population proportion is a number representing a part of a population and is therefore an example of a population parameter.
The Greek letter π (pi) is a common symbol for the population proportion.
See: proportion
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
- Population standard deviation
A measure of spread for a population distribution of a numerical variable that determines the degree to which the values differ from the population mean. If many values are close to the population mean then the population standard deviation is small and if many values are far from the population mean then the population standard deviation is large.
The square of the population standard deviation is equal to the population variance.
In many real situations the collection of all values from a population is not available, for a variety of reasons. For example, the collection may be infinite or some objects or individuals may not be accessible. In such cases the value of the population standard deviation is not known. The population standard deviation may be estimated by taking a random sample of values from the population, calculating the sample standard deviation and using this value as an estimate of the population standard deviation.
The population standard deviation is a number representing the spread of the population distribution and is therefore an example of a population parameter.
The Greek letter σ (sigma) is the most common symbol for the population standard deviation.
See: measure of spread, population variance, standard deviation, standard deviation (of a discrete random variable)
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
- Population variance
A measure of spread for a population distribution of a numerical variable that determines the degree to which the values differ from the population mean. If many values are close to the population mean then the population variance is small and if many values are far from the population mean then the population variance is large.
The positive square root of the population variance is equal to the population standard deviation.
In many real situations the collection of all values from a population is not available, for a variety of reasons. For example, the collection may be infinite or some objects or individuals may not be accessible. In such cases the value of the population variance is not known. The population variance may be estimated by taking a random sample of values from the population, calculating the sample variance and using this value as an estimate of the population variance.
The population variance is a number representing the spread of a population and is therefore an example of a population parameter.
The square of the population standard deviation is equal to the population variance, so σ² (sigma squared) is the most common symbol for the population variance.
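The relationship between the population mean, variance and standard deviation can be illustrated with Python's `statistics` module. The data set below is hypothetical and is treated as a complete, finite population, which is the case where these parameters can be calculated directly rather than estimated.

```python
import statistics

# Hypothetical finite population: every value is available.
population = [2, 4, 4, 4, 5, 5, 7, 9]

mu = statistics.mean(population)                 # population mean
sigma_squared = statistics.pvariance(population)  # population variance
sigma = statistics.pstdev(population)            # population standard deviation

print("mean:", mu)
print("variance:", sigma_squared)
print("standard deviation:", sigma)

# The square of the standard deviation equals the variance.
print(math_check := abs(sigma ** 2 - sigma_squared) < 1e-9)
```

When only a random sample is available, `statistics.variance` and `statistics.stdev` give the corresponding sample statistics, which serve as estimates of the population parameters.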
See: measure of spread, variance, variance (of a discrete random variable)
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
- Position
Position may be described relative to many things. For example, latitude gives position relative to the equator. At Level One children might give their position relative to a nearby person or object using terms such as ‘behind’, ‘in front of’, ‘to the left of’, ‘to the right of’, ‘five steps away’ etc. At a more advanced level position could be described by coordinates on a map or on the Cartesian plane, or by latitude and longitude.
- Possible outcomes
When we perform a nondeterministic, or random, experiment we can consider the possible outcomes (the set of which is sometimes called the sample space). For example, when we roll a normal dice (or die) once, the possible outcomes are 1, 2, 3, 4, 5, or 6.
- Powers
Consider the number $4 \times 4 \times 4 \times 4 \times 4$. We can write this in the abbreviated form $4^5$. Just as $4 + 4 + 4 + 4 + 4$ can be written as $5 \times 4$, so $4 \times 4 \times 4 \times 4 \times 4$ can be written as $4^5$. $4^5$ is called a power. It is the fifth power of 4. It is usually read as "four to the fifth". For the power $a^p$, a is called the base and p the exponent. So 4 is the base and 5 is the exponent of the power $4^5$. The exponent is often loosely referred to as the power.
Rules for calculating with powers: (Note that these are all derivable from the definition of a power.)
$a^m \times a^n = a^{m+n}$
From the definition of $a^m$ and $a^n$ we can see that
$a^m \times a^n = \underbrace{(a \times a \times \dots \times a)}_{m \text{ times}} \times \underbrace{(a \times a \times \dots \times a)}_{n \text{ times}}$
$= \underbrace{(a \times a \times a \times a \times \dots \times a)}_{m+n \text{ times}}$
$= a^{m+n}$ (Example: $3^5 \times 3^7 = 3^{12}$)
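The rule just derived can also be checked numerically. The sketch below builds a power directly from the definition (repeated multiplication) and confirms that multiplying two powers of the same base adds the exponents. The helper name `power` is hypothetical, and a check over a few values is an illustration, not a proof.

```python
def power(a, p: int):
    """a multiplied by itself p times, following the definition of a power.

    Assumes p is a positive whole number.
    """
    result = a
    for _ in range(p - 1):
        result *= a
    return result


# The worked example: 3 to the 5th times 3 to the 7th equals 3 to the 12th.
assert power(3, 5) * power(3, 7) == power(3, 12)

# The same rule for a range of bases and exponents.
for a in (2, 3, 10):
    for m in range(1, 6):
        for n in range(1, 6):
            assert power(a, m) * power(a, n) == power(a, m + n)

print("product rule holds for every case checked")
```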
Similar reasoning gives the following results:
$a^m \div a^n = a^{m-n}$ (Example: $2^5 \div 2^2 = 2^3$)
$(a^m)^n = a^{mn}$ (that is, $a^{m \times n}$) (Example: $(3^2)^3 = 3^6$)
If $a \neq 0$ then $a^0 = 1$ (Example: $10^0 = 1$. Note that $10^0$ equals, for example, $10^2 \div 10^2$, which obviously equals 1.)
$(ab)^n = a^n b^n$ (Example: $(3 \times 2)^4 = 3 \times 2 \times 3 \times 2 \times 3 \times 2 \times 3 \times 2 = 3^4 \times 2^4$)
$a^{-n} = 1 \div a^n$ ($= 1/a^n$) (Example: $2^{-3} = 1/2^3 = 1/8$ because, for example, $2^2 \div 2^5 = (2 \times 2) \div (2 \times 2 \times 2 \times 2 \times 2) = 1 \div (2 \times 2 \times 2) = 1/8$.
But from above, $2^2 \div 2^5 = 2^{2-5} = 2^{-3}$. So $2^{-3} = 1/2^3 = 1/8$.)
- Powers with fractional exponents
See Roots.
- Precision (of an estimate)
A measure of how close an estimate is expected to be to the true value of a population parameter. This measure is based on the degree of similarity among estimates of a population parameter, if the same sampling method were repeated over and over again.
Curriculum achievement objectives references
Statistical investigation: Levels (7), 8
Statistical literacy: (Level 8)
- Prediction
An assessment of the value of a variable at some future point of time (for time-series data) or an assessment of the value of one variable based on knowing the value of the other variable (for bivariate numerical data).
Curriculum achievement objectives references
Statistical investigation: Levels 7, 8
- Prime numbers
A prime number is a natural number that has exactly two positive divisors, namely 1 and the number itself. So 2 is a prime because its only divisors are 1 and 2; 3 and 5 are primes for the same reason, that their only divisors are 1 and themselves. But 4 is not a prime because 1, 2, and 4 are all divisors of 4, so 4 has three divisors and is therefore not a prime. The first nine primes are 2, 3, 5, 7, 11, 13, 17, 19, and 23. There are infinitely many primes, that is, there is no ‘last’ prime number. Note that zero is not a prime number since every natural number divides zero, and one is not a prime number since it has only one divisor, namely itself.
The simplest means of sifting out the positive prime numbers from the natural numbers is to use the sieve of Eratosthenes.
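A minimal Python sketch of the sieve of Eratosthenes follows. The function name and the choice of 23 as the upper limit are assumptions made for the example; the idea is to cross out the multiples of each prime in turn so that only primes remain.

```python
def sieve_of_eratosthenes(limit: int) -> list[int]:
    """Return all primes less than or equal to limit."""
    if limit < 2:
        return []
    # is_prime[i] stays True until i is crossed out as a multiple of a prime.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False  # zero and one are not prime
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Smaller multiples of n were already crossed out by smaller primes.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]


print(sieve_of_eratosthenes(23))  # [2, 3, 5, 7, 11, 13, 17, 19, 23]
```

The printed list matches the first nine primes given in the definition.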
Integers that are not 0, ±1, or prime, are called composite.
- Prism
A prism is a polyhedron with two congruent and parallel faces (called the bases) whose remaining faces (called the lateral faces) are parallelograms. So a prism is a portion of space enclosed by polygons, with specific properties. Prisms are named after the shape of their base faces. For example, if the bases are pentagons then the prism is a pentagonal prism.
If the lateral faces are all perpendicular to the base then the prism is called a right prism. Hence a cuboid is a right rectangular prism, since the base faces are both rectangular. The volume of a prism is the product of the area of a base polygon and the altitude (or vertical height) of the prism.
- Probability
A number that describes the likely occurrence of an event, measured on a scale from 0 (impossible event) to 1 (certain event).
Curriculum achievement objectives references
Statistical literacy: Level 6
Probability: Levels 4, 5, 6, 7, 8
- Probability activity
An activity that has a number of possible outcomes, none of which is certain to occur when a trial of the activity is performed.
Curriculum achievement objectives references
Statistical literacy: Levels 1, 2, 3, 4, 5
Probability: (All levels)
- Probability density function (for a continuous random variable)
A mathematical function that provides a model for the probability that a value of a continuous random variable lies within a particular interval. It is often useful to display this function as a graph, in which case this probability is the area between the graph of the function and the x-axis, bounded by the particular interval.
A probability density function has two further important properties:
1. Values of a probability density function are never negative for any value of the random variable.
2. The area under the graph of a probability density function is 1.
The use of ‘density’ in this term relates to the height of the graph. The height of the probability density function represents how closely the values of the random variable are packed at places on the x-axis. At places on the x-axis where the values are closely packed (dense) the height is greater than at places where the values are not closely packed (sparse).
More formally, probability density represents the probability per unit interval on the x-axis.
Example
Let X be a random variable with a normal distribution with a mean of 50 and a standard deviation of 15. The graph below shows the probability density function of X.
On the diagram below the shaded area equals the probability that X is between 15 and 30, i.e.,
P(15 < X < 30).
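The shaded area in this example can be approximated numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the variable names are hypothetical. The area between 15 and 30 is the difference between the cumulative probabilities at those two points.

```python
from statistics import NormalDist

# X has a normal distribution with mean 50 and standard deviation 15.
X = NormalDist(mu=50, sigma=15)

# Area under the density curve between 15 and 30.
probability = X.cdf(30) - X.cdf(15)
print(round(probability, 4))

# The density itself is never negative, e.g. its height at the mean:
print(X.pdf(50) > 0)
```

For a continuous random variable it makes no difference whether the end points 15 and 30 are included, since a single value has probability zero.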
Curriculum achievement objectives references
Probability: Levels (7), (8)
- Probability function (for a discrete random variable)
A mathematical function that provides a model for the probability of each value of a discrete random variable occurring.
For a discrete random variable that has a finite number of possible values, the function is sometimes displayed as a table, listing the values of the random variable and their corresponding probabilities.
A probability function has two important properties:
1. For each value of the random variable, values of a probability function are never negative, nor greater than 1.
2. The sum of the values of a probability function, taken over all of the values of the random variable, is 1.
Example 1
Let X be a random variable with a binomial distribution with n = 6 and π = 0.4. The probability function for random variable X is:
Probability of x successes in 6 trials, $P(X = x) = \binom{6}{x}\, 0.4^{x}\, 0.6^{6-x}$
for x = 0, 1, 2, 3, 4, 5, 6
where $\binom{n}{x}$ is the number of combinations of n objects taken x at a time.
A graph of this probability function is shown below.
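In place of the graph, the values of this probability function can be tabulated with a short Python sketch. It assumes the standard binomial probability function, where the number of combinations of n objects taken x at a time is given by `math.comb(n, x)`; the function name `binomial_pmf` is hypothetical.

```python
from math import comb


def binomial_pmf(x: int, n: int, pi: float) -> float:
    """Probability of x successes in n independent trials, each with success probability pi."""
    return comb(n, x) * pi ** x * (1 - pi) ** (n - x)


n, pi = 6, 0.4
probabilities = [binomial_pmf(x, n, pi) for x in range(n + 1)]
for x, p in enumerate(probabilities):
    print(x, round(p, 4))

# Both properties of a probability function can be checked directly.
print(all(0 <= p <= 1 for p in probabilities))
print(abs(sum(probabilities) - 1) < 1e-12)
```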
Example 2
Imagine a probability activity in which a fair die is rolled and the number facing upwards is recorded. Let random variable X represent the result of any roll.
The probability function for random variable X can be written as:
x          1    2    3    4    5    6
P(X = x)   1/6  1/6  1/6  1/6  1/6  1/6
Alternative: probability model
See: model
Curriculum achievement objectives reference
Probability: (Level 8)
- Probability of an event
Probability is the study of random events – events in which the outcome is not fixed. For example, if an experiment is to toss a fair coin twice and count the number of heads obtained, then an event could be that two heads occur. We can then ask, "What is the probability of that event occurring"? The probability of any event is a number between 0 and 1 inclusive. If the probability of an event is zero then the event is impossible and if the probability of an event is one then the event is certain. Hence the probability of an event can be described as a number that tells us how likely it is that the event will occur.
Some properties are:
- Conditional probability. The conditional probability of A given B, written P(A|B), is the probability that A occurs given that B occurs and is equal to the probability of A and B both occurring divided by the probability of B. This can be written as P(A|B) = P(A∩B) / P(B).
- This can also be written as P(A∩B) = P(B) × P(A|B) and is referred to as the general multiplication rule.
- If A and B are independent events, that is P(A|B) = P(A), then the above rule becomes P(A∩B) = P(A) × P(B) and is referred to as the special multiplication rule.
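These rules can be illustrated by listing every outcome of rolling two fair dice. The events chosen below (sum is 8, first die is even, second die is 6) and all function names are hypothetical examples, and exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
sample_space = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """Probability of an event, given as a true/false test on an outcome."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))


def sum_is_8(o):
    return o[0] + o[1] == 8


def first_even(o):
    return o[0] % 2 == 0


def second_is_6(o):
    return o[1] == 6


# Conditional probability: P(A|B) = P(A and B) / P(B).
p_a_and_b = prob(lambda o: sum_is_8(o) and first_even(o))
p_a_given_b = p_a_and_b / prob(first_even)
print("P(sum is 8 | first die even) =", p_a_given_b)

# General multiplication rule: P(A and B) = P(B) x P(A|B).
print(p_a_and_b == prob(first_even) * p_a_given_b)

# Special multiplication rule for independent events.
p_c_and_d = prob(lambda o: first_even(o) and second_is_6(o))
print(p_c_and_d == prob(first_even) * prob(second_is_6))
```

The first pair of events is not independent, since P(sum is 8) differs from P(sum is 8 | first die even), so only the general rule applies to it.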
- Product
The product of two numbers is the result obtained when the numbers are multiplied together. For example, the product of 4 and 5 is 20 because 4 x 5 = 20.
- Properties of addition and subtraction with whole numbers
See Addition and Subtraction.
- Properties of multiplication and division with whole numbers
See Multiplication and Division.
- Proportion
A part of a distribution with a particular attribute, expressed as a fraction, decimal or percentage of the whole distribution.
See: sample proportion
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), (7), 8
- Proportionality
See Direct and indirect relationships with linear proportions.
- Pyramid
A pyramid is a polyhedron whose base is a polygon and whose other faces are triangles with a common vertex. Pyramids are described in terms of the base polygon, for example, a triangular pyramid, hexagonal pyramid etc. A regular pyramid is a pyramid that has a regular polygon for a base and whose altitude meets the base at its centre. The volume of a pyramid is one third of the product of the base area and the vertical height. So for a pyramid, V = 1/3 x b x h where V is the volume, b is the area of the base and h is the vertical height.
- Pythagoras' Theorem
The Theorem of Pythagoras states that in a right triangle, the square of the length of the hypotenuse is equal to the sum of the squares of the lengths of the two other sides.
Hence we can state the theorem as:
In a right triangle, the area of the square constructed on the hypotenuse is equal to the total area of the squares constructed on each of the other two sides.
This suggests a discovery approach to Pythagoras’ Theorem by drawing a right triangle and constructing squares on the sides of the triangle. The two smaller squares may then be cut out and shown to cover the area of the largest square.
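The area statement can also be checked with numbers. In this Python sketch the side lengths are hypothetical examples and the function name `hypotenuse` is an assumption; it compares the area of the square on the hypotenuse with the combined area of the squares on the other two sides.

```python
import math


def hypotenuse(a: float, b: float) -> float:
    """Length of the hypotenuse of a right triangle with shorter sides a and b."""
    return math.sqrt(a * a + b * b)


for a, b in ((3, 4), (5, 12), (8, 15)):
    c = hypotenuse(a, b)
    # Area of the largest square compared with the two smaller squares together.
    print(f"sides {a} and {b}: hypotenuse {c}, areas {c * c} and {a * a + b * b}")
```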
There are many different proofs of Pythagoras’ Theorem. One of the simplest is Bhaskara’s proof.
q
- Quadratic equations
Quadratic equations are equations of the form $y = ax^2 + bx + c$ where a, b and c are real numbers and a ≠ 0. This is a polynomial equation of the second degree, because the greatest exponent of its powers is two. $ax^2$ is the quadratic term, bx the linear term and c the constant term. When graphed on the Cartesian plane its graph forms a parabola. The simplest quadratic equations are the pure quadratic equations such as $x^2 - 4 = 0$.
If $x^2 - 4 = 0$ then $x^2 = 4$, so x ∈ {2, -2}.
Quadratic equations arise in many situations in the real world, such as in the following measurement problem:
A carpet is 6 metres longer than it is wide and has an area of 27 square metres. What are the dimensions of the carpet?
If the length of the carpet is x metres then the width is x - 6 metres and we have the quadratic equation $x(x - 6) = 27$, which rearranges to $x^2 - 6x - 27 = 0$. There are three ways to solve this equation:
- By factorising it:
$x^2 - 6x - 27 = 0$
So $(x - 9)(x + 3) = 0$. If the product of two numbers is zero then at least one of them is zero. So x - 9 = 0 or x + 3 = 0. So x = 9 or x = -3.
A length cannot be negative, so x = 9. The carpet is 9 metres long by 3 metres wide.
- By completing the square:
$x^2 - 6x - 27 = 0$
$x^2 - 6x = 27$
We see that $(x - 3)^2 = x^2 - 6x + 9$
So $x^2 - 6x + 9 = 27 + 9$
$(x - 3)^2 = 36$
$x - 3 = \pm 6$
x ∈ {9, -3}
- By using the general solution, which is that if $ax^2 + bx + c = 0$ then
$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
So for $x^2 - 6x - 27 = 0$ we have:
$x = \dfrac{6 \pm \sqrt{36 + 108}}{2}$
$= \dfrac{6 \pm 12}{2}$
$= 9, -3$
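The general solution can be sketched in Python and applied to the carpet problem. The function name `solve_quadratic` is hypothetical, and the sketch returns real solutions only, so an equation with a negative value under the square root yields an empty list.

```python
import math


def solve_quadratic(a: float, b: float, c: float) -> list[float]:
    """Real solutions of a*x**2 + b*x + c = 0, smallest first."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []  # no real solutions
    root = math.sqrt(discriminant)
    # A set removes the duplicate when both solutions coincide.
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})


solutions = solve_quadratic(1, -6, -27)
print(solutions)  # [-3.0, 9.0]

# Only a positive solution makes sense as a length.
length = max(solutions)
width = length - 6
print(length, width, length * width)  # 9.0 3.0 27.0
```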
- Quadrilateral
A quadrilateral is a four-sided plane figure bounded by straight lines. Some specific quadrilaterals are the parallelogram, rectangle, square, rhombus and trapezium.
- Qualitative data
Data in which the values can be organised into distinct groups. These distinct groups must be chosen so they do not overlap and so that every value belongs to one and only one group, and there should be no doubt as to which one.
Example
The eye colours of a class of Year 9 students.
Note: The Curriculum usage of category data is equivalent to qualitative data.
See: category data
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Qualitative variable
A property that may have different values for different individuals and for which these values can be organised into distinct groups. These distinct groups must be chosen so they do not overlap and so that every value belongs to one and only one group, and there should be no doubt as to which one.
Example
The eye colours of a class of Year 9 students.
Note: The Curriculum usage of category variable is equivalent to qualitative variable.
See: category variable
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Quantitative data
Data in which the values result from counting or measuring. Measurement data are quantitative, as are whole-number data.
Alternative: numerical data
See: measurement data, whole-number data
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Quantitative variable
A property that may have different values for different individuals and for which these values result from counting or measuring. Measurement variables are quantitative, as are whole-number variables.
Alternative: numerical variable
See: measurement variable, whole-number variable
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Quarter turn
A rotation through 90°. A quarter of a complete rotation.
- Quartiles
Numbers separating an ordered distribution into four groups, each containing (as closely as possible) equal numbers of values. The most common names for these three numbers, in order from lowest to highest, are lower quartile, median and upper quartile.
The lower quartile is a number that is a quarter of the way through the ordered distribution, from the lower end. The upper quartile is a number that is a quarter of the way through the ordered distribution, from the upper end.
There are several different methods for calculating quartiles. For reasonably small data sets it is recommended that the values are sorted into order (or displayed on a suitable graph) and then the median is calculated. This allows the distribution to be split into a ‘lower half’ and an ‘upper half’. The lower quartile is the median of the ‘lower half’ and the upper quartile is the median of the ‘upper half’. Use software for large data sets.
Note that different software may use different methods for calculating quartiles that may give different values for the quartiles. This is of no concern because in most cases any differences will be slight.
Example 1 (Odd number of values)
The maximum temperatures, in degrees Celsius (°C), in Rolleston for the first 9 days in November 2008 were: 18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6
Ordered values: 17.8, 17.8, 18.1, 18.6, 18.7, 19.4, 19.6, 19.9, 20.6
The median is 18.7°C.
The values in the ‘lower half’ are 17.8, 17.8, 18.1, 18.6. Their median is the mean of 17.8 and 18.1, which is 17.95. The lower quartile is 17.95°C.
The values in the ‘upper half’ are 19.4, 19.6, 19.9, 20.6. Their median is the mean of 19.6 and 19.9, which is 19.75. The upper quartile is 19.75°C.
The data and the quartiles are displayed on the dot plot below.
Notice that there are 2 values below the lower quartile, 2 values between the lower quartile and the median, 2 values between the median and the upper quartile and 2 values above the upper quartile.
Example 2 (Even number of values)
The maximum temperatures, in degrees Celsius (°C), in Rolleston for the first 10 days in November 2008 were: 18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6, 18.8
Ordered values: 17.8, 17.8, 18.1, 18.6, 18.7, 18.8, 19.4, 19.6, 19.9, 20.6
The median is 18.75°C.
The values in the ‘lower half’ are 17.8, 17.8, 18.1, 18.6, 18.7. Their median is 18.1. The lower quartile is 18.1°C.
The values in the ‘upper half’ are 18.8, 19.4, 19.6, 19.9, 20.6. Their median is 19.6. The upper quartile is 19.6°C.
The data and the quartiles are displayed on the dot plot below.
Notice that there are 2 values below the lower quartile, 2 values between the lower quartile and the median, 2 values between the median and the upper quartile and 2 values above the upper quartile.
See: median
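The 'median of each half' method used in the two examples can be sketched in Python as follows. This is an illustrative sketch only: the function name quartiles is an assumption, and, as noted above, software packages may use other methods that give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, median, upper quartile) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # With an odd number of values the middle value belongs to neither half.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)

nine_days = [18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6]
ten_days = nine_days + [18.8]

# Rounded to 2 decimal places to hide floating-point noise.
print([round(q, 2) for q in quartiles(nine_days)])  # [17.95, 18.7, 19.75]
print([round(q, 2) for q in quartiles(ten_days)])   # [18.1, 18.75, 19.6]
```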
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
- Random sample
A sample in which all objects or individuals in the population have the same probability of being chosen in the sample.
A random sample can also be a number of independent values from the same theoretical distribution, without involving a real population.
See: simple random sample
Curriculum achievement objectives references
Statistical investigation: Levels 6, 7, (8)
- Random sampling
The process of selecting a random sample.
Curriculum achievement objectives references
Statistical investigation: Levels 6, 7, (8)
- Random variable
A property that can have different values because there is an element of chance involved in obtaining any value for the property.
A random variable is often represented by an upper case letter, say X.
Example 1
The random selection of an individual from a population is subject to chance. The height of a selected individual will depend on the individual selected and is therefore a random variable. This may be written as, let X represent the height of a randomly selected individual.
Example 2
The random selection of 10 individuals from a population is subject to chance. The number of left-handed people in a sample of 10 individuals will depend on the individuals selected and is therefore a random variable. This may be written as, let X represent the number of left-handed people in a random selection of 10 individuals.
See: continuous random variable, discrete random variable
Curriculum achievement objectives references
Probability: Levels (7), 8
- Randomisation
The use of methods involving elements of chance, such as random numbers, to allocate individual units to groups.
Randomisation used in data collection
Randomisation is used in experiments by using methods involving elements of chance to allocate individual units to treatment groups. Further detail is provided in the paragraph on randomisation in the description of experimental design principles.
Randomisation forms the basis of many sampling methods, including random sampling, simple random sampling, cluster sampling and stratified sampling.
Randomisation used in statistical inference
Randomisation is used at Level Eight in a resampling method for making statistical inferences from data. This method is illustrated in the following two examples that compare two means. A summary of the method is provided after these two examples.
Example 1
An assertion is made that male University of Auckland students tend to reach faster driving speeds than female students. To investigate this, random samples of 20 male and 20 female University of Auckland students were asked how fast they had driven, to the nearest 10km/h.
The values obtained were:
Males: 130 120 120 140 140 120 120 120 170 160 110 150 210 240 200 140 150 200 240 140
Females: 100 170 140 120 120 120 120 90 120 100 130 120 120 130 120 120 110 100 130 120
The sample means are x̄_M = 156.0 km/h for the males and x̄_F = 120.0 km/h for the females. From this data, an estimate of the difference between the population mean fastest speeds for males and females is x̄_M - x̄_F = 36.0 km/h.
A dot plot of the data is shown below.
Does the data provide any evidence to support the assertion? In other words, could a difference as big as the observed difference of 36.0 be produced just by chance?
If the numbers are considered as just showing the natural variability in fastest driving speeds among such university students, then would random allocation of the speeds from these two samples to the male and female groups often produce a difference in sample means as big as 36.0? If random allocation alone could easily produce a difference as big as 36.0 then the data cannot be interpreted as support that the mean fastest driving speed for males is greater than the mean for females.
The speeds from the two samples are combined and 20 of them are randomly allocated as speeds for the males, leaving the other 20 as speeds for the females. This is equivalent to assuming there is no link between fastest driving speed and gender. The difference in the sample means is an estimate produced by sampling variation alone. One such randomisation is shown below.
The sample means are x̄_M = 139.5 km/h (males) and x̄_F = 136.5 km/h (females), with x̄_M - x̄_F = 3.0 km/h.
Another random allocation of 20 (of the 40) as speeds for the males, leaving the other 20 as speeds for the females is shown below.
The sample means are x̄_M = 132.5 km/h (males) and x̄_F = 143.5 km/h (females), with x̄_M - x̄_F = -11.0 km/h.
Continuing this process for a total of 100 such random allocations produced differences in sample means shown in the dot plot below.
Of the 100 differences produced by sampling variation alone, none was as large as the observed difference of 36.0 km/h produced by the two samples. This shows that a difference of 36.0 km/h or larger is very unlikely to be produced by sampling variation alone when there is no link between fastest driving speed and gender. It can be concluded that the data provide very strong evidence that the mean fastest driving speed for male University of Auckland students is greater than that for females.
Note that the assertion was that the mean fastest driving speed for males is greater than that for females and so only positive differences of 36.0 or more were considered when forming the conclusion.
Example 2
Question: Is there a difference in the average daily number of text messages sent by male and female University of Auckland students? To investigate this, random samples of 20 male and 20 female University of Auckland students were asked how many text messages they typically sent in a day.
The values obtained were:
Males: 40 10 30 20 5 0 1 30 30 10 30 3 6 50 20 30 20 50 10 30
Females: 20 2 50 30 15 0 6 60 10 5 100 15 40 3 30 15 100 5 5 50
The sample means are x̄_M = 21.25 messages per day for the males and x̄_F = 28.05 messages per day for the females. From this data, an estimate of the difference between the population mean daily number of text messages for males and females is x̄_M - x̄_F = -6.80 messages per day.
A dot plot of the data is shown below.
Does the data provide any evidence of a difference in the average daily number of text messages sent by male and female students? In other words, could a difference as big as the observed difference of –6.80 be produced just by chance?
If the numbers are considered as just showing the natural variability in daily numbers of text messages sent among such university students, then would random allocation of the number of messages from these two samples to the male and female groups often produce a difference as big as –6.80? If random allocation alone could easily produce a difference in sample means as big as –6.80 then the data cannot be interpreted as support that the mean daily number of text messages sent is different for males and females.
The numbers of messages from the two samples are combined and 20 of them are randomly allocated as daily numbers of text messages sent for the males, leaving the other 20 as daily numbers of text messages sent for the females. This is equivalent to assuming there is no link between daily number of text messages sent and gender. The difference in the sample means is an estimate produced by sampling variation alone. One such randomisation is shown below.
The sample means are x̄_M = 31.45 (males) and x̄_F = 17.85 (females), with x̄_M - x̄_F = 13.60.
Another random allocation of 20 (of the 40) as daily numbers of text messages sent for the males, leaving the other 20 as daily numbers of text messages sent for the females is shown below.
The sample means are x̄_M = 33.80 (males) and x̄_F = 15.50 (females), with x̄_M - x̄_F = 18.30.
Continuing this process for a total of 100 such random allocations produced differences in sample means shown in the dot plot below.
Of the 100 differences produced by sampling variation alone, 52 (52%) were at least as far from zero as the difference of –6.80 produced by the two samples. This shows that a difference of –6.80 is a typical value produced by sampling variation alone when there is no link between daily number of text messages sent and gender. It can be concluded that the data provide no evidence that the mean daily number of text messages sent by male University of Auckland students is different from that for female students.
Note that the question was about a difference between the means for males and females and so positive and negative differences that are at least 6.80 from zero were considered when forming the conclusion.
Note: The principles explained in the above examples can also be applied to a difference between two proportions for category variables and to a slope of a fitted regression line for bivariate measurement variables.
Summary
Data are collected to investigate an assertion or a question, usually involving a comparison of a numerical variable between two categories of a category variable (i.e., that there is a link between the numerical variable and the category variable). An estimate of a population parameter is calculated from the data. This observed estimate is often a difference between means (as in the two examples) but could be a difference between two proportions or a slope of a fitted regression line.
Could an estimate as big as the observed estimate be produced just by chance?
To answer this question, the effect of sampling variation alone on the estimate needs to be considered when it is assumed that there is no link between the two variables. If random allocation alone could easily produce an estimate as big as the observed estimate then the data cannot be interpreted as support for the existence of a link between the two variables. Values of the numerical variable obtained from the data collection are randomly allocated to the two categories of the category variable. An estimate is calculated from this ‘resampling using randomisation’ process. This process is repeated many times to form a distribution of estimates under sampling variation alone.
By comparing the observed estimate with the distribution of estimates, an assessment can be made of the strength of evidence the data provide for the assertion or provide for a conclusion to the question. This assessment is made by looking at the percentage of estimates under sampling variation alone that are at least as far from zero as the observed estimate.
Why are estimates that are at least as far from zero as the observed estimate considered?
If the observed estimate is a typical value of an estimate produced by sampling variation alone then it is quite believable there is no link between the variables. In this case, a relatively large percentage of estimates produced by sampling variation alone will be at least as far from zero as the observed estimate.
If the observed estimate is not a typical value of an estimate produced by sampling variation alone then it is difficult to believe there is no link between the variables. In this case, a relatively small percentage of estimates produced by sampling variation alone will be at least as far from zero as the observed estimate.
This makes the percentage of estimates produced by sampling variation alone that are at least as far from zero as the observed estimate an appropriate measure of the strength of evidence that there is a link between the two variables, with smaller percentages providing stronger evidence of a link.
Estimates produced by sampling variation alone are usually values of a continuous random variable (but the values are rounded). Because of the continuous nature of the values of these estimates, the percentage of estimates that are the same distance from zero as the observed estimate will almost always be small, causing this to be an inappropriate measure of the strength of evidence of a link.
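The resampling process summarised above can be sketched in Python using the driving-speed data from Example 1. This is an illustrative sketch, not the software used to produce the examples: the function names, the choice of 1000 random allocations and the fixed seed are all assumptions, so the reallocated differences will not match those shown in the examples.

```python
import random

males = [130, 120, 120, 140, 140, 120, 120, 120, 170, 160,
         110, 150, 210, 240, 200, 140, 150, 200, 240, 140]
females = [100, 170, 140, 120, 120, 120, 120, 90, 120, 100,
           130, 120, 120, 130, 120, 120, 110, 100, 130, 120]

def mean(values):
    return sum(values) / len(values)

def randomisation_differences(group_a, group_b, repeats, rng):
    """Randomly reallocate the pooled values to two groups and record mean(a) - mean(b)."""
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    differences = []
    for _ in range(repeats):
        rng.shuffle(pooled)  # one random allocation of all values
        differences.append(mean(pooled[:size_a]) - mean(pooled[size_a:]))
    return differences

observed = mean(males) - mean(females)
rng = random.Random(1)  # fixed seed (an assumption) so the run is repeatable
diffs = randomisation_differences(males, females, 1000, rng)

# One-sided assertion (males faster): count differences of at least the observed size.
proportion_as_large = sum(d >= observed for d in diffs) / len(diffs)
print(observed)             # 36.0
print(proportion_as_large)  # a very small proportion is expected
```

For a question about a difference in either direction, as in Example 2, the comparison would instead count differences with abs(d) >= abs(observed).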
See: resampling, strength of evidence
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), 8
- Range
A measure of spread for a distribution of a numerical variable that is calculated as the difference between the largest and smallest values in the distribution.
The range is less useful than other measures of spread because it is strongly influenced by the presence of just one unusually large or small value; hence the range conveys only one aspect of the spread of the distribution. It is recommended that a graph of the distribution is used to check the appropriateness of the range as a measure of spread and to emphasise its meaning as a feature of the distribution.
Example
The maximum temperatures, in degrees Celsius (°C), in Rolleston for the first 10 days in November 2008 were: 18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6, 18.8
The largest value is 20.6°C and the smallest is 17.8°C.
The range of the maximum temperatures over these 10 days is 20.6°C – 17.8°C = 2.8°C
The data and the range are displayed on the dot plot below.

See: measure of spread
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
- Rate
A rate is a comparison of two different types of quantity or attribute. For example, 6 km / hour is a rate because it involves a comparison of distance with time. $5-00 per metre is a rate because it involves the comparison of money and length. This is close to the concept of ratio, which usually is used to make comparisons within the same attribute. A unit rate is a rate that is simplified so that it gives a measure of the first attribute for each unit of the second attribute. For example, if New Zealand’s 4,000,000 people are represented by 120 members of parliament, then that is a unit rate of approximately 33,333 people per MP.
$5-00 per metre is usually written as $5-00/m which emphasises the fact that a rate is a fraction.
- Rate of change and graphs
The gradient of a straight-line graph is a measure of the slope of the line. Usually given the letter m, it is found by taking any two points on the line and putting
m = (vertical distance between the points) / (horizontal distance between the points)
m can also be defined as the tangent of the angle θ where θ is the smallest non-negative angle that the line makes with the positive end of the x-axis.
The gradient of a curve at a point P is the gradient of the tangent to the curve at P. The gradient at a point P is a measure of the rate of change of the variable on the vertical axis (usually the y-axis) with respect to the variable on the horizontal axis (usually the x-axis). For example, if the gradient is 3 then the vertical variable is increasing by 3 units for every 1 unit that the horizontal variable increases.
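As an illustration, the gradient of a line through two points can be computed directly, and the gradient of a curve at a point can be approximated numerically. The sketch below is in Python; the function names are illustrative, and the central-difference approximation with step h is an added technique for estimating the gradient of the tangent, not something defined in this glossary.

```python
def gradient(point_1, point_2):
    """Gradient of the straight line through two points (x, y)."""
    (x1, y1), (x2, y2) = point_1, point_2
    if x2 == x1:
        raise ValueError("the line is vertical, so its gradient is undefined")
    return (y2 - y1) / (x2 - x1)

def estimate_curve_gradient(f, x, h=1e-6):
    """Approximate the gradient of the curve y = f(x) at x by a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Line through (1, 2) and (3, 8): y rises by 6 while x increases by 2.
print(gradient((1, 2), (3, 8)))  # 3.0

# The curve y = x^2 at x = 1.5 has gradient close to 3.
print(round(estimate_curve_gradient(lambda x: x ** 2, 1.5), 3))  # 3.0
```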
- Ratio
A ratio is an expression of the relationship between two measures of the same attribute. Usually written as a:b it expresses how much of a and b are consistently combined in a whole, e.g. the ratio of weedkiller to water is 1:100, or the relationship of a to b, e.g. the scale on a map is 1 cm:100m. Common use distinguishes rates, which have two measurements of different units, e.g. kilometres per hour, and ratios, which have two or more measures of the same attribute. Both imply a division, although a ratio is not usually expressed in the form of a decimal fraction. As an example, suppose 15 lollies are shared between Molly and Dolly in the ratio 2:3. That means that for every 2 lollies that Molly receives, Dolly receives 3. So Molly gets 2 lollies out of every 5 (or 2/5 of the 15 lollies) and Dolly gets 3 lollies out of every 5 (or 3/5 of the 15 lollies).
In situations in which the ratio describes the composition of a whole, four fractional relationships exist. For example, in the ratio a:b, a/(a+b) gives the fraction of the whole made up by a. Similarly, b/(a+b) gives the fraction of the whole made up by b. So for a bag containing jelly beans in the ratio 3 black:5 red, 3/8 of the jellybeans are black and 5/8 are red.
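The sharing and part-whole calculations above can be sketched in Python using exact fractions. The function names share_in_ratio and part_fractions are illustrative assumptions.

```python
from fractions import Fraction

def part_fractions(parts):
    """Fraction of the whole made up by each part of a ratio, e.g. 3:5 -> 3/8 and 5/8."""
    whole = sum(parts)
    return [Fraction(part, whole) for part in parts]

def share_in_ratio(total, parts):
    """Share a total amount in the given ratio."""
    return [total * fraction for fraction in part_fractions(parts)]

# 15 lollies shared between Molly and Dolly in the ratio 2:3
print(share_in_ratio(15, [2, 3]))  # [Fraction(6, 1), Fraction(9, 1)]

# Jelly beans in the ratio 3 black : 5 red
print(part_fractions([3, 5]))      # [Fraction(3, 8), Fraction(5, 8)]
```

So Molly receives 6 lollies and Dolly receives 9.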
In the ratio a:b, a/b describes the multiplicative relationship between the amount of a and the amount of b. Similarly this is b/a for the multiplicative relationship b to a. So for the jellybean example above, there are 3/5 as many black jellybeans as red, and there are 5/3 as many red jellybeans as black.
- Rational algebraic expressions
A rational algebraic expression is an expression of the form:
f(x)/g(x) where f(x) and g(x) are algebraic expressions and g(x) is not zero.
x/(x-1) is a rational algebraic expression.
x/(x-1)=2 is a rational algebraic equation. Multiplying both sides of the equation by x-1 gives x=2x-2 so x=2.
Rational algebraic expressions should be manipulated in the same way that rational numbers are manipulated.
For example, when adding or subtracting, a common denominator needs to be found as shown:
5/(x+1) + 3/(x-2) = 5(x-2)/[(x+1)(x-2)] + 3(x+1)/[(x+1)(x-2)] = (8x-7)/(x²-x-2)
Care is needed to ensure that the denominator is not zero, since if it is zero the expression becomes meaningless.
- Rational numbers
The rational numbers, often given the label Q, are the numbers that can be written as fractions, that is in the form a/b where a and b are integers and b ≠ 0. (b cannot be equal to zero because division by zero is meaningless). The decimal form of all rational numbers is a repeating or a terminating decimal and all repeating or terminating decimals are rational numbers and can be written in the form a/b.
- Re-categorising data
The redefining of a variable in some way or the derivation of a new variable from one or more existing variables.
Example 1 (Redefining categories of a category variable)
Consider this question from a questionnaire:
From the given list of types of movies, select the one type that you like best.
The data could initially be classified into the listed categories. If some categories had a relatively low frequency then it would be appropriate to re-categorise the data by combining some categories of a similar nature. This is called aggregation. For example, Horror, Mystery and Thriller could be aggregated to form a ‘Suspense’ category. Alternatively, if the ‘Other’ category had a relatively high frequency then the specified responses may suggest some additional categories which could then be used to re-categorise the data.
Example 2 (Expressing the values of a numerical variable in a simpler way)
The values of a variable ‘Time’ could initially be recorded as the time from a stop-watch, say 2h 4m 32.4s. For explanation and analysis all values need to be converted to the time in seconds; 7472.4s for the value above.
Example 3 (Deriving new variables from an existing variable)
From a variable ‘Date of Birth’ several new variables could be formed, such as ‘Age in Completed Years’, ‘Year of Birth’, ‘Day of Birth’, ‘Month of Birth’ or ‘Star Sign’.
Example 4 (Deriving a new variable from existing variables)
From the variables ‘Height’ and ‘Weight’ a new variable ‘Body Mass Index’ can be formed by calculating Weight/Height², provided that weight is recorded in kilograms and height is recorded in metres.
Example 5 (Deriving a new variable from existing variables)
From the variables ‘Total Weekly Leisure Time’ and ‘Weekly Time Playing Sport’ a new variable ‘Percentage Sport Time’ can be formed by (Weekly Time Playing Sport / Total Weekly Leisure Time) × 100.
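Examples 2, 4 and 5 each turn recorded values into a new variable by a simple calculation. A minimal Python sketch of these calculations follows; the function names and the sample inputs other than 2h 4m 32.4s are made up for illustration.

```python
def to_seconds(hours, minutes, seconds):
    """Express a stop-watch time in seconds (Example 2)."""
    return hours * 3600 + minutes * 60 + seconds

def body_mass_index(weight_kg, height_m):
    """Weight / Height^2, with weight in kilograms and height in metres (Example 4)."""
    return weight_kg / height_m ** 2

def percentage_sport_time(sport_time, leisure_time):
    """Weekly sport time as a percentage of total weekly leisure time (Example 5)."""
    return sport_time / leisure_time * 100

print(round(to_seconds(2, 4, 32.4), 1))        # 7472.4
print(round(body_mass_index(70.0, 1.75), 1))   # 22.9 (hypothetical person)
print(percentage_sport_time(5, 20))            # 25.0 (hypothetical hours)
```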
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), (7), (8)
- Real numbers
The set of real numbers is the union of the set of rational numbers and the set of irrational numbers. Hence it is all the numbers on the number line.
- Reciprocal
The reciprocal of a number a is its multiplicative inverse, that is, the number which when multiplied by a gives 1 as the answer. (1 is the multiplicative identity element because multiplication of any number by 1 leaves the number unchanged.) So the reciprocal of 2 is 1/2, since 2 × 1/2 = 1
The reciprocal of 2/3 is 3/2 since 2/3 × 3/2 = 1
- Rectangle
A rectangle is a four-sided polygon with opposite sides equal in length and all interior angles right angles (that is 90°). The area of a rectangle is the product of two adjacent sides. For example, if the sides of a rectangle are 8 cm and 6 cm then the area is 48 cm². The area of a rectangle can be effectively explored using grid paper.
- Reflection
A reflection in the plane has the effect of transforming an object in the plane onto its mirror image. Thus under reflection in the plane a figure in the plane is effectively flipped over a fixed line in the plane. The points on this line (called the mirror line or line of reflection) are the only fixed (invariant) points of the transformation. A reflection in space has a plane of reflection, that is, a plane of points that are invariant under the transformation of reflection. A reflection is a shape-preserving (isometric) transformation.
- Regression line
A line that summarises the linear relationship (or linear trend) between the two variables in a linear regression analysis, from the bivariate data collected.
A regression line is an estimate of the line that describes the true, but unknown, linear relationship between the two variables. The equation of the regression line is used to predict (or estimate) the value of the response variable from a given value of the explanatory variable.
Example
The actual weights and self-perceived ideal weights of a random sample of 40 female students enrolled in an introductory Statistics course at the University of Auckland are displayed on the scatter plot below. A regression line has been drawn. The equation of the regression line is
predicted y = 0.6089x + 18.661 or predicted ideal weight = 0.6089 × actual weight + 18.661
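Using the equation of the regression line to make a prediction is a single calculation, sketched here in Python. The slope and intercept are those given above; the function name and the input value of 60 are illustrative assumptions, and a prediction is only sensible for actual weights within the range of the data used to fit the line.

```python
SLOPE = 0.6089       # from the regression equation above
INTERCEPT = 18.661   # from the regression equation above

def predicted_ideal_weight(actual_weight):
    """Predicted ideal weight = 0.6089 x actual weight + 18.661."""
    return SLOPE * actual_weight + INTERCEPT

# Hypothetical student with an actual weight of 60 (same units as the data)
print(round(predicted_ideal_weight(60), 3))  # 55.195
```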
Alternatives: fitted line, line of best fit, trend line
See: least-squares regression line
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
Statistical investigation: Levels 5, (6), (7), (8)
- Relating three-dimensional models to two-dimensional representations and vice-versa
A three dimensional model or shape can be represented on a two dimensional surface (such as a piece of paper) by drawing cross-sectional views of the intersection of the shape with three planes that are perpendicular to each other. For example, for a building, we could take a front cross-sectional view, a side cross-sectional view and a so-called ‘bird’s-eye view’.
A model can also be represented by an isometric drawing or, in the case of a polyhedron, a net.
- Relationship
A connection between two variables, usually two numerical variables. Such a connection may not be evident until the data are displayed. A relationship between two variables is said to exist if the connection evident in a data display is so strong that it could not be explained as only due to chance.
Example 1 (Two numerical variables)
The actual weights and self-perceived ideal weights of a random sample of 40 female students enrolled in an introductory Statistics course at the University of Auckland are displayed on the scatter plot below (left). In general, as the values of actual weight increase the values of ideal weight increase. There is clearly a relationship between the variables actual weight and ideal weight.
The actual weights and number of countries visited (other than New Zealand) of a random sample of 40 male students enrolled in an introductory Statistics course at the University of Auckland are displayed on the scatter plot below (right). There is no clear connection between the variables actual weight and number of countries visited (other than New Zealand).

Example 2 (One numerical variable and one category variable)
The actual weights of random samples of 40 male and 40 female students enrolled in an introductory Statistics course at the University of Auckland are displayed on the dot plot below (left). On average, the actual weight of males is greater than that of females. There is clearly a relationship between the variables actual weight and gender.
The number of countries visited (other than New Zealand) by random samples of 40 male and 40 female students enrolled in an introductory Statistics course at the University of Auckland are displayed on the dot plot below (right). The two sample distributions are quite similar indicating that there is no clear connection between the variables number of countries visited (other than New Zealand) and gender.
Example 3 (Two category variables)
The two sets of bar graphs below display data collected from a random sample of students studying an introductory Statistics course at the University of Auckland. They are enrolled in one of 3 courses; STATS 101, STATS 102 or STATS 108.
The proportions of each ethnic group in each course are displayed on the bar graphs on the left. The three distributions are sufficiently different to indicate that there is a relationship between the variables ethnicity and course.
The proportions of each ethnic group for males and females are displayed on the bar graphs on the right. The two distributions are quite similar indicating that there is no clear connection between the variables ethnicity and gender.
See: association
Curriculum achievement objectives references
Statistical investigation: Levels 4, 5, 6, (7), (8)
- Relationships between successive elements of number patterns
Using a table and looking at the differences between successive terms can be an effective way of finding a rule that will generate all the terms of a sequential pattern.
Example: Consider the pattern: 1, 4, 7, 10, 13, 16, 19,…
Compare the pattern with the multiples of three: 3, 6, 9, 12, 15, 18, 21, …
n             1   2   3   4   5   6   7
nth term      1   4   7   10  13  16  19
Differences     3   3   3   3   3   3
It is apparent from the table that the nth term, Tn, is given by the expression:
Tn = (3 × n) - 2
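The rule can be checked against the pattern with a few lines of Python. This is an illustrative sketch; the function name nth_term is an assumption.

```python
def nth_term(n):
    """Rule for the pattern 1, 4, 7, 10, ...: three times n, less two."""
    return 3 * n - 2

terms = [nth_term(n) for n in range(1, 8)]
differences = [later - earlier for earlier, later in zip(terms, terms[1:])]

print(terms)        # [1, 4, 7, 10, 13, 16, 19]
print(differences)  # [3, 3, 3, 3, 3, 3]
```

The constant difference of 3 is what links the pattern to the multiples of three: each term is 2 less than the corresponding multiple.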
Similarly, a graph might also display the relationship.
- Relative frequency
For a whole-number variable in a data set, the number of times a value occurs divided by the total number of observations.
For a measurement variable in a data set, the number of occurrences in a class interval divided by the total number of observations.
For a category variable in a data set, the number of occurrences in a category divided by the total number of observations.
In other words, relative frequency = frequency / number of observations
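A relative frequency table for a category variable can be sketched in Python as follows. The eye colours listed are made-up values for illustration only, and the function name relative_frequencies is an assumption.

```python
from collections import Counter

def relative_frequencies(values):
    """Relative frequency = frequency / number of observations, for each distinct value."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}

# Hypothetical eye colours of eight students
eye_colours = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "hazel"]

print(relative_frequencies(eye_colours))
# {'brown': 0.5, 'blue': 0.25, 'green': 0.125, 'hazel': 0.125}
```

The relative frequencies always add to 1.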
See: frequency
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Relative risk
The ratio of the risk (or probability) of an event for one group to the risk of the same event for a second group.
Example
The following data were collected on a random sample of students enrolled in a Statistics course at the University of Auckland.
                 Attendance
Course result    Regular   Not regular   Total
Pass             83        19            102
Fail             17        27            44
Total            100       46            146
The risk of failing for students with non-regular attendance = 27/46 = 0.5870
The risk of failing for students with regular attendance = 17/100 = 0.17
The relative risk of failing for students with non-regular attendance compared to those with regular attendance = 0.5870/0.17 = 3.5
This can be interpreted as the risk of failing for students with non-regular attendance is about 3.5 times the risk of failing for students with regular attendance.
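The calculation in this example can be sketched in Python using the counts from the table. The function and variable names are illustrative assumptions.

```python
def risk(events, group_total):
    """Risk (probability) of an event in a group."""
    return events / group_total

risk_not_regular = risk(27, 46)   # failed, non-regular attendance
risk_regular = risk(17, 100)      # failed, regular attendance
relative_risk = risk_not_regular / risk_regular

print(round(risk_not_regular, 4))  # 0.587
print(risk_regular)                # 0.17
print(round(relative_risk, 1))     # 3.5
```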
Curriculum achievement objectives references
Probability: Levels 7, (8)- Relative size of positive and negative integers and decimalssearch for term
Understanding the relative size of numbers in a place value system involves understanding the place value concept, and the face values of the numeral symbols (See. Base ten numeration system). Examples: 99 < 103 because 103 has a 1 in the hundreds column and 99 has no numeral in the hundreds column. 346 < 348 because although they have the same numerals in the hundreds and the tens columns, in the ones column they have a 6 and an 8 and 6 < 8. Some caution is needed with negative numbers as in the following example. -28 < -17 because although 28 > 17, -28 is to the left of -17 on the number line. A number line can help children with the order relation. Decimals are ordered in the same way that whole numbers are but some caution is needed as children will often see, for example, 0.32 to be greater than 0.8 since 32 is greater than 8. Using the number line or modelling decimals with materials will help to overcome this misconception.
- Repeating pattern
A repeating pattern is a pattern that consists of a core that is repeated. The core is the shortest string of elements that repeats. E.g. 1, 2, 3, 1, 2, 3, 1, 2, 3, … (the core is 1, 2, 3) or a, b, a, b, a, b, a, b, … (the core is a, b).
- Resampling
A technique in which samples are taken repeatedly from an existing sample or existing samples.
Resampling using randomisation is a method used at Level Eight. Two examples of this method are provided in the description of randomisation, and a summary of the method is given below.
Data are collected to investigate an assertion or a question, usually involving a comparison of a numerical variable between two categories of a category variable (i.e., that there is a link between the numerical variable and the category variable). An estimate of a population parameter is calculated from the data. This observed estimate is often a difference between means but could be a difference between two proportions or a slope of a fitted regression line.
Could an estimate as big as the observed estimate be produced just by chance?
To answer this question, the effect of sampling variation alone on the estimate needs to be considered when it is assumed that there is no link between the two variables. If random allocation alone could easily produce an estimate as big as the observed estimate then the data cannot be interpreted as support for the existence of a link between the two variables. Values of the numerical variable obtained from the data collection are randomly allocated to the two categories of the category variable. An estimate is calculated from this ‘resampling using randomisation’ process. This process is repeated many times to form a distribution of estimates under sampling variation alone.
By comparing the observed estimate with the distribution of estimates, an assessment can be made of the strength of evidence the data provide for the assertion or provide for a conclusion to the question. This assessment is made by looking at the percentage of estimates under sampling variation alone that are at least as far from zero as the observed estimate.
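The method summarised above can be sketched in Python for the case where the estimate is a difference between two means. This is a minimal illustration rather than a prescribed procedure: the function name, the number of re-allocations and the two data lists are all made up for the example.

```python
import random


def randomisation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Return the observed difference in means and the proportion of
    randomised differences at least as far from zero as the observed one."""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_far = 0
    for _ in range(n_resamples):
        # Randomly allocate the observed values to the two categories.
        rng.shuffle(pooled)
        estimate = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance so floating-point ties are counted.
        if abs(estimate) >= abs(observed) - 1e-12:
            at_least_as_far += 1
    return observed, at_least_as_far / n_resamples


# Hypothetical data, invented purely for illustration.
group_a = [23, 27, 31, 25, 29, 30]
group_b = [21, 24, 22, 26, 20, 23]
observed, proportion = randomisation_test(group_a, group_b)
print(observed, proportion)
```

A small proportion suggests that random allocation alone would rarely produce an estimate as far from zero as the observed one.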
See: randomisation, strength of evidence
Curriculum achievement objectives reference
Statistical investigation: Level 8
- Residual (in linear regression)
The difference between an observed value of the response variable and the value of the response variable predicted from the regression line.
From bivariate data to be used for a linear regression analysis, consider one observation, $(x_i, y_i)$. For this value of the explanatory variable, $x_i$, the value of the response variable predicted from the regression line is $\hat{y}_i$, giving a point $(x_i, \hat{y}_i)$ that is on the regression line. The residual for the observation $(x_i, y_i)$ is $y_i - \hat{y}_i$.
Example
The actual weights and self-perceived ideal weights of a random sample of 40 female university students enrolled in an introductory Statistics course at the University of Auckland are displayed on the scatter plot below. A regression line has been drawn. The equation of the regression line is
predicted y = 0.6089x + 18.661 or predicted ideal weight = 0.6089 × actual weight + 18.661
Consider the female whose actual weight is 72 kg and whose self-perceived ideal weight is 70 kg.
Her predicted ideal weight is 0.6089 × 72 + 18.661 = 62.5kg
The residual for this observation is 70kg – 62.5kg = 7.5kg
This is also displayed on the scatter plot.
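As a quick check of the arithmetic in this example, the following Python sketch evaluates the regression equation quoted above and subtracts the prediction from the observed value. The function names are illustrative.

```python
def predicted_ideal_weight(actual_weight_kg):
    """Regression line quoted in the example above."""
    return 0.6089 * actual_weight_kg + 18.661


def residual(observed, predicted):
    """Residual = observed value - predicted value."""
    return observed - predicted


predicted = predicted_ideal_weight(72)
print(round(predicted, 1))                # 62.5
print(round(residual(70, predicted), 1))  # 7.5
```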
Alternative: prediction error
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Response variable
The variable, of the two variables in bivariate data, which may be affected by the other variable, the explanatory variable.
If the bivariate data result from an experiment then the response variable is the one that is observed in response to the experimenter having manipulated or selected the value of the explanatory variable.
In a scatter plot, as part of a linear regression analysis, the response variable is placed on the y-axis (vertical axis).
Alternatives: dependent variable, outcome variable, output variable
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Risk
An alternative name for probability. The risk of an event occurring is the probability of the event occurring and is mainly used when the event is related to a health issue or is an undesirable event.
Example
The following data were collected on a random sample of students enrolled in a Statistics course at the University of Auckland.
                          Attendance
Course result    Regular    Not regular    Total
Pass                  83             19      102
Fail                  17             27       44
Total                100             46      146
Based on this sample of students, the risk of failing = 44/146 = 0.30
The risk of failing for students with regular attendance = 17/100 = 0.17
Curriculum achievement objectives references
Probability: Levels 7, (8)
- Roots
$a^{1/n}$, where n is a positive integer, is defined as the nth root of a.
Consider the meaning of $a^{1/2} \times a^{1/2}$.
By the laws of powers, $a^{1/2} \times a^{1/2} = a^{1/2 + 1/2} = a^1 = a$.
So $a^{1/2}$ is the number which when multiplied by itself gives a. We usually refer to this as the square root of a and write it as $\sqrt{a}$. So $a^{1/2} = \sqrt{a}$. So $\sqrt{4} = 2$, since $2 \times 2 = 4$. Two is referred to as the square root of 4. It is the principal square root. There is another number which when multiplied by itself gives an answer of 4 and that is -2, since $-2 \times -2 = 4$. (See Multiplication: Integers.) -2 is referred to as a secondary root.
$a^{1/3} \times a^{1/3} \times a^{1/3} = a^{1/3 + 1/3 + 1/3} = a^1 = a$, so $a^{1/3}$ is the number which, when three copies of it are multiplied together, gives an answer of a. We usually refer to $a^{1/3}$ as the cube root of a and write it as $\sqrt[3]{a}$. So $a^{1/3} = \sqrt[3]{a}$.
$\sqrt[3]{8} = 2$ since $2 \times 2 \times 2 = 8$. Two is the principal cube root of 8. There are no real secondary roots.
$\sqrt[3]{-8} = -2$ because $-2 \times -2 \times -2 = -8$. -2 is called the principal root of -8 because there are no other real roots.
$\sqrt{-4}$ has no principal root since there are no real numbers which when multiplied by themselves give a negative number.
The fourth root of a is $a^{1/4}$, which can be written as $\sqrt[4]{a}$, etc.
Note the connection with geometry for square root and cube root. The square root of 9 is 3, which is the length of the side of a square of area 9 square units. The cube root of 27 is 3, which is the length of an edge of a cube of volume 27 cubic units.
The square roots of numbers that are not themselves squares of natural numbers (that is, they do not belong to the sequence 1, 4, 9, 16, 25, 36, …) are irrational numbers and have neither an exact fractional representation nor a terminating or repeating decimal representation. For example, √2 as a decimal to 30 places is 1.414213562373095048801688724209.
- Rotation
A rotation in the plane is a movement in a circular motion (a turn) through some angle that leaves shape unchanged and in which exactly one point (the centre of rotation) does not move i.e. a rotation in the plane has one invariant point. A rotation of a three-dimensional figure in space has a line of invariant points called the axis of rotation.
Rotation, like translation and reflection, is called an isometric transformation since it does not change the size or shape of the figure being rotated.
- Rounding
Rounding of a number means replacing the numeral by another numeral that has fewer significant figures. The decision whether to round up or down depends on the value of the leading digit being rounded off. The convention is that 0, 1, 2, 3, and 4 are effectively just chopped off (truncated) while 5, 6, 7, 8, and 9 are truncated but the first digit not truncated is increased by a value of one.
Suppose that, measuring to the nearest millimetre, we recorded some measurements (in metres) as 52.365, 12.764, 4.986, 2.031, and 5.699. If we decided to change them to measurements to the nearest centimetre (that is, to two decimal places (2dp)) we could record them as 52.37, 12.76, 4.99, 2.03, and 5.70.
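The convention described above (digits 5 to 9 round up) can be reproduced in Python with the standard `decimal` module. This is a sketch: the helper name `round_half_up` is illustrative, and the measurements are passed as strings so that they are read as exact decimals.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, places):
    """Round a decimal string to `places` decimal places, with 5 rounding up."""
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal("0.01")
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


measurements = ["52.365", "12.764", "4.986", "2.031", "5.699"]
rounded = [str(round_half_up(m, 2)) for m in measurements]
print(rounded)  # ['52.37', '12.76', '4.99', '2.03', '5.70']
```

Python's built-in `round` works on binary floating-point values and rounds ties to the nearest even digit, so it does not always follow this convention.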
Using a metre ruler graduated in millimetres can help with an understanding of these roundings. The numbers 52.365 etc. above are expressed to three decimal places (3dp) while the numbers 52.37 etc. above are expressed to two decimal places (2dp).
- Sample
A group of objects, individuals or values selected from a population. The intention is for this sample to provide estimates of population parameters.
See: cluster sampling, random sample, simple random sample, stratified sampling, systematic sampling
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), (7), (8)
Probability: Levels 6, (7)
- Sample distribution
The variation in the values of a variable in data obtained from a sample.
For whole-number data, a sample distribution is often displayed, in a table, as a set of values and their corresponding frequencies, or on an appropriate graph.
For measurement data, a sample distribution is often displayed, in a table, as a set of intervals of values (class intervals) and their corresponding frequencies, or on an appropriate graph.
For category data, a sample distribution is often displayed, in a table, as a set of categories and their corresponding frequencies, or on an appropriate graph.
Alternative: empirical distribution
See: experimental distribution
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), (7), (8)
- Sample mean
A measure of centre for the distribution of a sample of numerical values. The sample mean is the centre of mass of the values in their distribution.
If the n values in a sample are $x_1, x_2, \dots, x_n$, then the sample mean is calculated by adding the values in the sample and then dividing this total by the number of values. In symbols, the sample mean, $\bar{x}$, is calculated by
$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n}$.
For large samples it is recommended that a calculator or software is used to calculate the mean.
The sample mean is a (sample) statistic and is therefore an estimate of the population mean.
See: mean
Curriculum achievement objectives references
Statistical investigation: Levels (6), (7), (8)
- Sample proportion
A part of a sample with a particular attribute, expressed as a fraction, decimal or percentage of the whole sample.
A common symbol for the sample proportion is p.
Example
Suppose the attribute of interest was left-handedness and that a random sample of 10 people contained 3 left-handed people.
The sample proportion is 3/10 or 0.3 or 30%.
See: proportion
Curriculum achievement objectives references
Statistical investigation: Levels (6), (7), 8
- Sample size
The number of objects, individuals or values in a sample.
Typically, a larger sample size leads to an increase in the precision of a statistic as an estimate of a population parameter.
The most common symbol for sample size is n.
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), 7, (8)
Probability: Levels 6, (7)
- Sample space
The set of all of the possible outcomes for a probability activity or a situation involving an element of chance.
For discrete situations the sample space can be listed.
Note that a sample space can often be described in several different ways.
Example 1
In a situation where a person will be selected and their eye colour recorded, a sample space is blue, grey, green, hazel, brown. Each person’s eye colour must belong to exactly one of these categories.
Example 2
In a situation where the gender of each child in a three-child family is recorded in birth order, a sample space is: (BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG).
A different sample space could be: 3 boys, exactly 2 boys, exactly 1 boy, no boys.
A different sample space again could be: more boys than girls, more girls than boys.
Curriculum achievement objectives references
Probability: Levels (4), (5), (6), (7), 8
- Sample standard deviation
A measure of spread for a distribution of a sample of numerical values that determines the degree to which the values differ from the sample mean.
It is calculated by taking the square root of the average of the squares of the deviations of the values from their sample mean.
It is recommended that a calculator or software is used to calculate the sample standard deviation.
The square of the sample standard deviation is equal to the sample variance.
A common symbol for the sample standard deviation is s.
The sample standard deviation is a (sample) statistic and is therefore an estimate of the population standard deviation.
See: measure of spread, sample variance, standard deviation
Curriculum achievement objectives references
Statistical investigation: Levels (6), (7), (8)
- Sample statistic
A number that is calculated from a sample of numerical values.
A sample statistic gives an estimate of the corresponding value from the population from which the sample was taken. For example, a sample mean is an estimate of the population mean.
See: statistic
Curriculum achievement objectives references
Statistical investigation: Levels (6), 7, (8)
- Sample statistics
Numbers calculated from a sample of numerical values that are used to summarise the sample. The statistics will usually include at least one measure of centre and at least one measure of spread.
Alternative: numerical summary
See: descriptive statistics, summary statistics
Curriculum achievement objectives references
Statistical investigation: Levels (6), (7), (8)
- Sample variance
A measure of spread for a distribution of a sample of numerical values that determines the degree to which the values differ from the sample mean.
It is calculated as the average of the squares of the deviations of the values from their sample mean.
The positive square root of the sample variance is equal to the sample standard deviation.
It is recommended that a calculator or software is used to calculate the sample variance. On a calculator the square of the standard deviation will give the variance.
A common symbol for the sample variance is $s^2$.
The sample variance is a (sample) statistic and is therefore an estimate of the population variance.
See: measure of spread, sample standard deviation, variance
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
- Sampling distribution
A theoretical distribution for the variation in the values of a sample statistic, such as a sample mean, based on samples of a fixed size, n. When the sample statistic is the sample mean, the sampling distribution is called the sampling distribution of the sample mean.
Example
Consider the mean of a random sample of 20 values taken from a population. Suppose that several more random samples of 20 values were taken from the same population and the sample mean for each sample was calculated. The values of these sample means would differ from sample to sample (illustrating sampling variation). Imagine repeating this process over and over again, without end. The variation in the values of these sample means is the sampling distribution of the sample mean.
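The repeated-sampling idea in this example can be imitated in Python. The sketch below can only take a finite number of samples, so it approximates the theoretical sampling distribution rather than producing it. The population (the whole numbers 1 to 1000), the number of samples and the function name are assumptions made for illustration.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Take repeated random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, sample_size))
            for _ in range(n_samples)]


# Hypothetical population, used only to have something to sample from.
population = list(range(1, 1001))
means = simulate_sample_means(population, sample_size=20, n_samples=2000)

# Centre and spread of the simulated sample means.
print(statistics.mean(means), statistics.stdev(means))
```

The sample means differ from sample to sample, which illustrates sampling variation, and they cluster around the population mean.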
Curriculum achievement objectives reference
Statistical investigation: Level 8
- Sampling error
The error caused because data are collected from part of a population rather than the whole population.
An estimate of a population parameter, such as a sample mean or sample proportion, is different for different samples (of the same size) taken from the population. Sampling error is one of two reasons for the difference between an estimate and the true, but unknown, value of the population parameter. The other reason is non-sampling error.
The error for a given sample is unknown but when sampling is random, the size of the sampling error can be estimated by calculating the margin of error.
See: margin of error, non-sampling error
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
Statistical literacy: Levels 7, (8)
- Sampling variation
The variation in a sample statistic from sample to sample.
Suppose a sample is taken and a sample statistic, such as a sample mean, is calculated. If a second sample of the same size is taken from the same population, it is almost certain that the sample mean calculated from this sample will be different from that calculated from the first sample. If further sample means are calculated, by repeatedly taking samples of the same size from the same population, then the differences in these sample means illustrate sampling variation.
Alternative: chance variation
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
Probability: Levels (3), (4), (5), (6)
- Scales
See Linear scales.
- Scatter
For bivariate numerical data, the variation (in the vertical direction) of the values of the variable plotted on the y-axis of a scatter plot.
In linear regression, scatter is the variation (in the vertical direction) of the values of the response variable from the regression line.
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Scatter plot
A graph for displaying a pair of numerical variables. The graph has two axes, one for each variable, and points are plotted to show the values of these two variables for each of the individuals.
A scatter plot is essential for exploring the relationship that may exist between the two variables and for revealing the features of this relationship.
In linear regression, at Level Eight, one of the two variables is regarded as the explanatory variable and the other variable as the response variable. In this case the explanatory variable is plotted on the horizontal axis (x-axis) and the response variable is plotted on the vertical axis (y-axis).
When fitting models to data, as in linear regression, a scatter plot is essential for assessing how useful the fitted model may be.
Example
The actual weights and self-perceived ideal weights of a random sample of 40 female university students enrolled in an introductory Statistics course at the University of Auckland are displayed on the scatter plot below.
Alternatives: scatter diagram, scattergram, scatter graph
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Seasonal component (for time-series data)
The part of the variation in time-series data that is due to regular seasonal phenomena.
In terms of an additive model for time-series data, Y = T + S + C + I, the seasonal component is represented by S.
See: seasonally adjusted data
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Seasonally adjusted data
Time-series data which have had the seasonal component removed. In seasonally adjusted data the effect of regular seasonal phenomena has been removed.
In terms of an additive model for time-series data, Y = T + S + C + I, where
T represents the trend component,
S represents the seasonal component,
C represents the cyclical component, and
I represents the irregular component;
the smoothed series = T + C and
the seasonally adjusted series = T + C + I.
Example
Statistics New Zealand’s Economic Survey of Manufacturing provided the following data on actual operating income for the manufacturing sector in New Zealand. Centred moving means have been calculated. For the quarters with centred moving means the individual seasonal effect is calculated by:
Operating income (raw data) – (centred) moving mean
The overall seasonal effect for each quarter is estimated by averaging the individual seasonal effects. The two individual seasonal effects for March quarters are –588.125 and –561.75. The mean of these 2 values is –574.938. The other estimated overall seasonal effects are shown in the second table below.
Seasonally adjusted data is calculated by:
Operating income (raw data) – estimated overall seasonal effect
The calculation for the Mar-05 quarter is 17322 – (–574.938) = 17896.938
Quarter    Operating income    Centred moving mean    Individual seasonal    Seasonally adjusted
           ($millions)         ($millions)            effect                 ($millions)
Mar-05     17322                                                             17896.938
Jun-05     17696                                                             17097.875
Sep-05     17060               17548.250              -488.250               17426.875
Dec-05     18046               17732.750              313.250                17773.125
Mar-06     17460               18048.125              -588.125               18034.938
Jun-06     19034               18298.750              735.250                18435.875
Sep-06     18245               18490.500              -245.500               18611.875
Dec-06     18866               18633.500              232.500                18593.125
Mar-07     18174               18735.750              -561.750               18748.938
Jun-07     19464               19003.000              461.000                18865.875
Sep-07     18633                                                             18999.875
Dec-07     20616                                                             20343.125
The raw data and the seasonally adjusted data are displayed below. Note that M, J, S and D indicate quarter years ending in March, June, September and December respectively.
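The steps of this example (centred moving means, individual seasonal effects, overall seasonal effects, seasonally adjusted values) can be sketched in Python as follows. The operating income figures are those of the example; the function and variable names are illustrative, and the sketch assumes quarterly data with an even period of 4.

```python
from collections import defaultdict

quarters = ["Mar-05", "Jun-05", "Sep-05", "Dec-05",
            "Mar-06", "Jun-06", "Sep-06", "Dec-06",
            "Mar-07", "Jun-07", "Sep-07", "Dec-07"]
income = [17322, 17696, 17060, 18046, 17460, 19034,
          18245, 18866, 18174, 19464, 18633, 20616]


def centred_moving_means(values, period=4):
    """Centred moving means for an even period; None where undefined."""
    half = period // 2
    means = [None] * len(values)
    for i in range(half, len(values) - half):
        earlier = sum(values[i - half:i + half]) / period
        later = sum(values[i - half + 1:i + half + 1]) / period
        means[i] = (earlier + later) / 2
    return means


moving_means = centred_moving_means(income)

# Individual seasonal effect = raw value - centred moving mean.
effects_by_quarter = defaultdict(list)
for quarter, value, mean in zip(quarters, income, moving_means):
    if mean is not None:
        effects_by_quarter[quarter[:3]].append(value - mean)

# Overall seasonal effect = mean of the individual effects for that quarter.
overall_effect = {name: sum(effects) / len(effects)
                  for name, effects in effects_by_quarter.items()}

# Seasonally adjusted value = raw value - overall seasonal effect.
adjusted = [value - overall_effect[quarter[:3]]
            for quarter, value in zip(quarters, income)]

for quarter, value in zip(quarters, adjusted):
    print(quarter, f"{value:.3f}")
```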
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Sequence
A sequence is an ordered set (usually of numbers) arranged in such a way that the next element (or term) of the sequence is completely specified. E.g. 4, 7, 10, 13, … (The nth term (or general term) is given by the rule (3 × n) + 1. For example, the 5th term is (3 × 5) + 1 = 16.)
A finite sequence has a first and a last term, e.g. 2, 4, 6, 8, …, 20.
An infinite sequence continues indefinitely, e.g. 2, 4, 6, 8, …
- Sequential pattern
A sequential pattern is a pattern whose terms change in an identifiable and consistent way. Examples are 2, 5, 8, 11, 14, 17, … and 1, 3, 6, 10, 15, 21, … or a, b, d, g, k, p, …
The ordinal position of a term in a sequential pattern refers to its position in the pattern, that is, which term it is in the pattern. For example, in the pattern 2, 5, 8, 11, 14, 17, … the first term, T1, is 2; the second term, T2, is 5, etc.
- Series
A series is the sum of the terms of a sequence. For example, suppose we have the sequence 1/2, 1/4, 1/8, 1/16, …
1/2 + 1/4 + 1/8 + 1/16 + … is a series since it is the sum of the above sequence.
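The partial sums of this series can be computed exactly in Python with the standard `fractions` module. The sketch below (the function name is illustrative) prints each partial sum together with its difference from 1; that difference halves with every extra term.

```python
from fractions import Fraction


def partial_sum(n_terms):
    """Sum of the first n_terms terms of 1/2 + 1/4 + 1/8 + ..."""
    return sum(Fraction(1, 2 ** k) for k in range(1, n_terms + 1))


for n in range(1, 5):
    s = partial_sum(n)
    print(n, s, 1 - s)
# 1 1/2 1/2
# 2 3/4 1/4
# 3 7/8 1/8
# 4 15/16 1/16
```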
This is an infinite series since it continues indefinitely, but it is also a convergent series since it has a finite sum. The sum of this series is 1, which can be easily seen by observing the difference between 1 and the sum after one term, two terms, three terms, etc.
- Set
A set is a collection of objects or ideas.
- SI
SI stands for Système International d’Unités (International System of Units). We usually refer to this as the metric system. It is a system of weights and measures based on powers of ten and the weight of water. (See SI measurement units.)
- SI measurement units
Most of the commonly used metric units and their abbreviations are as follows:
- Length: metre (m), kilometre (km) (1000 metres), centimetre (cm) (1/100 metre), millimetre (mm) (1/1000 metre)
- Volume: cubic metre (m³), cubic centimetre (cm³, or c.c.)
- Capacity: Units of volume may appropriately be used as units of capacity. A commonly used unit (especially with fluids) is the litre (l) (1000 cm³), and the millilitre (ml) (1 cm³). These units may also be used as units of volume.
- Weight: (See mass and weight). The basic unit of weight in common use is the gram (g). It is the weight of 1 cm³ of water. Other units are the kilogram (kg) (1000 g, the weight of a litre of water), the milligram (mg) (1/1000 g), and the tonne (1000 kg, the weight of a cubic metre of water).
- Area: The basic unit of area is the square metre (m²). Other units in use are the square millimetre (mm²), the square centimetre (cm²), and the hectare (10,000 m²). So a hectare is the area of a square of sides 100 m.
- Significant figures
Numerals are often expressed as approximate values. The concept of significant figures (or significant digits) is an assessment of the accuracy of the numeral as an expression of the real value of the number that it represents. When a number is given in decimal notation, the error should not exceed a half unit of the last digit retained. Suppose a number has been rounded to 1400. We should be able to expect that before it was rounded to the nearest whole number it was greater than or equal to 1399.5 and less than 1400.5. Hence it should be no more than 0.5 different due to the rounding. Hence we can say that the four figures are reliable, and that the number has been expressed to four significant figures.
However, suppose we had been rounding to the nearest 100, which we might do in scientific or other practical situations. Then 1379 would also be written as 1400 but only the first two digits would be significant. Then we would know that the true value was greater than or equal to 1350 and less than 1450.
When numerals are written in standard form, it is easy to see how many significant figures they have. In the examples above, the first example of 1400, which was to four significant figures, should be written as $1.400 \times 10^3$, while the second example of 1400, which was to two significant figures, should be written as $1.4 \times 10^3$.
The numeral $2.57000 \times 10^2$ is intended to indicate that it has six significant figures.
Similarly 0.000378 can be written as $3.78 \times 10^{-4}$ and has three significant figures.
- Similarity
Similar polygons are polygons whose corresponding angles are equal and whose corresponding sides are in proportion. For example, the two triangles below are similar because their corresponding angles are equal but they are not congruent.
Since the two triangles are similar their sides are in proportion and, since the vertical side of the triangle on the right is twice the length of the corresponding side of the triangle on the left, it follows that a = 6 cm and b = 10 cm.
- Simple additive strategies
See additive strategies.
- Simple random sample
A sample in which, at any stage of the sampling process, each object or individual (which has not been chosen) in the population has the same probability of being chosen in the sample.
In a simple random sample an object or individual in the population can be chosen once, at most. This is often called sampling without replacement.
See: random sample
Curriculum achievement objectives references
Statistical investigation: Levels (6), (7), (8)
- Simulation
A technique for imitating the behaviour of a situation that involves elements of chance or a probability activity. The technique uses tools such as coins, dice, random numbers from a calculator, random numbers from random number tables, and random numbers generated by computers.
Example 1
A coin can be used to simulate the outcomes of three-child families, assuming that a boy and a girl are equally likely to occur. If a head results from the coin toss then a boy is the simulated birth outcome and if a tail results then a girl is the simulated birth outcome. A group of three coin tosses simulates an outcome of a three-child family. The simulation is continued until the required number of trials has been performed.
Suppose the results of 90 coin tosses and therefore 30 simulated trials of three-child families were: HHT TTT HHT HTT HHT TTT THT TTH THT THT THT TTT HTH HTH HHH THT THT THH TTT HHH HHT TTH THT THH HTT THH THT HTH THH HHH
Trials:
BBG GGG BBG BGG BBG GGG GBG GGB GBG GBG GBG GGG BGB BGB BBB GBG GBG GBB GGG BBB BBG GGB GBG GBB BGG GBB GBG BGB GBB BBB
The experimental distribution for the variable that lists numbers of boys and girls in the family is shown in the frequency table or one-way table below:
Combination    3 boys    2 boys and 1 girl    1 boy and 2 girls    3 girls
Frequency           3                   11                   12          4
Example 2
In a game of tennis one player from School A is to play one player from School B. School A has 3 players to choose from (C, D and E) and School B has 2 players to choose from (F and G). For School A, the probabilities of C, D or E being selected are 0.6, 0.3 and 0.1 respectively. For school B, the probabilities of F or G being selected are 0.7 and 0.3 respectively.
Simulate 25 performances (or trials) of this activity.
Suppose the random numbers to be used, starting at the beginning of this list, were:
71578 81355 39007 60764 19852 87652 50354 22183 14935 09519
Consider the digits in pairs.
The first digit will decide the player for School A. If it is 0, 1, 2, 3, 4 or 5 then player C is chosen; if it is 6, 7 or 8 then player D is chosen; if it is 9 then player E is chosen.
The second digit will decide the player for School B. If it is 0, 1, 2, 3, 4, 5 or 6 then player F is chosen; if it is 7, 8 or 9 then player G is chosen.
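The digit-pair rule just described can be sketched in Python. The random digits are the ones listed in this example; the function names are illustrative.

```python
from collections import Counter

random_digits = "71578 81355 39007 60764 19852 87652 50354 22183 14935 09519"


def school_a_player(digit):
    """0-5 selects C (probability 0.6), 6-8 selects D (0.3), 9 selects E (0.1)."""
    if digit <= 5:
        return "C"
    if digit <= 8:
        return "D"
    return "E"


def school_b_player(digit):
    """0-6 selects F (probability 0.7), 7-9 selects G (0.3)."""
    return "F" if digit <= 6 else "G"


digits = random_digits.replace(" ", "")
pairs = [digits[i:i + 2] for i in range(0, 50, 2)]  # 25 trials
trials = [f"{school_a_player(int(p[0]))} plays {school_b_player(int(p[1]))}"
          for p in pairs]

# Experimental distribution of the player combinations.
tally = Counter(trials)
for combination in sorted(tally):
    print(combination, tally[combination])
# C plays F 10
# C plays G 6
# D plays F 6
# D plays G 1
# E plays F 2
```

No trial produced "E plays G", so that combination has a frequency of 0.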
Trial    Pair    Combination
1        71      D plays F
2        57      C plays G
3        88      D plays G
4        13      C plays F
5        55      C plays F
6        39      C plays G
7        00      C plays F
8        76      D plays F
9        07      C plays G
10       64      D plays F
11       19      C plays G
12       85      D plays F
13       28      C plays G
14       76      D plays F
15       52      C plays F
16       50      C plays F
17       35      C plays F
18       42      C plays F
19       21      C plays F
20       83      D plays F
21       14      C plays F
22       93      E plays F
23       50      C plays F
24       95      E plays F
25       19      C plays G
The experimental distribution for the variable that lists pairs of players is shown in the frequency table or one-way table below:
Combination    C plays F    C plays G    D plays F    D plays G    E plays F    E plays G
Frequency             10            6            6            1            2            0
Curriculum achievement objectives references
Probability: Levels 7, (8)
- Simultaneous equations
A system of two linear equations in two unknowns is solved simultaneously when all ordered pairs are found which satisfy both equations. In the case of three linear equations in three unknowns the solution is the set of ordered triples that satisfy all three equations. The system itself is often referred to as a system of simultaneous equations. There are three different types of solution:
- The system may have a unique solution. For example:
3x+4y=17
x-y=1 This system has the unique solution (3,2) - The system may have an infinite solution. This occurs when one equation is a multiple of the other, or in the case of three equations in three unknowns, one equation is a combination of the other two. For example:
2x+y=6
4x+2y=12 The solution is {(t, -2t +6), t any real number}
This is an infinite solution. - The system may have no solution. The system is contradictory and is said to be inconsistent. The solution is the empty set ∅. For example:
2x+y=6
2x+y=8
Note that we cannot get two or three solutions. The possibilities are none, one or infinitely many solutions.
One equation non-linear
A system of simultaneous equations may consist of one linear equation and one other equation such as a quadratic equation. For example:
2x - y = 1 and 3x² - xy + 2y² = 24
The linear equation can be used to express y in terms of x (or x in terms of y). This expression is substituted into the other equation, which is then solved, and the resulting solutions are back-substituted into the linear equation.
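For the example 2x - y = 1 and 3x² - xy + 2y² = 24, the substitution method might be worked through as follows. This is a sketch for this one system only; the reduction to 9x² - 7x - 22 = 0 was done by hand and is stated in the comments.

```python
import math
from fractions import Fraction

# From 2x - y = 1 we get y = 2x - 1. Substituting into 3x^2 - xy + 2y^2 = 24:
# 3x^2 - x(2x - 1) + 2(2x - 1)^2 = 24, which simplifies to 9x^2 - 7x - 22 = 0.
a, b, c = 9, -7, -22
disc = b * b - 4 * a * c      # 841
root = math.isqrt(disc)       # 29, exact because 841 is a perfect square

solutions = []
for sign in (1, -1):
    x = Fraction(-b + sign * root, 2 * a)
    y = 2 * x - 1             # back-substitute into the linear equation
    solutions.append((x, y))

for x, y in solutions:
    # Both original equations hold exactly for each ordered pair.
    assert 2 * x - y == 1
    assert 3 * x**2 - x * y + 2 * y**2 == 24
    print(f"x = {x}, y = {y}")
```

The two ordered pairs found are (2, 3) and (-11/9, -31/9).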
- Sine rule
In any acute triangle ABC the sine of an angle is proportional to the length of the side opposite the angle.
So a/sinA = b/sinB = c/sinC where a is the length of the side opposite angle A, b is the length of the side opposite angle B and c is the length of the side opposite angle C.
Care has to be taken when dealing with an obtuse triangle, because although the same relationship still holds, it is possible to become confused since, for example, sin 20° = sin 160°.
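A short sketch of using the rule to find a side, with invented measurements (a = 5, A = 30°, B = 90°) chosen only because the answer is easy to check:

```python
import math


def side_from_sine_rule(a: float, angle_a_deg: float, angle_b_deg: float) -> float:
    """Return side b opposite angle B, using a / sin A = b / sin B."""
    return a * math.sin(math.radians(angle_b_deg)) / math.sin(math.radians(angle_a_deg))


print(round(side_from_sine_rule(5.0, 30.0, 90.0), 6))  # 10.0

# Supplementary angles share the same sine, so an angle recovered from its sine
# alone could be either the acute or the obtuse one.
print(math.isclose(math.sin(math.radians(20)), math.sin(math.radians(160))))  # True
```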
(See also Trigonometric ratios)
- Situations with elements of chance
Situations with elements of chance occur when we are involved with nondeterministic, or random, events. So whenever an event is not strictly determined as to its outcome there is an element of chance. (See Probability of an event). Situations such as tossing a coin and determining whether it comes up ‘heads’ or ‘tails’, or rolling a dice and observing the outcome are situations with elements of chance.
- Skewness
A lack of symmetry in a distribution of a numerical variable in which the values on one side of the distribution tend to be further from the centre of the distribution than values on the other side.
If the smaller values of a distribution tend to be further from the centre of the distribution than the larger values, the distribution is said to have negative skew or to be skewed to the left (or left-skewed).
If the larger values of a distribution tend to be further from the centre of the distribution than the smaller values, the distribution is said to have positive skew or to be skewed to the right (or right-skewed).
Example 1
The actual weights of a random sample of 50 female university students enrolled in an introductory Statistics course at the University of Auckland are displayed on the dot plot below. The sample distribution is skewed to the right or positively skewed.
Example 2
The bar graph displays the probability function of the binomial distribution with n = 10 and π = 0.8. The theoretical distribution is skewed to the left or negatively skewed.
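The direction of skew in Example 2 can be checked numerically. The sketch below uses the moment coefficient of skewness, which is one common numerical measure and is an assumption here, since this glossary describes skewness only in words.

```python
from math import comb


def binomial_pmf(n: int, p: float) -> list[float]:
    """Probabilities P(X = 0), ..., P(X = n) for a binomial distribution."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def skewness(probabilities: list[float]) -> float:
    """Moment coefficient of skewness for values 0, 1, 2, ... with given probabilities."""
    mean = sum(k * pk for k, pk in enumerate(probabilities))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probabilities))
    third = sum((k - mean) ** 3 * pk for k, pk in enumerate(probabilities))
    return third / var**1.5


print(round(skewness(binomial_pmf(10, 0.8)), 3))  # -0.474, negative: skewed to the left
```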
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Skip counting
A counting sequence is an ordering of the counting numbers such that the difference between any two successive numbers is constant. We refer to a sequence in which the difference between any two successive numbers is greater than 1 as skip counting. Examples of backward counting sequences are …, 50, 40, 30, 20, 10 and …, 10, 8, 6, 4, 2. An example of a forward counting sequence is 1, 3, 5, 7, 9, …
- Smoothing data
A process of removing fluctuations from time-series data so that the resulting series shows much less variation, and is therefore smoother.
At Level Eight, moving averages (usually moving means) are used as a method of smoothing time-series data.
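A moving mean can be sketched in a few lines. The series below is invented for illustration, and the sketch does not centre the means, which is usually done as a further step when the number of values averaged is even.

```python
def moving_mean(values: list[float], window: int) -> list[float]:
    """Means of each run of `window` consecutive values."""
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical quarterly values, invented for illustration only.
series = [12, 18, 25, 15, 14, 20, 27, 17]
print(moving_mean(series, 4))  # [17.5, 18.0, 18.5, 19.0, 19.5]
```

The raw values jump up and down within each year, while the moving means rise steadily.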
See: moving averages, moving mean, time-series data
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Sorting
Putting objects into sets according to some chosen attribute, for example, the red ones, the square ones etc.
- Sources of variation
The reasons for differences seen in the values of a variable. Some of these reasons are summarised in the following paragraphs.
Variation is present everywhere and is in everything. When the same variable is measured for different individuals there will be differences in the measurements, simply due to the fact that individuals are different. This can be thought of as individual-to-individual variation and is often described as natural or real variation.
Repeated measurements on the same individual may vary because of changes in the variable being measured. For example, an individual’s blood pressure is not exactly the same throughout the day. This can be thought of as occasion-to-occasion variation.
Repeated measurements on the same individual may vary because of some unreliability in the measurement device, such as a slightly different placement of a ruler when measuring. This is often described as measurement variation.
The difference in measurements of the same quantity for different individuals, apart from natural variation, could be due to the effect of one or more other factors. For example, the difference in growth of two tomato plants from the same packet of seeds planted in two different places could be due to differences in the growing conditions at those places, such as soil fertility or exposure to sun or wind. Even if the two seeds were planted in the same garden there could be differences in the growth of the plants due to differences in soil conditions within the garden. This is often described as induced variation.
Variation occurs in all sampling situations. Suppose a sample is taken and a sample statistic, such as a sample mean, is calculated. If a second sample of the same size is taken from the same population, it is almost certain that the sample mean calculated from this sample will be different from that calculated from the first sample. If further sample means are calculated, by repeatedly taking samples of the same size from the same population, then the differences in these sample means illustrate sampling variation.
Curriculum achievement objectives references
Statistical investigation: Levels 5, 6, (7), (8)
- Spatial features
See Geometric properties.
- Sphere
A sphere with centre P is a solid such that every point on its surface is at an equal distance from P. A sphere may be considered as a solid that is generated by a circle that revolves about its diameter. A sphere has only one surface. The volume (V) of a sphere of radius r is given by V = (4/3)πr³. The surface area of a sphere is 4πr².
- Spread
The degree to which values in a distribution of a numerical variable differ from each other.
Alternative: dispersion
See: variability, variation
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), (7), (8)
- Square
See rectangle.
- Square root
See Roots.
- Standard deviation
A measure of spread for a distribution of a numerical variable that determines the degree to which the values differ from the mean. If many values are close to the mean then the standard deviation is small and if many values are far from the mean then the standard deviation is large.
It is calculated by taking the square root of the average of the squares of the deviations of the values from their mean.
It is recommended that a calculator or software is used to calculate the standard deviation.
The standard deviation can be influenced by unusually large or unusually small values. It is recommended that a graph of the distribution is used to check the appropriateness of the standard deviation as a measure of spread and to emphasise its meaning as a feature of the distribution.
The square of the standard deviation is equal to the variance.
Note that calculators have two keys for the two different ways the standard deviation can be calculated. One way divides the sum of the squared deviations by the number of values before taking the square root. The other way divides the sum of the squared deviations by one less than the number of values before taking the square root. At school level, it does not really matter which key is used because for all but quite small data sets the two values for the standard deviation will be similar. Software tends to use the calculation that divides by one less than the number of values; but some offer both ways. The first way (dividing by the number of values) is better when there are values for all members of a population and the second way is better when the values are from a sample.
Example
The maximum temperatures, in degrees Celsius (°C), in Rolleston for the first 10 days in November 2008 were: 18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6, 18.8
The standard deviation using division by 9 (one less than the number of values) is 0.93°C.
The standard deviation using division by 10 (the number of values) is 0.88°C.
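Both values can be reproduced with Python's standard `statistics` module, where `stdev` divides by one less than the number of values and `pstdev` divides by the number of values:

```python
import statistics

temps = [18.6, 19.9, 20.6, 19.4, 17.8, 18.1, 17.8, 18.7, 19.6, 18.8]

sample_sd = statistics.stdev(temps)        # divides by n - 1 = 9
population_sd = statistics.pstdev(temps)   # divides by n = 10

print(round(sample_sd, 2))      # 0.93
print(round(population_sd, 2))  # 0.88
```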
The data, the mean and the standard deviation are displayed on the dot plot below.
See: measure of spread, population standard deviation, sample standard deviation, variance
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
- Standard deviation (of a discrete random variable)
A measure of spread for a distribution of a random variable that determines the degree to which the values differ from the expected value.
The standard deviation of random variable X is often written as σ or σX.
For a discrete random variable the standard deviation is calculated by summing the product of the square of the difference between the value of the random variable and the expected value, and the associated probability of the value of the random variable, taken over all of the values of the random variable, and finally taking the square root.
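In symbols, $\sigma = \sqrt{\sum_x (x - \mu)^2 \, P(X = x)}$, where $\mu = E(X)$ is the expected value.
An equivalent formula is $\sigma = \sqrt{\sum_x x^2 \, P(X = x) - \mu^2}$.
The square of the standard deviation is equal to the variance, Var(X) = σ².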
Example
Random variable X has the following probability function:
x          0     1     2     3
P(X = x)   0.1   0.2   0.4   0.3
A bar graph of the probability function, with the mean and standard deviation labelled, is shown below.
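The mean and standard deviation for this probability function can be worked out directly from the verbal description of the calculation. The variable names are chosen here for illustration.

```python
from math import sqrt

probability_function = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.3}

# Expected value: sum of each value times its probability.
mean = sum(x * p for x, p in probability_function.items())
# Variance: sum of squared differences from the mean, weighted by probability.
variance = sum((x - mean) ** 2 * p for x, p in probability_function.items())
sd = sqrt(variance)

print(round(mean, 2), round(variance, 2), round(sd, 3))  # 1.9 0.89 0.943
```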
See: population standard deviation, standard deviation
Curriculum achievement objectives reference
Probability: Level 8
- Standard error
A measure of spread for the values of an estimate, if the sampling method were repeated over and over. As such, a standard error is a measure of the precision of an estimate.
Standard error is used with two similar, but different, meanings. The first meaning is the standard deviation of an estimate. The second meaning is an estimated standard deviation of an estimate.
Estimates vary from sample to sample so that, in general, an estimate is a random variable. The first meaning of standard error is the standard deviation of the random variable representing an estimate.
The standard deviation of the random variable representing an estimate is usually not useful because it depends on the (usually) unknown value of a population parameter. The second meaning of standard error is an estimated standard deviation of the random variable representing an estimate.
Unfortunately, both meanings are in common usage.
Example
Consider the sampling distribution of the sample mean and let random variable $\bar{X}$ represent the sample mean.
The standard deviation of $\bar{X}$ is $\sigma/\sqrt{n}$, where σ is the population standard deviation and n is the sample size.
First meaning: The standard error of the sample mean is $\sigma/\sqrt{n}$.
The population standard deviation, σ, is not known.
Suppose for a sample of size 20, $\bar{x}$ = 49.23 and s = 13.63.
Second meaning: By replacing σ with s the standard error of the sample mean is estimated to be $s/\sqrt{n} = 13.63/\sqrt{20} \approx 3.05$.
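As a quick check of the second meaning, the estimated standard error for a sample of size 20 with s = 13.63 can be computed as s divided by the square root of n:

```python
from math import sqrt

n = 20      # sample size
s = 13.63   # sample standard deviation

standard_error = s / sqrt(n)
print(round(standard_error, 2))  # 3.05
```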
Curriculum achievement objectives references
Statistical investigation: (Level 8)
Statistical literacy: (Level 8)
- Standard form
A number in decimal form may be written as the product of a number that is at least 1 and less than 10, and a power of 10. A number written that way is said to be in standard form. Standard form is also referred to as scientific notation.
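A sketch of converting a number to standard form using Python's `decimal` module follows. The function name and the two sample numbers are invented for illustration.

```python
from decimal import Decimal


def standard_form(text: str) -> tuple[Decimal, int]:
    """Return (m, k) with number = m * 10**k and 1 <= |m| < 10."""
    number = Decimal(text)
    if number == 0:
        raise ValueError("zero cannot be written with a leading digit from 1 to 9")
    exponent = number.adjusted()                     # power of 10 of the leading digit
    mantissa = number.scaleb(-exponent).normalize()  # shift the point, drop trailing zeros
    return mantissa, exponent


print(standard_form("4500"))     # (Decimal('4.5'), 3), i.e. 4.5 x 10^3
print(standard_form("0.00072"))  # (Decimal('7.2'), -4), i.e. 7.2 x 10^-4
```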
- Standard normal distribution
The normal distribution with a mean of 0 and a standard deviation of 1.
Curriculum achievement objectives references
Probability: Levels (7), (8)
- Statistic
A number that is calculated from numerical data.
Statistics listed in this glossary are: mean, median, mode, standard deviation, variance, interquartile range, range, lower quartile, upper quartile.
Alternative: summary statistic
See: sample statistic
Curriculum achievement objectives references
Statistical investigation: Levels (6), (7), (8)
Statistical literacy: Levels 6, (7), (8)
- Statistical distributions
When summarising large masses of raw data it is often useful to distribute the data into classes, or categories, and to determine the number of data points belonging to each class, called the class frequency. A tabular arrangement of data by classes together with the corresponding class frequencies is called a frequency distribution or frequency table.
- Statistical enquiry cycle
A cycle that is used to carry out a statistical investigation. The cycle consists of five stages: Problem, Plan, Data, Analysis, Conclusion. The cycle is sometimes abbreviated to the PPDAC cycle.
The problem section is about formulating a statistical question, what data to collect, who to collect it from and why it is important.
The plan section is about how the data will be gathered.
The data section is about how the data is managed and organised.
The analysis section is about exploring and analysing the data, using a variety of data displays and numerical summaries, and reasoning with the data.
The conclusion section is about answering the question in the problem section and giving reasons based on the analysis section.
Reference: www.censusatschool.org.nz/2005/documents/how-kids-learn.pdf
Curriculum achievement objectives references
Statistical investigation: All levels
- Statistical experiment
A statistical experiment is a random or nondeterministic experiment. Its features are that:
- Each experiment is capable of being repeated indefinitely under essentially unchanged conditions.
- Although we are in general not able to state what a particular outcome will be, we are able to describe the set of all possible outcomes of the experiment.
- As the experiment is performed repeatedly, the individual outcomes seem to occur in a haphazard manner. However, as the experiment is repeated a large number of times, a definite pattern or regularity appears.
- Statistical inference
The process of drawing conclusions about population parameters based on a sample taken from the population.
Example 1
Using a sample mean calculated from a random sample taken from a population to estimate the population mean is an example of statistical inference.
Example 2
Using data from a random sample taken from a population to obtain a 95% confidence interval for the population proportion is an example of statistical inference.
Alternative: inference
Curriculum achievement objectives references
Statistical investigation: Levels 6, 7, 8
- Statistical investigation
An information gathering and learning process that is undertaken to seek meaning from and to learn more about any aspect of the real world, as well as to help make informed decisions and take informed actions. Statistical investigations should use the statistical enquiry cycle (Problem, Plan, Data, Analysis, Conclusion).
Reference: www.censusatschool.org.nz/2005/documents/statistical-investigation.pdf
See: statistical enquiry cycle
Curriculum achievement objectives references
Statistical investigation: All levels
Statistical literacy: Levels 1, 2, 3, 4, 5
- Stem-and-leaf plot
A graph for displaying the distribution of a numerical variable that is similar to a histogram but retains some information about individual values.
Ideally the numbers in the ‘stem’ represent the highest place-value digit in the values and the ‘leaves’ display the second highest place-value digits in each individual value.
To compare the distribution of a numerical variable for two categories of a category variable, a back-to-back stem-and-leaf plot can be drawn, in which the stem is placed at the centre and the leaves for the values of the numerical variable for one category are drawn on one side of the stem and the leaves for the other category are drawn on the other side.
Stem-and-leaf plots are particularly useful when the number of values to be plotted is not large.
Example 1
The actual weights of a random sample of 50 male university students enrolled in an introductory Statistics course at the University of Auckland are displayed on the stem-and-leaf plot below.
Actual weights of male university students (kg)
5 | 1577
6 | 0000002223557889
7 | 00012233455
8 | 00344589999
9 | 008
10 | 0009
11 |
12 | 0
The stem unit is 10kg
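A plot of this kind can be generated from raw values. The sketch below assumes non-negative whole-number values and a stem unit of 10; the function name is invented, and only a handful of the weights readable from the plot above are used.

```python
from collections import defaultdict


def stem_and_leaf(values: list[int]) -> list[str]:
    """Rows of a stem-and-leaf plot with a stem unit of 10."""
    leaves = defaultdict(list)
    for value in sorted(values):
        leaves[value // 10].append(value % 10)
    rows = []
    for stem in range(min(leaves), max(leaves) + 1):
        # Empty stems are kept so that gaps in the distribution stay visible.
        row_leaves = "".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {row_leaves}")
    return rows


# A few of the weights (kg) readable from the plot above.
weights = [51, 55, 57, 57, 90, 90, 98, 100, 100, 100, 109, 120]
print("\n".join(stem_and_leaf(weights)))
```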
Example 2 (Back-to-back stem-and-leaf plot)
The actual weights of random samples of 50 female and 50 male university students enrolled in an introductory Statistics course at the University of Auckland are displayed on the back-to-back stem-and-leaf plot below.
Actual weights of university students (kg)
               Females |    | Males
                     9 |  3 |
              99988876 |  4 |
8876665555554432220000 |  5 | 1577
              88542200 |  6 | 0000002223557889
                  5200 |  7 | 00012233455
                  5550 |  8 | 00344589999
                   330 |  9 | 008
                       | 10 | 0009
                       | 11 |
                       | 12 | 0
The stem unit is 10kg
Alternative: stem plot
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Stratified sampling
A method of sampling in which the population is split into non-overlapping groups (the strata), with the groups having different characteristics that are known for the whole population. A simple random sample is taken from each stratum.
Example
Consider obtaining a sample of students from a secondary school with students from Year 9 to Year 13. The year levels are suitable strata, and the simple random samples taken from each year level form the sample.
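A minimal sketch of the school example, assuming the same number of students is drawn from every year level. The rolls, labels and function name are all invented for illustration.

```python
import random


def stratified_sample(
    strata: dict[str, list[str]], per_stratum: int, seed: int = 0
) -> dict[str, list[str]]:
    """Take a simple random sample of `per_stratum` members from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}


# Hypothetical rolls: 30 invented student labels in each of Years 9 to 13.
school = {f"Year {y}": [f"Y{y}-{i:02d}" for i in range(1, 31)] for y in range(9, 14)}
sample = stratified_sample(school, per_stratum=4)
print(sample["Year 9"])
```

In practice the number taken from each stratum is often chosen in proportion to the size of the stratum, which this sketch does not attempt.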
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
- Strength of evidence
An assessment of how well data, collected to investigate an assertion or question, support the assertion or support a conclusion to the question. The assertion or question usually involves a comparison of a numerical variable between two categories of a category variable (i.e., that there is a link between the numerical variable and the category variable).
Data are collected to investigate the assertion or question. An estimate of a population parameter is calculated from the data. This observed estimate is often a difference between means but could be a difference between two proportions or a slope of a fitted regression line.
Could an estimate as big as the observed estimate be produced just by chance?
To answer this question, the effect of sampling variation alone on the estimate needs to be considered when it is assumed that there is no link between the two variables. If random allocation alone could easily produce an estimate as big as the observed estimate then the data cannot be interpreted as support for the existence of a link between the two variables. Values of the numerical variable obtained from the data collection are randomly allocated to the two categories of the category variable. An estimate is calculated from this ‘resampling using randomisation’ process. This process is repeated many times to form a distribution of estimates under sampling variation alone.
By comparing the observed estimate with the distribution of estimates, an assessment can be made of the strength of evidence the data provide for the assertion or provide for a conclusion to the question. This assessment is made by looking at the percentage of estimates under sampling variation alone that are at least as far from zero as the observed estimate.
If less than about 0.1% of the estimates produced by random allocation alone are at least as far from zero as the observed estimate then the data provide very strong evidence of a link between the two variables.
If about 1% of the estimates produced by random allocation alone are at least as far from zero as the observed estimate then the data provide strong evidence of a link between the two variables.
If about 5% of the estimates produced by random allocation alone are at least as far from zero as the observed estimate then the data provide some evidence of a link between the two variables.
If about 10% of the estimates produced by random allocation alone are at least as far from zero as the observed estimate then the data provide weak evidence of a link between the two variables.
If more than about 12% of the estimates produced by random allocation alone are at least as far from zero as the observed estimate then the data provide no evidence of a link between the two variables.
Two examples of using resampling using randomisation are provided in the description of randomisation. These examples include a conclusion about the strength of evidence the data provide for a link between the two variables.
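The re-randomisation process described in this entry can be sketched as below for a difference between two means. The data, function name and number of repetitions are invented for illustration, and this is not the worked example from the randomisation entry.

```python
import random


def randomisation_percentage(group_a, group_b, repetitions=2000, seed=1):
    """Percentage of re-randomised differences in means that are at least
    as far from zero as the observed difference."""
    rng = random.Random(seed)
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    count = 0
    for _ in range(repetitions):
        rng.shuffle(pooled)  # randomly re-allocate values to the two categories
        new_a, new_b = pooled[: len(group_a)], pooled[len(group_a) :]
        diff = sum(new_a) / len(new_a) - sum(new_b) / len(new_b)
        if abs(diff) >= abs(observed):
            count += 1
    return 100 * count / repetitions


# Hypothetical measurements for two categories, invented for illustration.
a = [11, 12, 13, 14, 15]
b = [1, 2, 3, 4, 5]
print(randomisation_percentage(a, b))
```

The printed percentage is then compared with the guidelines listed above to describe the strength of evidence.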
See: randomisation, resampling
Curriculum achievement objectives reference
Statistical investigation: Level 8
- Strip graph
A graph for displaying the distribution of a category variable or a whole-number variable that uses parts of a rectangular strip to represent the frequencies for each category or value.
Example
A student collected data on the colour of cars that drove past her house and displayed the results on the strip graph below.

Alternative: segmented bar graph
Curriculum achievement objectives references
Statistical investigation: Levels (2), (3), (4), (5), (6)
- Subtraction of
- Whole numbers: Subtraction is an operation of decomposition. On whole numbers, subtraction may be described as the separating of a set into two disjoint sets. In a physical model it is represented by the separating of objects. Subtraction is the inverse operation of addition. Consequently, the basic addition facts give rise to families of facts that also involve subtraction. For example, 4+5 = 9 has three other family members, namely, 5+4 = 9, 9-5 = 4 and 9-4 = 5.
Subtraction is a binary operation, that is, it is an operation on two numbers.
Subtraction is not commutative, that is, the order of the numbers does matter. For example, 7-3 ≠ 3-7.
Subtraction is not associative. The grouping of the numbers does affect the answer. For example, (10-5)-2 ≠ 10-(5-2).
- Fractions: Fractions may be subtracted. If their denominators are the same then we can simply subtract the numerators to obtain the difference. For example, 5/9 - 2/9 = 3/9. If their denominators are not the same then we must choose equivalent fractions so that their denominators are the same. For example, to subtract 1/6 from 1/4 we must find equivalent fractions for 1/6 and 1/4 that have a common denominator. We could multiply the two denominators, 6 and 4, and that process would always give us a common denominator. However, we might observe that the least common multiple of 6 and 4 is actually 12. 1/6 = 2/12, 1/4 = 3/12, so 1/4 - 1/6 = 1/12.
- Decimals: Decimal fractions (commonly called decimals) may be subtracted in the same way that whole numbers are, with care being taken to consider the position of the decimal point. So, for example, tenths are subtracted from tenths, hundredths are subtracted from hundredths, etc.
- Percentages: Percentages may be subtracted as if they were whole numbers or decimals. For example, 15% - 7% = 8% and 2.4% - 1.2% = 1.2%. At an abstract level, a percentage is a numeral representing a real number and therefore, just like decimals, percentages may be subtracted. Care must be taken, however, because of the way that society uses percentages. One often refers to a percentage of something and that can lead to difficulties. For example, although it is true that 10% of 80 minus 5% of 80 is 5% of 80, it is not true that 10% of 80 minus 5% of 60 is 5% of either 80 or 60.
- Integers: Integers may be subtracted by observing the following rule:
a - (-b) = a + b. For example, 4 - (-3) = 4 + 3 = 7. This rule is best discovered using models, such as a black-and-white counters model, in which a white counter represents 1 and a black counter represents -1. The number line can also be helpful.
The properties of subtraction outlined for whole numbers also apply to the subtraction of fractions, decimals, percentages, and integers.
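The worked examples in this entry can be checked with exact arithmetic, using Python's `fractions` module for the fraction cases:

```python
from fractions import Fraction

print(Fraction(5, 9) - Fraction(2, 9))   # 1/3, the simplest form of 3/9
print(Fraction(1, 4) - Fraction(1, 6))   # 1/12
print(4 - (-3))                          # 7
print((10 - 5) - 2, 10 - (5 - 2))        # 3 7, so the grouping matters
print(7 - 3, 3 - 7)                      # 4 -4, so the order matters
```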
- Summary statistics
Numbers calculated from numerical data that are used to summarise the data. The statistics will usually include at least one measure of centre and at least one measure of spread.
Alternatives: descriptive statistics, numerical summary
See: sample statistics
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
- Survey
A systematic collection of data taken by questioning a sample of people taken from a population in order to estimate a population parameter.
Alternative: sample survey
See: poll
Curriculum achievement objectives references
Statistical investigation: Levels 5, (6), 7, 8
Statistical literacy: Levels 7, 8
- Symbols
Mathematics has developed from its early rhetorical phase, where all words relating to operations were written, to its present symbolic phase, where various marks have meanings describing number, operations and relations. Some common symbols are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, +, -, ×, ÷, <, >, =, π, etc.
- Symmetric Patterns
The essential feature of a symmetric geometric pattern is that it can be divided into two or more identical parts, and furthermore that these parts are systematically disposed to one another. In addition, some patterns, such as frieze patterns, will have repetitive elements.
Symmetry means that the parts of a figure are not only congruent but related by an isometry in such a way that the whole figure is self-coincident under that isometry. That is, the whole figure maps onto itself under that isometry. Symmetry of reflection and rotation can be found in many objects and patterns. Kowhaiwhai are an excellent example of frieze patterns and all seven different types of frieze pattern are found in them. They can be analysed in terms of a fundamental region (the smallest region that can generate the pattern) and the isometric transformations acting on the fundamental region to generate the whole pattern. All four isometries are found in the Kowhaiwhai.
- Symmetry
A property of a distribution of a numerical variable when the values below the centre of the distribution are distributed in the same way as the values above the centre.
Many theoretical distributions are not symmetrical. For example, no Poisson distribution is symmetrical.
Frequency distributions from experiments or samples (i.e., experimental distributions or sample distributions) are unlikely to show perfect symmetry. This may be because the distribution of the population from which the values came is not symmetrical. Alternatively, if the distribution of the population from which the values came is symmetrical, then the presence of sampling variation will cause the frequency distribution to not be perfectly symmetrical.
Example (A symmetrical theoretical discrete distribution)
The bar graph displays the probability function of the binomial distribution with n = 10 and π = 0.5. The graph is symmetrical.
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)
- Systematic sampling
A method of sampling from a list of the population so that the sample is made up of every kth member on the list, after randomly selecting a starting point from 1 to k.
Example
Consider choosing a systematic sample of 20 members from a population list numbered from 1 to 836.
To find k, divide 836 by 20 to get 41.8.
Rounding gives k = 42.
Randomly select a number from 1 to 42, say 18.
Start at the person numbered 18 and then choose every 42nd member of the list.
The sample is made up of those numbered:
18, 60, 102, 144, 186, 228, 270, 312, 354, 396, 438, 480, 522, 564, 606, 648, 690, 732, 774, 816
Sometimes rounding may cause the sample size to be one more or one less than the desired size.
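The example can be reproduced with a short sketch. The starting point is fixed at 18 to match the text; in practice it would be chosen at random from 1 to k, for instance with `random.randint(1, k)`. The function name is invented for illustration.

```python
def systematic_sample(population_size: int, k: int, start: int) -> list[int]:
    """Every k-th list position from `start` up to `population_size`."""
    return list(range(start, population_size + 1, k))


k = round(836 / 20)            # 41.8 rounds to 42
sample = systematic_sample(836, k, start=18)
print(len(sample), sample[:3], sample[-1])  # 20 [18, 60, 102] 816
```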
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
t
- Tally chart
A table used to record values for a variable in a data set, by hand, often as the values are collected. One tally mark is used for each occurrence of a value. Tally marks are usually grouped in sets of five, to aid the counting of the frequency for each value.
A tally chart provides an immediate visual form of the distribution.
Example
The number of days in a week that rain fell in Grey Lynn, Auckland, from Monday 2 January 2006 to Sunday 31 December 2006 is recorded in the tally chart below.
The tally chart can then be re-drawn as the following frequency table.
Number of days with rain   Number of weeks
0                          2
1                          5
2                          5
3                          5
4                          19
5                          6
6                          6
7                          4
Total                      52
Curriculum achievement objectives references
Statistical investigation: Levels (1), (2), (3), (4), (5), (6)
- Temperature
Temperature is a physical property of a system that underlies the common notions of hot and cold – something that is hotter has the greater temperature. The temperature of a body is a measure of its relative hotness or coldness. Temperature is usually measured in degrees Celsius (°C), a measurement system in which water freezes at 0°C and boils at 100°C.
- Theoretical distribution
A model for the variation in the values of a variable based on defining the probabilities of values (or intervals of values) of a variable.
Example
The Poisson distribution is a theoretical distribution.
Curriculum achievement objectives references
Statistical investigation: (Level 8)
Probability: Levels 5, 6, 7, (8)
- Three-dimensional
A shape is three-dimensional if it occupies a portion of space. If a shape has volume then it is three-dimensional. It is called three-dimensional because any point in space (often called 3-space) can be described by distances in three independent directions from a fixed point. (See Relating three-dimensional models to two-dimensional representations and vice-versa)
- Time
Time is a fundamental property of physics and can be described only in terms of its measurement. It is defined in terms of the length of a mean solar day, i.e. the average duration of one rotation of Earth with respect to the sun. The basic unit of time is the second, which is defined as 1/86,400 of a day. Other units in common use are the minute (60 seconds) and the hour (60 minutes).
- Time-series data
A data set gathered over time. For one object, such as climate in Rolleston, Canterbury, the values of a variable (or several variables) are obtained at successive times. Usually there are equal intervals between the successive times.
Example
The maximum temperature, rainfall, maximum atmospheric pressure and maximum wind gust speed in Rolleston, recorded daily.
Note: At Level Eight a common approach to modelling time-series data considers the data to have four components: trend component, cyclical component, seasonal component and irregular component.
See: additive model (for time-series data)
Curriculum achievement objectives references
Statistical investigation: Levels 3, 4, (5), (6), (7), 8
- Timetables
Timetables are tables of data that involve time as one of the data measures. Timetables can contain much mathematical information. Tables such as bus timetables usually contain patterns that can be explored as a part of mathematics study. Interpretation of the timetable and its patterns develops the ability to read such tables.
- Transformation
A transformation of the plane or of space is a function (or mapping) that maps the points of the plane or of space to points of the plane or of space. Common transformations are translation, reflection and rotation. See also Multiple transformations.
- Translation
The movement of a figure in the plane (or in space) such that every point moves the same distance and in the same direction. Hence a translation is a shape-preserving (isometric) transformation that involves no rotation (turning) or reflection.
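The definition above can be sketched in a few lines of Python. This is only an illustration: the function names, the example triangle and the shift of (2, 5) are assumptions made here, not part of the glossary.

```python
from math import dist, isclose


def translate(points, dx, dy):
    """Move every point by the same vector (dx, dy)."""
    return [(x + dx, y + dy) for x, y in points]


def side_lengths(points):
    """Lengths of the sides of the polygon whose vertices are given in order."""
    n = len(points)
    return [dist(points[i], points[(i + 1) % n]) for i in range(n)]


# Hypothetical figure: a right triangle with vertices on a grid.
triangle = [(0, 0), (4, 0), (0, 3)]
moved = translate(triangle, 2, 5)

print(moved)
# Every side keeps its length, which is what "shape-preserving" means here.
print(all(isclose(a, b) for a, b in zip(side_lengths(triangle), side_lengths(moved))))
```

Because each point is shifted by the same vector, the distances between points are unchanged and the figure is neither turned nor flipped.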
- Trapezium
A trapezium is a quadrilateral (four-sided polygon) having exactly one pair of parallel sides. The area of a trapezium is the mean of the lengths of the parallel sides multiplied by the distance between them.
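As a brief worked illustration of the area rule, write a and b for the lengths of the parallel sides and h for the perpendicular distance between them. The symbols and the numbers below are chosen here for illustration and do not come from the glossary.

```latex
\[
A = \frac{a + b}{2} \times h,
\qquad \text{e.g. } a = 6,\ b = 10,\ h = 4:\quad
A = \frac{6 + 10}{2} \times 4 = 8 \times 4 = 32 .
\]
```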
- Treatment
In an experiment, the value of the explanatory variable that is chosen by the researcher to be given to each individual in a group.
See: experiment
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
Statistical literacy: (Level 8)
- Tree diagram
A diagram used to represent the possible outcomes in a probability activity that has more than one stage.
From a single starting point, a branch is drawn to represent the outcomes of the first stage. From the end of each branch, a second branch is drawn to represent the outcomes of the second stage, and so on. From the starting point, each path through the tree represents an outcome of the whole activity.
A tree diagram can be a useful tool for obtaining a systematic list of all of the possible outcomes of a probability activity that involves two or three stages (see Example 1). The use of tree diagrams is usually restricted to two-stage or three-stage probability activities because they become too complicated when the total number of outcomes is large.
If the outcomes at a stage are not equally likely to occur then, for that stage, the probability of each outcome is written on its branch.
Example 1 (Outcomes only, no probabilities on branches)
In a game of tennis one player from School A is to play one player from School B. School A has 3 players to choose from (C, D and E) and School B has 2 players to choose from (F and G). If each player has an equal chance of being selected to play for their school, list all of the different possible combinations of games.
Example 2 (Independent stages, probabilities on branches)
In a game of tennis one player from School A is to play one player from School B. School A has 3 players to choose from (C, D and E) and School B has 2 players to choose from (F and G). For School A, the probabilities of C, D or E being selected are 0.6, 0.3 and 0.1 respectively. For School B, the probabilities of F or G being selected are 0.7 and 0.3 respectively. List all of the different possible combinations of games with their probabilities. Assume that choosing a player from School A is independent of choosing a player from School B.
Example 3 (Conditional stages, probabilities on branches)
A jar contains 10 balls, 7 are blue and 3 are red. A ball is randomly taken from the jar and its colour is noted. The ball is not placed back in the jar and a second ball is randomly taken from the jar. List all of the different possible outcomes of this probability activity with their probabilities.
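The paths through the trees in Examples 2 and 3 can be listed with a short Python sketch. It assumes the usual rule that the probability of a path is the product of the probabilities on its branches; the variable and function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Example 2: independent stages, so the second-stage probabilities
# are the same on every branch.
school_a = {"C": Fraction(6, 10), "D": Fraction(3, 10), "E": Fraction(1, 10)}
school_b = {"F": Fraction(7, 10), "G": Fraction(3, 10)}

games = {
    a + b: pa * pb
    for (a, pa), (b, pb) in product(school_a.items(), school_b.items())
}


def two_draws(blue=7, red=3):
    """Example 3: two draws without replacement, so the second-stage
    probabilities depend on which colour was drawn first."""
    total = blue + red
    outcomes = {}
    for first, n_first in (("B", blue), ("R", red)):
        remaining = {"B": blue, "R": red}
        remaining[first] -= 1  # the first ball is not put back
        for second, n_second in remaining.items():
            outcomes[first + second] = (
                Fraction(n_first, total) * Fraction(n_second, total - 1)
            )
    return outcomes


for path, prob in games.items():
    print(path, float(prob))
for path, prob in two_draws().items():
    print(path, prob)
```

Exact fractions are used so that the path probabilities in each tree can be checked to add to 1.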
Curriculum achievement objectives references
Probability: Levels 7, (8)
- Trend
A general tendency among variables in a data set; usually between pairs of variables.
For two numerical variables, as values of one variable increase the trend is any general tendency of the change in the values of the other variable. See Example 1 below.
For two category variables, both of which have a natural ordering of their categories, as transitions are made from the lowest to the highest category, the trend is any general tendency of the changes in the categories of the other variable. See Example 2 below.
For a category variable that has a natural ordering of its categories and a numerical variable, as transitions are made from the lowest to the highest category for the category variable the trend is any general tendency of the changes in the values of the numerical variable. See Example 3 below.
For time-series data the trend is any general tendency to change with time. See Example 4 below.
Example 1 (Two numerical variables)
Data were selected for 86 New Zealand school students from the CensusAtSchool website. The scatter plot below displays the data for their height and right foot length, both in centimetres.
As the length of the right foot increases there is a general tendency for height to increase. The trend is that, generally, an increase in right foot length is associated with an increase in height.
Example 2 (Two category variables, with a natural ordering of categories)
Data were selected for 86 New Zealand school students from the CensusAtSchool website. Two of the variables were their year level (5 or 6, 7 or 8, 9 or 10) and their usual level of lunchtime activity (Sit or stand, Walk, Run). The data are displayed in the two-way table and bar graph below. The table shows frequencies for each cell, as well as row proportions for each of the three groups of year levels.
As we move from year levels 5 or 6 to year levels 9 or 10 there is a general tendency for the proportion running during lunchtime to decrease and the proportion sitting or standing to increase. The trend is that for higher year levels, generally, there is an increase in less vigorous forms of lunchtime activity.
Example 3 (One numerical variable and one category variable with a natural ordering of categories)
Data were selected for 86 New Zealand school students from the CensusAtSchool website. The dot plot below displays the data for their heights, in centimetres, for three groups of year levels.
As we move from year levels 5 or 6 to year levels 9 or 10 there is a general tendency for height to increase. The trend is for students at higher year levels to be taller, in general.
Example 4 (Time-series data)
Statistics New Zealand’s Economic Survey of Manufacturing provided the following data on actual operating income for the manufacturing sector in New Zealand for each quarter from September 2002 to September 2008. Note that M, J, S and D indicate quarter years ending in March, June, September and December respectively.
Over time there is a general tendency for the operating income to increase. The trend is that as time goes by, generally, there is an increase in operating income.
Curriculum achievement objectives references
Statistical investigation: Levels 3, 4, 5, 6, (7), (8)
- Trend component (for time-series data)
The general tendency in time-series data. The trend component is the slow variation in the time series over a long period of time, relative to the interval between the successive times.
See: time-series data
Curriculum achievement objectives reference
Statistical investigation: (Level 8)
- Triangle
A triangle is a polygon with three sides, that is, a portion of the plane bounded by three straight lines. The interior angles of a triangle add to 180°, a fact that can be easily shown by drawing a triangle on paper, cutting the triangle out, tearing off the corners and putting the vertex angles together. A vertex of a triangle is the point where two of the sides meet.
Triangles can be classified by their sides as follows:
Scalene triangle – no sides are equal
Isosceles triangle – at least two sides are equal
Equilateral triangle – all three sides are equal. The equilateral triangle is therefore a special case of the isosceles triangle.
They can also be classified according to the kind of angles they have:
Right triangle – one angle is a right angle (90°).
Obtuse triangle – has an obtuse angle, that is, an angle greater than 90° but less than 180°.
Acute triangle – a triangle with three acute angles, that is, angles that are less than 90°.
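The two classifications above can be sketched in Python. The function names are illustrative, and the tolerance-based comparison with `isclose` is an assumption made here to cope with decimal inputs.

```python
from math import isclose


def classify_by_sides(a, b, c):
    """Return the most specific name; an equilateral triangle is also isosceles."""
    if isclose(a, b) and isclose(b, c):
        return "equilateral"
    if isclose(a, b) or isclose(b, c) or isclose(a, c):
        return "isosceles"
    return "scalene"


def classify_by_angles(a_deg, b_deg, c_deg):
    """Classify from the three interior angles, given in degrees."""
    angles = (a_deg, b_deg, c_deg)
    if min(angles) <= 0 or not isclose(sum(angles), 180):
        raise ValueError("interior angles must be positive and add to 180 degrees")
    largest = max(angles)
    if isclose(largest, 90):
        return "right"
    return "obtuse" if largest > 90 else "acute"


print(classify_by_sides(5, 5, 8))        # isosceles
print(classify_by_angles(100, 40, 40))   # obtuse
```

Only the largest angle needs checking, because a triangle can have at most one angle of 90° or more.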
The area of a triangle is half of the length of the base multiplied by the vertical height. This can be discovered by finding the area of a parallelogram (see Parallelogram) and realising that every triangle can be obtained by bisecting a parallelogram along a diagonal. Hence the area of a triangle is half the area of the associated parallelogram.
- Trigonometric equations
In trigonometric equations the unknown appears as the argument of a trigonometric function, or functions. Simple (or basic) trigonometric equations are trigonometric equations in which only one kind of trigonometric function is present. For example, cos 3x = 1/2.
Care must be taken to allow for all possible solutions.
If cos 3x = 1/2 then 3x = ±arccos(1/2) + 2nπ = ±π/3 + 2nπ.
So x = ±π/9 + 2nπ/3, n an integer.
- Trigonometric ratios
If two triangles have equal angles then the lengths of corresponding sides will be in proportion. Triangles that have equal angles are called similar triangles. The lengths of any two corresponding sides in two similar triangles will be in proportion even though they may not be equal. So if one side in the larger triangle is double the length of the corresponding side in the smaller triangle then the other two sides in the larger triangle will also be double the length of the corresponding sides in the smaller triangle. This property is used in defining the fundamental trigonometric functions of angles.
Consider two similar right triangles, ABC and DEF, with right angles at B and E. AC and DF are the two hypotenuses. CB is the side opposite angle A and FE is the side opposite angle D. AB is the side adjacent to angle A and DE is the side adjacent to angle D. The fundamental trigonometric functions are defined as follows:
The sine of A: sin A = length of CB divided by length of AC = sin D
The cosine of A: cos A = length of AB divided by length of AC = cos D
The tangent of A: tan A = length of CB divided by length of AB = tan D
The cosecant of A: cosec A = length of AC divided by length of CB
The secant of A: sec A = length of AC divided by length of AB
The cotangent of A: cot A = length of AB divided by length of CB
The numerical value of these trigonometric functions depends only on the size of the acute angles in the right triangle and not on the ‘size’ of the triangle.
These functions may be summarised as follows:
sin = opposite/hypotenuse
cos = adjacent/hypotenuse
tan = opposite/adjacent
cosec = hypotenuse/opposite
sec = hypotenuse/adjacent
cot = adjacent/opposite
So cosec = 1/sin
sec = 1/cos
cot = 1/tan
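The summary above, and the claim that the ratios do not depend on the size of the triangle, can be checked with a small Python sketch. The 3-4-5 triangle and the function name `trig_ratios` are assumptions chosen for illustration; the hypotenuse is found with Pythagoras' theorem.

```python
from math import hypot, isclose


def trig_ratios(opposite, adjacent):
    """Six ratios for an acute angle of a right triangle, given the
    lengths of the sides opposite and adjacent to that angle."""
    hypotenuse = hypot(opposite, adjacent)  # Pythagoras' theorem
    return {
        "sin": opposite / hypotenuse,
        "cos": adjacent / hypotenuse,
        "tan": opposite / adjacent,
        "cosec": hypotenuse / opposite,
        "sec": hypotenuse / adjacent,
        "cot": adjacent / opposite,
    }


small = trig_ratios(3, 4)  # a 3-4-5 right triangle
large = trig_ratios(6, 8)  # every side doubled, so a similar triangle

for name in small:
    print(f"{name:5s} small={small[name]:.4f} large={large[name]:.4f}")

# The ratios agree even though the triangles differ in size.
print(all(isclose(small[name], large[name]) for name in small))
```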
The inverse functions are arcsin, arccos, arctan etc.
- Turn
See rotation.
- Two-dimensional
A shape is two-dimensional if it can be made to lie wholly in a plane. Hence a two-dimensional shape has area but no volume. It is called two-dimensional because any point in a plane can be described by distances from a fixed point in two independent directions, such as in the Cartesian plane (See Coordinate systems).
- Two-dimensional representations of three-dimensional solids
See Relating three-dimensional models to two-dimensional representations and vice-versa.
- Two-way table
A table in which the rows represent the categories for one category variable, the columns represent the categories of a second category variable and each cell displays the frequency (or proportion) resulting for that row and column combination for the two variables.
Example
Data were collected from answers to an online questionnaire from 727 students enrolled in an introductory Statistics course at the University of Auckland. Two of the variables of interest are the gender of the student and the course in which they were enrolled (STATS 101, STATS 102 or STATS 108). The following two-way table was formed by counting the number of students falling into each combination of categories of the two variables.

Alternative: contingency table
Curriculum achievement objectives references
Probability: Levels 7, (8)
u
- Uncertainty
If the probability of an event is neither 0 (impossible) nor 1 (certain) then there is an element of uncertainty involved. The event may or may not occur. So if today is Wednesday then we can be certain that tomorrow will be Thursday. But if it hasn’t rained in Alice Springs for three years we cannot be certain that it will not rain tomorrow. Children often confuse certainty with events that are highly likely to occur. For example, if the All Blacks are playing Botswana then a large proportion of children will claim that an All Black win is certain even though there is obviously an element of uncertainty in the outcome.
- Units of measurement
A unit of measurement is an item or quantity that has the attribute being measured and which can be compared with the object being measured. For example, the length of a desk could be measured in handspans or in centimetres. Both have the attribute of length; the handspan is a nonstandard unit of measurement, the centimetre a standard unit of measurement. (For a list of the commonly used SI units see SI measurement units)
- Upper quartile
See: quartiles
Curriculum achievement objectives references
Statistical investigation: Levels (5), (6), (7), (8)
v
- Variability
The tendency for a property to have different values for different individuals or to have different values at different times.
Curriculum achievement objectives references
Statistical investigation: Levels (1), (2), (3), (4), (5), (6), 7, (8)
Probability: Levels (1), (2), (3), (4), (5), (6), (7), (8)
- Variable
A property that may have different values for different individuals or that may have different values at different times.
Curriculum achievement objectives references
Statistical investigation: Levels 4, 5, 6, 7, (8)
Probability: Levels (7), 8
- Variance
A measure of spread for a distribution of a numerical variable that determines the degree to which the values differ from the mean. If many values are close to the mean then the variance is small and if many values are far from the mean then the variance is large.
It is calculated as the average of the squares of the deviations of the values from their mean.
The variance can be influenced by unusually large or unusually small values.
The positive square root of the variance is equal to the standard deviation.
It is recommended that a calculator or software is used to calculate the variance. On a calculator the square of the standard deviation will give the variance.
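As a minimal sketch of the calculation described above, the following Python code averages the squared deviations from the mean and compares the result with the standard library. The data values are made up for illustration. Note that `statistics.pvariance` divides by the number of values, as this entry does, whereas `statistics.variance` computes the sample variance and divides by one less.

```python
from math import isclose, sqrt
from statistics import pstdev, pvariance


def variance(values):
    """Average of the squared deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical data set (not from the glossary); its mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

var = variance(data)
print(var)        # 4.0
print(sqrt(var))  # 2.0, the standard deviation

# Squaring the standard deviation gives the variance back.
print(isclose(pstdev(data) ** 2, pvariance(data)))
```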
See: measure of spread, sample variance, standard deviation
Curriculum achievement objectives references
Statistical investigation: Levels (7), (8)
- Variance (of a discrete random variable)
A measure of spread for a distribution of a random variable that determines the degree to which the values of a random variable differ from the expected value.
The variance of random variable X is often written as Var(X) or $\sigma^2$.
For a discrete random variable the variance is calculated by summing, over all of the values of the random variable, the product of the squared difference between the value and the expected value, and the probability of that value.
In symbols, $\mathrm{Var}(X) = \sum_x (x - \mu)^2 \, P(X = x)$, where $\mu = E(X)$.
An equivalent formula is $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$.
The square root of the variance is equal to the standard deviation.
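The two ways of calculating the variance can be compared with a short Python sketch. It uses the probability function from this entry's example (values 0, 1, 2, 3 with probabilities 0.1, 0.2, 0.4, 0.3); the function names are illustrative only.

```python
from math import isclose, sqrt


def expected_value(distribution):
    """E(X): sum of each value times its probability."""
    return sum(x * p for x, p in distribution.items())


def variance_by_definition(distribution):
    """Sum of (x - mu)^2 * P(X = x) over all values x."""
    mu = expected_value(distribution)
    return sum((x - mu) ** 2 * p for x, p in distribution.items())


def variance_by_shortcut(distribution):
    """E(X^2) - [E(X)]^2."""
    e_x2 = sum(x ** 2 * p for x, p in distribution.items())
    return e_x2 - expected_value(distribution) ** 2


# Maps each value x to P(X = x).
distribution = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.3}

v1 = variance_by_definition(distribution)
v2 = variance_by_shortcut(distribution)
print(round(v1, 4), round(v2, 4))  # both 0.89
print(round(sqrt(v1), 4))          # standard deviation
print(isclose(v1, v2))
```

The results are rounded for display because floating-point arithmetic can leave tiny errors in the last decimal places.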
Example
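Random variable X has the following probability function:
x | 0 | 1 | 2 | 3
P(X = x) | 0.1 | 0.2 | 0.4 | 0.3
Using $\mathrm{Var}(X) = \sum_x (x - \mu)^2 \, P(X = x)$:
$\mu = 0 \times 0.1 + 1 \times 0.2 + 2 \times 0.4 + 3 \times 0.3 = 1.9$
$\mathrm{Var}(X) = (0 - 1.9)^2 \times 0.1 + (1 - 1.9)^2 \times 0.2 + (2 - 1.9)^2 \times 0.4 + (3 - 1.9)^2 \times 0.3 = 0.89$
Using $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$:
$E(X) = 0 \times 0.1 + 1 \times 0.2 + 2 \times 0.4 + 3 \times 0.3 = 1.9$
$E(X^2) = 0^2 \times 0.1 + 1^2 \times 0.2 + 2^2 \times 0.4 + 3^2 \times 0.3 = 4.5$
$\mathrm{Var}(X) = 4.5 - 1.9^2 = 0.89$
See: population variance, variance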
Curriculum achievement objectives reference
Probability: (Level 8)
- Variation
The differences seen in the values of a property for different individuals or at different times.
Curriculum achievement objectives references
Statistical investigation: Levels 4, 5, 6
Probability: Levels (1), (2), (3), 4, 5, (6), (7), (8)
- Views and pathways from locations on a map
Children can be asked questions regarding a map such as: "If you were standing at the corner of Smith and Jones Streets looking towards the water tower, what would you see on your left?" More advanced learners can be asked questions such as: "If you started at position (3,5) and you travelled northeast for 6 km what would you find?"
- Volume
Volume is a measure of the amount of space that a three-dimensional object occupies. Lines and planes are considered to have no volume. The basic unit of measurement of volume is the cubic metre (m³), with the cubic centimetre (cm³ or c.c.) also in common use. Units used more commonly as units of capacity may also be used, in particular the litre (l), equal to 1000 cm³, and the millilitre (ml), equal to 1 cm³. (See SI measurement units)
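The relationships between these units can be written out as a short chain of conversions. This is a sketch using the standard fact that 1 m = 100 cm, which the entry itself does not state.

```latex
\[
1\ \mathrm{m^3} = (100\ \mathrm{cm})^3 = 1\,000\,000\ \mathrm{cm^3} = 1000\ \mathrm{l},
\qquad
1\ \mathrm{l} = 1000\ \mathrm{cm^3} = 1000\ \mathrm{ml}.
\]
```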
- Volumes of cuboids, prisms etc.
w
- Weight
The measure of the heaviness of an object. It is the force that results from the action of gravity on matter. The term ‘weight’ is often used when strictly speaking ‘mass’ is meant. The distinction between mass and weight is unimportant for most practical purposes and we commonly use units of mass (kilogram, gram, tonne etc.) as units of weight. (See SI measurement units)
- Whole Numbers
{0,1,2,3,…} is the set of whole numbers. Hence the whole numbers are all the natural numbers as well as the number zero. (See Base ten numeration system)
- Whole-number data
Data in which the values result from counting, or from measuring with the values rounded to a whole number.
Example 1 (Values from counting)
The number of students absent from each class in a primary school on a particular day.
Example 2 (Values from measuring)
The heights of a class of Year 9 students recorded to the nearest centimetre.
See: numerical data, quantitative data
Curriculum achievement objectives references
Statistical investigation: Levels 2, 3, (4), (5), (6), (7), (8)
- Whole-number variable
A property that may have different values for different individuals and for which these values result from counting, or from measuring with the values rounded to a whole number.
Example 1 (Values from counting)
The number of students absent from each class in a primary school on a particular day.
Example 2 (Values from measuring)
The heights of a class of Year 9 students recorded to the nearest centimetre.
See: numerical variable, quantitative variable
Curriculum achievement objectives references
Statistical investigation: Levels (4), (5), (6), (7), (8)



