Introduction to Statistics
Statistics represents a body of knowledge which enables one to deal with quantitative data reflecting any degree of uncertainty. There are six basic aspects of applied statistics.
These are:
1. Type of data
2. Random variables
3. Models
4. Parameters
5. Sample statistics
6. Characterization of chance occurrences
From these can be developed strategies and procedures for dealing with (1) estimation and (2) inferential statistics. The discussion that follows is directed more toward inferential statistics because of its broader utility.
Detailed illustrations and examples are used throughout to develop basic statistical methodology for dealing with a broad area of applications. However, in addition to this material, there are many specialized topics as well as some very subtle areas which have not been discussed. The references should be used for more detailed information.
Type of Data
In general, statistics deals with two types of data: counts and measurements. Counts represent the number of discrete outcomes, such as the number of defective parts in a shipment, the number of lost-time accidents, and so forth. Measurement data are treated as a continuum. For example, the tensile strength of a synthetic yarn theoretically could be measured to any degree of precision.
A subtle aspect associated with count and measurement data is that some types of count data can be dealt with through the application of techniques which have been developed for measurement data alone. This ability is due to the fact that some simplified measurement statistics serve as an excellent approximation for the more tedious count statistics.
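To make this point concrete, the short sketch below (an illustrative addition, not part of the original text) compares an exact binomial count probability with its normal, measurement-type approximation; the use of Python's scipy library and the specific values n = 100 and p = .05 are assumptions chosen only for illustration.

```python
# Illustrative sketch: approximating a count statistic (binomial) by a
# measurement statistic (normal).  The shipment size n = 100 and defect
# rate p = 0.05 are hypothetical values used only for this example.
from scipy.stats import binom, norm

n, p = 100, 0.05
mean = n * p                       # binomial mean
sd = (n * p * (1 - p)) ** 0.5      # binomial standard deviation

# Exact count probability: P(number of defectives <= 8)
exact = binom.cdf(8, n, p)

# Normal approximation with a continuity correction of 0.5
approx = norm.cdf((8 + 0.5 - mean) / sd)

print(f"exact binomial: {exact:.4f}   normal approximation: {approx:.4f}")
```

The two results are typically close, which is why the simpler normal-based procedures are often used in place of the more tedious exact binomial calculations.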
Random Variables
Applied statistics deals with quantitative data. In tossing a fair coin the successive outcomes would tend to be different, with heads and tails occurring randomly over a period of time. Given a long strand of synthetic fiber, the tensile strength of successive samples would tend to vary significantly from sample to sample.
Counts and measurements are characterized as random variables, that is, observations which are susceptible to chance. Virtually all quantitative data are susceptible to chance in one way or another.
Models
Part of the foundation of statistics consists of the mathematical models which characterize an experiment.
The models themselves are mathematical ways of describing the probability, or relative likelihood, of observing specified values of random variables. For example, in tossing a coin once, a random variable x could be defined by assigning to x the value 1 for a head and 0 for a tail. Given a fair coin, the probability of observing a head on a toss would be .5, and similarly for a tail. Therefore, the mathematical model governing this experiment can be written as in the table below:
x | P(x)
0 | .5
1 | .5
where P(x) stands for what is called a probability function. This term is reserved for count data, in that probabilities can be defined for particular outcomes.
The probability function that has been displayed is a very special case of the more general case, which is called the binomial probability distribution. For measurement data, which are considered continuous, the term probability density is used. For example, consider a spinner wheel which conceptually can be thought of as being marked off on the circumference infinitely precisely from 0 up to, but not including, 1. In spinning the wheel, the probability of the wheel's stopping at any particular marked value x, where 0 ≤ x < 1, is zero; for example, stopping exactly at the value x = √.5.
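As a brief added sketch (not part of the original text), the coin-toss model above is simply the binomial distribution with a single trial, and the contrast with a continuous variable can be seen by asking for the probability of a single point; scipy's binom and uniform distributions are used here only as a convenience.

```python
# Sketch: the coin-toss probability function viewed as a binomial with one trial.
from scipy.stats import binom, uniform

# One toss of a fair coin: x = 0 (tail) or x = 1 (head)
for x in (0, 1):
    print(x, binom.pmf(x, n=1, p=0.5))      # reproduces the table: P(0) = P(1) = .5

# For a continuous random variable (the spinner wheel), any single point
# carries zero probability; only intervals have nonzero probability.
print(uniform.cdf(0.5) - uniform.cdf(0.5))  # P[x = 0.5] = 0.0
```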
For the spinning wheel, the probability density function would be defined by f(x) = 1 for 0 ≤ x < 1. Graphically, this is shown in the figure. The relative-probability concept refers to the fact that the density reflects the relative likelihood of occurrence; in this case, each number between 0 and 1 is equally likely.
For measurement data, probability is defined by the area under the curve between specified limits. A density function always must have a total area of 1.
Example For the density shown in the figure, it follows that
P[0 ≤ x ≤ .4] = .4
P[.2 ≤ x ≤ .9] = .7
P[.6 ≤ x < 1] = .4
and so forth. Since the probability associated with any particular point value is zero, it makes no difference whether the limit point is defined by a closed interval (≤ or ≥) or an open interval (< or >).
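The interval probabilities in the example can be checked numerically; the short sketch below (an added illustration) computes each one as an area under the uniform density on [0, 1), using scipy's uniform distribution as an assumed convenience.

```python
# Sketch: interval probabilities for the spinner-wheel density f(x) = 1, 0 <= x < 1.
from scipy.stats import uniform

def interval_prob(a, b):
    """Area under the uniform(0, 1) density between a and b."""
    return uniform.cdf(b) - uniform.cdf(a)

print(interval_prob(0.0, 0.4))   # P[0  <= x <= .4] = .4
print(interval_prob(0.2, 0.9))   # P[.2 <= x <= .9] = .7
print(interval_prob(0.6, 1.0))   # P[.6 <= x <  1 ] = .4
print(interval_prob(0.0, 1.0))   # total area under the density = 1
```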
Many different types of models are used as the foundation for statistical analysis. These models are also referred to as populations.
Parameters
As a way of characterizing probability functions and densities, certain types of quantities called parameters can be defined.
For example, the center of gravity of the distribution is defined to be the population mean, which is designated as μ. For the coin toss μ = .5, which corresponds to the average value of x; i.e., for half of the time x will take on the value 0 and for the other half the value 1. The average would be .5. For the spinning wheel, the average value would also be .5.
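As a final added sketch (an illustration under assumed tooling, not from the source), the mean of each of the two models can be computed directly: a probability-weighted sum for the coin toss and an integral for the spinner wheel.

```python
# Sketch: the population mean (mu) as the "center of gravity" of a distribution.
from scipy.integrate import quad

# Discrete case (coin toss): mu = sum over x of x * P(x)
P = {0: 0.5, 1: 0.5}
mu_coin = sum(x * p for x, p in P.items())
print(mu_coin)        # 0.5

# Continuous case (spinner wheel): mu = integral of x * f(x) dx over [0, 1)
f = lambda x: 1.0     # uniform density on [0, 1)
mu_wheel, _ = quad(lambda x: x * f(x), 0.0, 1.0)
print(mu_wheel)       # 0.5
```

Both calculations return .5, in agreement with the values quoted above.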