Hypothesis testing is a statistical procedure for deciding between two possible statements about a population parameter. For example
the average marks of boys and girls in mathematics are equal
the average marks of boys is more than that of girls
Hypothesis testing is also called significance testing.
What is a hypothesis?
A hypothesis is a specific statement about a population parameter. It is a proposed explanation or prediction for a phenomenon, formulated from prior knowledge, observations, and logical reasoning. It is testable and can be either supported or refuted through experimentation and data analysis. We generally use two types of hypotheses.
Research hypothesis
A research hypothesis is a tentative statement about the expected outcomes for the variables in a study. One example of a research hypothesis is: “Technology-integrated pedagogy enhances students’ learning compared with traditional pedagogy”.
Statistical hypothesis
A statistical hypothesis is a statement about a population parameter. One example of a statistical hypothesis is: “The mean achievement of fifth graders taught by technology-integrated pedagogy is greater than the mean achievement of those taught by traditional pedagogy”. Here the mean is the population parameter. A statistical hypothesis is stated in two forms:
Null hypothesis
Alternative hypothesis
What is the difference between a research hypothesis and a statistical hypothesis?
The basic difference is:
Research hypothesis: Expressed in a narrative form
Statistical hypothesis: Expressed in mathematical or statistical terms
A statistical hypothesis that states no difference or no effect, and therefore carries no direction, is called the null hypothesis. It is denoted by \( H_0\).
The null hypothesis generally states that there is no difference between population parameters (or no difference between a population parameter and a hypothesized value).
For example, if we are testing the hypothesis that technology-integrated pedagogy enhances students' mean achievement over traditional pedagogy, then the null hypothesis is:
\( H_0: \mu_1=\mu_2 \)
that is, students' mean achievement under technology-integrated pedagogy and under traditional pedagogy are equal,
where
\( \mu_1 \) =mean achievement taught by technology integrated pedagogy
\( \mu_2 \) = mean achievement taught by traditional pedagogy
What is the alternative hypothesis?
The statistical hypothesis that expresses the remaining (opposite) outcomes from the null hypothesis is called the alternative hypothesis. It is denoted by \( H_1\). It is directional when it specifies "greater than" or "less than", and non-directional when it specifies only "not equal".
It is the hypothesis that carries the researcher's claim.
If we are testing the hypothesis that technology-integrated pedagogy enhances students' mean achievement over traditional pedagogy, then the alternative hypothesis can be written as follows:
\( H_1: \mu_1 >\mu_2 \)
where
\( \mu_1 \) = mean achievement taught by technology integrated pedagogy
\( \mu_2 \) = mean achievement taught by traditional pedagogy
What are the types of alternative hypothesis?
Based on the given question, the alternative hypothesis can be designed in three different ways.
For example, if we are testing a hypothesis about students' mean achievement under technology-integrated pedagogy versus traditional pedagogy, then the three types of alternative hypothesis can be designed as follows:
Two tailed \( H_1 :\mu_1 \neq \mu_2 \)
student's mean achievement taught by technology integrated pedagogy and traditional pedagogy are NOT equal
Left tailed \( H_1 :\mu_1 < \mu_2 \)
student's mean achievement taught by technology integrated pedagogy is less than the traditional pedagogy
Right tailed \( H_1 :\mu_1 > \mu_2 \)
student's mean achievement taught by technology integrated pedagogy is greater than traditional pedagogy
In a hypothesis test, the alternative hypothesis can be one-tailed or two-tailed. When the rejection region is taken on both ends of the sampling distribution, the test is called a two-sided or two-tailed test. In a two-tailed test, the alternative hypothesis takes the "≠" (not equal) sign. Therefore, we reject the null hypothesis if the sample statistic falls in either tail of the distribution. For this reason, the alpha level (α) is split across the two tails, so each tail has a probability of α/2.
What is a one-tailed test?
In hypothesis testing, if the rejection region is taken on only one end of the sampling distribution, the test is called a one-tailed or one-sided test. In a one-tailed test, the alternative hypothesis takes the greater than or less than symbol (> or <). We classify one-tailed tests into two categories:
Left tailed test (<)
We reject the null hypothesis if the sample statistic falls in the left tail of the distribution. Here the whole alpha level (α) is placed in the left tail; it is not split.
Right tailed test (>)
We reject the null hypothesis if the sample statistic falls in the right tail of the distribution. Here the whole alpha level (α) is placed in the right tail; it is not split.
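The way α is allocated to the tails can be sketched in a few lines of Python. This is only an illustration under the assumption that the test statistic follows the standard normal distribution; the function name `critical_values` is a hypothetical helper, not something defined in these notes.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the z critical value(s) for the given tail type (assumes a z-test)."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-c, c)
    if tail == "left":
        return (z.inv_cdf(alpha),)  # all of alpha in the left tail
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)  # all of alpha in the right tail
    raise ValueError("tail must be 'two', 'left' or 'right'")


for tail in ("two", "left", "right"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

At α = 0.05 this gives about ±1.96 for the two-tailed test and about -1.645 or 1.645 for the one-tailed tests, which shows why a one-tailed critical value sits closer to the centre than a two-tailed one at the same α.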
In hypothesis testing, we cannot be one hundred percent sure that the decision is correct, because it is based on probability. So, we fix in advance the chance of committing an error that we are willing to accept. This chance is known as the significance level or level of significance, and it is denoted by α.
The level of significance is the probability of rejecting the null hypothesis given that it is true (Type I error).
The commonly used levels of significance in educational research are 1%, 5% and 10%; among them, 5% is the most common.
A 5% level of significance means that we accept a 5% probability of rejecting a true null hypothesis. In other words, if the null hypothesis is true, the sample statistic will lead us to a wrong rejection, by chance, about 5 times out of 100.
If α = 0.05 (or 5%) and the null hypothesis is true, then we expect to obtain a sample statistic that falls in the critical region 5% of the time.
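The meaning of α as a long-run error rate can be checked with a small simulation. The sketch below is an illustration only: the population (mean 100, SD 15), the sample size, and the function name `type_one_error_rate` are all assumptions made up for the demonstration. It repeatedly draws samples from a population where the null hypothesis is true and counts how often a two-tailed z-test rejects it.

```python
import random
from statistics import NormalDist


def type_one_error_rate(alpha=0.05, n=25, trials=20000, seed=0):
    """Estimate how often a two-tailed z-test rejects a null hypothesis that is true."""
    rng = random.Random(seed)
    mu0, sigma = 100.0, 15.0  # hypothetical population; H0: mu = 100 is true by construction
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        z = (sample_mean - mu0) / (sigma / n ** 0.5)
        if abs(z) >= crit:  # statistic falls in the critical region
            rejections += 1
    return rejections / trials


rate = type_one_error_rate()
print(f"estimated Type I error rate: {rate:.3f}")
```

The estimated rate should come out close to 0.05, up to simulation noise, which is what the definition of the level of significance predicts.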
What are the types of error?
In hypothesis testing, there are four possible outcomes (two correct and two incorrect). These possible outcomes are as follows:
The null hypothesis is true and the test does not reject it. This is a correct decision.
The null hypothesis is false and the test rejects it. This is also a correct decision.
The null hypothesis is true and the test rejects it. This is an incorrect decision, and the researcher commits an error called the α-error, or Type I error.
The null hypothesis is false and the test does not reject it. This is again an incorrect decision, and the researcher commits an error called the β-error, or Type II error.
Critical values are the values that separate the rejection (critical) region from the acceptance region. A critical value is also called a significance value. It depends upon
Level of significance
Alternative hypothesis, whether it is two tailed or one tailed test.
Test Statistic
[Figures: critical values for the two-tailed test (≠), the left-tailed test (<), and the right-tailed test (>)]
What is the critical region?
The critical region is the area in the tail(s) of the distribution that leads us to reject the null hypothesis. In other words, the critical region is the set of values of the test statistic for which we reject \(H_0\).
The p-value is a probability, so it lies between 0 and 1. It is the probability, computed assuming H0 is true, of obtaining a test statistic at least as extreme as the one actually observed.
A low p-value indicates that the sample provides enough evidence to reject the null hypothesis. If the p-value is less than (or equal to) α, then the null hypothesis is rejected.
A high p-value indicates that the sample does not provide enough evidence against the null hypothesis. If the p-value is greater than α, then the null hypothesis is not rejected.
Hypothesis testing is a statistical procedure to decide whether the information in the sample is consistent or inconsistent with the null hypothesis about the population parameter. There are two commonly used approaches in hypothesis testing. One is the critical-value approach, and the other is the probability-value (p-value) approach.
Critical-value approach
The five basic steps of the critical-value approach to a hypothesis test are:
Formulate H0 and H1 and specify α
Choose the test statistic
Determine the critical region
Calculate the value of the test statistic
Decide about the hypothesis (reject H0 if the value from step 4 falls in the critical region)
If the value of the test statistic falls in the critical region (or if it is equal to the critical value), we reject H0. Otherwise, we cannot reject H0.
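The steps of the critical-value approach can be sketched as a short Python function. This is a minimal sketch, assuming a one-sample z-test with known population standard deviation; the function name `z_test_critical` and the numbers in the call are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_critical(xbar, mu0, sigma, n, alpha, tail="two"):
    """One-sample z-test by the critical-value approach (sigma assumed known)."""
    std_normal = NormalDist()
    # Calculate the value of the test statistic
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Determine the critical region for the chosen alternative and decide
    if tail == "two":
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "left":
        reject = z <= std_normal.inv_cdf(alpha)
    elif tail == "right":
        reject = z >= std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return z, reject


# Hypothetical data: H0: mu = 50 against H1: mu > 50 at alpha = 0.05
z, reject = z_test_critical(xbar=52.5, mu0=50, sigma=10, n=64, alpha=0.05, tail="right")
print(round(z, 2), "reject H0" if reject else "cannot reject H0")
```

With these made-up numbers the statistic is z = 2.0, which is beyond the right-tailed critical value of about 1.645, so H0 is rejected.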
p-value approach
A hypothesis test can also be carried out by the p-value approach. For this, we follow these steps:
Formulate H0 and H1 and specify α
Choose the test statistic
Calculate the value of the test statistic
Compute the p-value
Decide about the hypothesis (reject H0 if p-value \( \leq \) α)
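The p-value steps can be sketched in Python as follows. This is an illustration under the assumption of a one-sample z-test with known population standard deviation; the function name `z_test_p_value` and the data in the call are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(xbar, mu0, sigma, n, tail="two"):
    """One-sample z-test by the p-value approach (sigma assumed known)."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf
    if tail == "two":
        p = 2 * (1 - cdf(abs(z)))  # both tails, at least as extreme as |z|
    elif tail == "left":
        p = cdf(z)  # area to the left of z
    elif tail == "right":
        p = 1 - cdf(z)  # area to the right of z
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return z, p


# Hypothetical data: H0: mu = 50 against H1: mu > 50 at alpha = 0.05
alpha = 0.05
z, p = z_test_p_value(xbar=52.5, mu0=50, sigma=10, n=64, tail="right")
print(round(z, 2), round(p, 4), "reject H0" if p <= alpha else "cannot reject H0")
```

For these made-up numbers the p-value is roughly 0.023, which is below α = 0.05, so H0 is rejected. The decision always agrees with the critical-value approach because both compare the same statistic with the same tail area.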
Let \( \overline{X} \) be the mean of a random sample of size \(n\) from a normal population with mean \( \mu \) and variance \( \sigma^2\). Then \( Z=\frac{\overline{X}-\mu}{\frac{\sigma}{\sqrt{n}}}\) follows the standard normal distribution. The value of Z that cuts off an area of \(\frac{\alpha}{2}\) in the left tail is denoted by \(-Z_{\frac{\alpha}{2}} \), and the value that cuts off the same area in the right tail is denoted by \(Z_{\frac{\alpha}{2}} \). The area enclosed between \(-Z_{\frac{\alpha}{2}} \) and \(Z_{\frac{\alpha}{2}} \) is called the \((1-\alpha)\)100% confidence region. Hence the probability statement for the degree of confidence \((1-\alpha)\)100% is
\(P \left[-Z_{\frac{\alpha}{2}} \le Z \le Z_{\frac{\alpha}{2}} \right] =(1-\alpha)\) or
\(P \left[-Z_{\frac{\alpha}{2}} \le \frac{\overline{X}-\mu}{\frac{\sigma}{\sqrt{n}}} \le Z_{\frac{\alpha}{2}} \right] =(1-\alpha)\) or
\(P \left[-Z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \le \overline{X}-\mu \le Z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \right] =(1-\alpha)\) or
\(P \left[\overline{X}-Z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \le \mu \le \overline{X}+Z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \right] =(1-\alpha)\)
Hence, the \((1-\alpha)\)100% confidence interval for \( \mu \) with known variance \( \sigma^2\) is
\( \overline{X}-Z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \le \mu \le \overline{X}+Z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \)
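The interval formula can be evaluated numerically with a short Python sketch. The function name `z_confidence_interval` and the input values are illustrative assumptions; the code only applies the formula for a normal population with known σ.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, alpha):
    """(1 - alpha) confidence interval for the mean when sigma is known."""
    z_half = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    margin = z_half * sigma / sqrt(n)  # Z_{alpha/2} * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Illustrative values: sample mean 40, sigma 16, n = 16, 99% confidence
low, high = z_confidence_interval(xbar=40, sigma=16, n=16, alpha=0.01)
print(round(low, 2), round(high, 2))
```

For these illustrative values the interval runs from about 29.7 to about 50.3. A two-tailed test at level α does not reject a hypothesized mean that lies strictly inside the corresponding \((1-\alpha)\)100% interval.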
Matrix Showing Confidence Interval Estimate of Mean
If the population variance is known: \( z=\frac{\overline X -\mu}{\frac{\sigma}{\sqrt{n}}} \)
If the population variance is unknown and the sample is large: \( z=\frac{\overline X -\mu}{\frac{s}{\sqrt{n}}} \)
If the population variance is unknown and the sample is small: \( t=\frac{\overline X -\mu}{\frac{s}{\sqrt{n}}} \) with \( n-1 \) df
Example 1
A random sample of size 16 has mean 40. If the sample is drawn from a normal population with variance 256, test the hypothesis \( \mu =45 \) at the 0.01 level of significance.
Solution
Given that \( n=16, \overline X =40, \sigma^2 = 256, \alpha = 0.01 \)
The hypotheses are \( H_0: \mu = 45 \) and \( H_1: \mu \neq 45 \).
Since the population variance is known, we use the Z-statistic. Thus, \( Z_{\frac{\alpha}{2}}=Z_{\frac{0.01}{2}}=Z_{0.005} \approx 2.576 \)
The left critical value is -2.576
The right critical value is 2.576
On the basis of sample data, value of test statistic is \( z=\frac{\overline X -\mu}{\frac{\sigma}{\sqrt{n}}} \) or
\( z=\frac{40 -45}{\frac{16}{\sqrt{16}}} \) or
\( z=\frac{-5}{\frac{16}{4}} \) or
\( z=\frac{-5}{4} \) or
\( z=-1.25 \)
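Here, \( z=-1.25 \) does not lie in the critical region, so \( H_0 \) cannot be rejected.
Interpretation: At the 0.01 level of significance, the sample does not provide sufficient evidence that the population mean differs from 45. This is not the same as proving that the population mean is 45.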
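The arithmetic of Example 1 can be checked with a few lines of Python. This is only a verification sketch using the numbers given in the example; the variable names are chosen here for readability, and the p-value is an extra quantity that the worked solution does not report.

```python
from math import sqrt
from statistics import NormalDist

# Values given in Example 1
n, xbar, mu0, alpha = 16, 40, 45, 0.01
sigma = sqrt(256)  # population standard deviation from the known variance

z = (xbar - mu0) / (sigma / sqrt(n))  # test statistic
crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed p-value
reject = abs(z) >= crit

print(f"z = {z:.2f}, critical value = {crit:.3f}, p-value = {p_value:.4f}")
print("reject H0" if reject else "cannot reject H0")
```

The statistic comes out as -1.25 and the two-tailed p-value is about 0.21, well above 0.01, so both approaches lead to the same decision of not rejecting H0.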
Exercise: One sample mean
A random sample of 10 boys had the following mathematics scores out of a full mark of 150. Scores: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100.
Do these data support the assumption of a population mean score of 100? Test the hypothesis at 0.05 level of significance.
A random sample of 100 recorded deaths in a certain hospital during the past year showed an average life span of 71.8 years with a standard deviation of 8.9 years. Does this seem to indicate that the average life span today is less than 75 years? Test the hypothesis at 0.05 level of significance.
A random sample of size 25 from a normal population with variance 256 has mean 48. Test the hypothesis that \( \mu=45\) against \( \mu < 45\) at the 0.01 level of significance.
Suppose 100 tires made by a certain manufacturer lasted on average 21819 miles with a standard deviation of 1295 miles. Test the null hypothesis \( \mu = 22000\) miles against the alternative hypothesis \( \mu < 22000\) miles at the 0.05 level of significance.
It is known from experience that the standard deviation of the weight of 8-gm packages of cookies made by a certain bakery company is 0.16 gm. To check whether production is under control on a given day, employees selected a random sample of 25 packages and found that their mean weight is 8.091 gm. Test the null hypothesis at the 0.01 level of significance.
The specification for a certain kind of food package is 185 pounds. If 5 randomly selected pieces have weights 190 pounds, test the null hypothesis at the 0.05 level of significance.
A manufacturer of sports equipment has developed a new synthetic fishing line that they claim has a mean breaking strength of 8 kg with a standard deviation of 0.5 kg. Test the hypothesis \( \mu = 8\) against \( \mu > 8\) if a random sample of 50 lines is tested and found to have a mean breaking strength of 8.2 kg. Use the 0.01 level of significance.
It is known that the average student expenses per month in KTM are 10000 with SD 2000. A random sample of 15 students has mean expenses of 12000. Test the hypothesis at the 5% level that student expenses in KTM have increased.
A sample of 900 hair clips has a mean of 3.4 cm and a standard deviation of 2.61 cm. Is the sample from a large population with mean 3.25 cm? Test the hypothesis at the 95% confidence level.
The mean weekly sales of a soap in departmental stores was 146.3 bars per store. After an advertising campaign, the mean weekly sales in 22 stores for a typical week increased to 153.7 bars, with a standard deviation of 17.2. Was the advertising campaign successful? Test the hypothesis at the 0.05 level of significance.
A simple random sample of 10 people from a certain population has a mean age of 27. Can we conclude that the mean age of the population is less than 30? The population variance is known to be 20. Use α = 0.05.
A manufacturer claims that the mean breaking strength of safety belts for air passengers produced in its factory is 1275 kg. A sample of 100 belts was tested, and the mean breaking strength and SD were found to be 1258 kg and 90 kg respectively. Test the claim at the 5% level of significance.
The guaranteed average life of a certain kind of electric bulb is 1000hrs with SD 125 hrs. Test this assumption if a random sample of 50 bulbs has average burning 1100hrs at 1% level of significance.
A teacher measured length of 25 pieces of desk that were in a classroom. The resulting data were (in cm): 170, 167, 174, 179, 179, 183, 179, 174, 179, 170,
156, 163, 156, 187, 156, 156, 187, 179, 183, 174,
187, 167, 159 ,170, 179
Test the hypothesis that the average length of the desks is greater than 170 at the 0.05 level of significance.
A sample of 25 girls at M. Ed has a mean age of 25 years and a standard deviation of 3 years. Does the sample show that the girls are from a large population with mean 23 years? Test the hypothesis at the 95% confidence level.
An IQ test was given to a large group of M. Ed students, who scored an average of 62.5 marks with SD 10. The same test was given to 100 fresh M. Ed students, who scored an average of 64.5 with SD 12.5. Can we conclude at the 5% level that the fresh students have better IQ?
In a secondary level school examination in mathematics, the mean grade of 32 boys was 72 with sd of 8, while the mean grade of 3 girls was 75 with sd of 6. Test the hypothesis at 0.01 level of significance that the girls are better in mathematics than the boys
The mayor of Kathmandu city claimed that the average income of families living in Kathmandu is at least Rs 300000 a year. A random sample of 100 families selected from Kathmandu produced a mean of Rs 288000 with a standard deviation of Rs 80000. Using the 5% level of significance, can you conclude that the mayor's claim is true?
Exercise: Two sample mean
In a survey of buying habits, 400 women shoppers are chosen at random in supermarket A. Their average weekly food expenditure is Rs 250 with a standard deviation of Rs 40. Another 400 women shoppers are chosen at random in supermarket B. Their average weekly food expenditure is Rs 220 with a standard deviation of Rs 55. Test at 1% level of significance whether the average weekly food expenditures for two population shoppers are equal.
Solution
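Given that \(n_1=400, \bar{X}_1=250, s_1=40, n_2=400, \bar{X}_2=220, s_2=55,\alpha=0.01\)
\(H_0: \mu_1 =\mu_2\)
\(H_1: \mu_1 \ne \mu_2\)
\( \alpha =0.01\)
The population variances are unknown and the sample sizes are large, so we use the z-statistic
\( z=\frac{\bar{X}_1-\bar{X}_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}=\frac{250-220}{\sqrt{\frac{40^2}{400}+\frac{55^2}{400}}}=\frac{30}{\sqrt{11.5625}} \approx 8.82 \)
The critical values are \( \pm Z_{0.005} \approx \pm 2.576 \). Since \( z \approx 8.82 \) lies in the critical region, \(H_0\) is rejected, so we claim that \(\mu_1 \ne \mu_2\)
It shows that the average weekly food expenditures of the two populations of shoppers are NOT equal.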
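The two-sample z-statistic for this problem can be computed with a short Python sketch. It uses only the numbers given in the problem; the variable names are chosen here for illustration, and the large-sample formula with the sample standard deviations in place of the population ones is assumed.

```python
from math import sqrt
from statistics import NormalDist

# Data from the supermarket problem
n1, xbar1, s1 = 400, 250, 40  # supermarket A
n2, xbar2, s2 = 400, 220, 55  # supermarket B
alpha = 0.01

# Standard error of the difference between two independent sample means
se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
z = (xbar1 - xbar2) / se
crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject = abs(z) >= crit

print(f"z = {z:.2f}, critical value = {crit:.3f}")
print("reject H0" if reject else "cannot reject H0")
```

The statistic is about 8.82, far beyond the critical value of about 2.576, so H0 is rejected at the 1% level.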
The means of two samples of 100 and 200 items are 170 and 169 respectively. Can we conclude that the samples are drawn from the same population with SD 10? Use the 5% level of significance.
[z=0.81, p=0.41]
The mean and standard deviation of a sample of size 16 are 250 and 40 respectively. Those of another sample of size 24 are 220 and 55. Test at 1% level of significance whether the means of the two populations from which the samples have been drawn are equal.
[t=1.995, p=0.053]
A sample of 10 bulbs of brand A gave a mean lifetime of 1200 h with SD 70 h. Another sample of 12 bulbs of brand B gave a mean lifetime of 1150 h with SD 85 h. Can we conclude at the 5% level that brand A bulbs are superior?
Two samples drawn from two different populations gave the following results.
Sample 1: size 100, mean 582, SD 24
Sample 2: size 100, mean 540, SD 28
Test the hypothesis at 1% if the mean difference is 35?
The gains in weight (in kg) of pigs fed on two diets A and B are given below. Test whether the two diets differ significantly in their effect on weight gain at the 0.05 level of significance.
Diet A: 25, 32, 30, 34, 24, 14, 32, 24, 30, 31, 35, 25
Diet B: 44, 34, 22, 10, 47, 31, 40, 30, 32, 35, 18, 21, 35, 29, 22
The heights (in inches) of 6 randomly chosen boy students are 63, 65, 68, 69, 71, 72
Those of 10 randomly chosen girl students are 61, 62, 65, 66, 69, 69, 70, 71, 72, 73
Discuss whether the data suggest that boy students are, on average, taller than girl students. Use the 0.05 level of significance.
Test the hypothesis that, in general, smokers have greater lung damage than non-smokers, based on the following sample data, which come from independent samples of normally distributed populations with equal variances.
Smokers: size \(n_1=16\), mean \(\bar{X}_1=17.5\), variance \(s_1^2=4.47\)
Non-smokers: size \(n_2=9\), mean \(\bar{X}_2=12.4\), variance \(s_2^2=4.84\)
It is claimed that the resistance of electric wire can be reduced by at least 0.05 ohm by alloying. To test the claim, 25 values obtained for each of the alloyed wire and the standard wire produced the following results.
Standard wire: mean resistance 0.136 ohm, standard deviation 0.002 ohm
Alloyed wire: mean resistance 0.083 ohm, standard deviation 0.003 ohm
Test at the 5% level whether the claim is sustained.
A mathematics test was given to two groups consisting of 40 and 50 students. In the first group, the mean mark was 74 with a standard deviation of 8. In the second group, the mean mark was 78 with a standard deviation of 7. Is there a significant difference between the performances of the two groups at the 0.05 level of significance?
In an experiment to compare two types of animal food A and B, the gains in weight (in kg) are given below.
Diet A: 49, 53, 51, 52, 47, 50, 52, 53
Diet B: 52, 55, 52, 53, 50, 54, 54, 53
Assuming that two samples of animals are independent, can we conclude that food B is better than food A? Test the hypothesis at 0.05 level of significance.
We want to compare two kinds of electric bulbs based on the following burning times:
Brand A: 14.9, 11.3, 13.2, 16.6, 17.0, 14.1, 15.4, 13.0, 16.9
Brand B: 15.2, 19.8, 14.7, 18.3, 16.2, 21.2, 18.9, 12.2, 15.3, 19.4
Test at 0.05 level of significance whether the average burning time of Brand A is less than that of Brand B.
A psychological study was conducted to compare the reaction times of men and women to a certain stimulus. Independent random samples of 50 men and 50 women were employed in an experiment. The mean reaction time for men was 3.6 seconds with a variance of 0.18, while the mean reaction time for women was 3.8 seconds with a variance of 0.14. Is there a significant difference between the mean reaction times of men and women at the 0.05 level of significance?
The following are the mileages per gallon obtained from two kinds of gasoline.
Gasoline A: 17, 17.8, 15.2, 16.8, 18.4, 16.2, 18.3, 18.1, 14.3
Gasoline B: 18.6, 18.8, 17.1, 19.5, 17.6, 19, 15.7, 19.8, 17.5, 18
Test at 0.05 level of significance that the average mileage of gasoline A is less than that of the gasoline B.
Two groups, each made up of 16 students, were matched on IQ for teaching mathematics. The discovery method was used in the experimental group and the conventional method was used in the control group. At the end of the semester, the same test was given to both groups. The results were as follows:
Experimental group: \(s_1^2=64, n_1=16, \bar{X}_1=46\)
Control group: \(s_2^2=49, n_2=16, \bar{X}_2=42\)
Is there a significant mean difference between the mean achievements of two groups at 0.05 level of significance?
An educator wishes to determine whether any significant gain in knowledge of mathematics has occurred for a group of 100 pupils following a special summer outdoor programme. He administers a pre-test and a post-test covering the subject matter before and after the summer programme. The results were as follows:
Post-test: \(s_1^2=64, n_1=100, \bar{X}_1=46\)
Pre-test: \(s_2^2=49, n_2=100, \bar{X}_2=42\)
Is there a significant mean difference between post-test and pre-test result at 0.01 level of significance?
Samples of two types of electric light bulbs were tested for length of life and the following data were obtained. Is the difference in means sufficient to warrant the conclusion that type I is superior to type II regarding length of life? Test the hypothesis at the 0.05 level of significance.