Wednesday, July 5

Statistics for Economists (804) - Spring 2023 - Assignment 1

Q.1 (a) Differentiate between induction and deduction. Why do we use a sample instead of the population?

(b) Explain the term deviation (measures of spread in data) and explain the following terms:

(i) Mean Absolute Deviation (ii) Mean Squared Deviation (iii) Variance (iv) Standard Deviation

(a) Difference between induction and deduction:

Induction and deduction are two distinct logical reasoning methods used in statistics and other fields of study.

Induction refers to the process of drawing general conclusions or making predictions based on specific observations or examples. It involves reasoning from specific instances to a general principle or hypothesis. Inductive reasoning is characterized by moving from specific observations to broader generalizations. It is often used in empirical research to derive hypotheses from observed patterns or trends in the data.

On the other hand, deduction involves the process of deriving specific conclusions from general principles or assumptions. It starts with a general premise or theory and uses logical reasoning to arrive at specific conclusions. Deductive reasoning is characterized by moving from general principles to specific instances. It is commonly used to test hypotheses and make predictions based on established theories or assumptions.

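The choice between induction and deduction depends on the nature of the research question and the available data. Inductive reasoning is employed when the goal is to generate new hypotheses or theories from observed patterns. It is useful in exploratory research or when there is limited prior knowledge of the subject. Deductive reasoning is employed when the goal is to test existing theories or hypotheses by deriving specific predictions that can be checked empirically.

Why a sample instead of the population: statistical inference is itself inductive, since it reasons from a sample to the population the sample was drawn from. A sample is used because observing every member of a population is usually too costly or time-consuming, is sometimes impossible (the population may be very large or effectively infinite), and is self-defeating when the measurement destroys the item being tested. A properly drawn random sample gives estimates of the population values together with a measure of their sampling error.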

(b) Measures of spread and their explanations:

(i) Mean Absolute Deviation (MAD): The mean absolute deviation is a measure of spread that quantifies the average distance between each data point and the mean of the dataset. It is calculated by taking the absolute difference between each data point and the mean, summing these differences, and dividing by the total number of data points: MAD = ∑|x - mean| / n. MAD shows how dispersed the data points are around the mean, and it is less sensitive to extreme values than measures based on squared deviations, such as the variance.

(ii) Mean Squared Deviation (MSD): The mean squared deviation is another measure of spread that quantifies the average squared distance between each data point and the mean of the dataset. It is calculated by squaring the difference between each data point and the mean, summing these squared differences, and dividing by the total number of data points. MSD gives more weight to extreme values compared to MAD since the differences are squared.

(iii) Variance: Variance is the average of the squared deviations from the mean, so the population variance (σ²) is the same quantity as the mean squared deviation. When the data are a sample, the sum of squared deviations is divided by n - 1 rather than n, which gives the sample variance (s²), an unbiased estimate of σ². Because the deviations are squared, variance is expressed in squared units of the original data. It is widely used in statistical analysis.

(iv) Standard Deviation: The standard deviation is the square root of the variance and is often preferred as a measure of spread due to its interpretability. It is the root-mean-square deviation from the mean, a typical distance between a data point and the mean. It is not the same as MAD: because the deviations are squared before averaging, the standard deviation is at least as large as MAD and gives more weight to extreme values. Standard deviation is widely used in statistical analysis because it measures the spread of the data in the same units as the original data.

Overall, these measures of spread help to understand the variability or dispersion of the data points in a dataset, providing insights into the overall spread or concentration of the values.
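The four measures can be compared side by side on a small dataset. The following is a minimal sketch in Python; the function name spread_measures and the example data are illustrative assumptions, not part of the assignment.

```python
import math

def spread_measures(data):
    """Return the measures of spread described above for a list of numbers."""
    n = len(data)
    mean = sum(data) / n
    deviations = [x - mean for x in data]
    mad = sum(abs(d) for d in deviations) / n          # mean absolute deviation
    msd = sum(d ** 2 for d in deviations) / n          # mean squared deviation
    sample_variance = sum(d ** 2 for d in deviations) / (n - 1)
    return {
        "mean": mean,
        "mad": mad,
        "msd": msd,                                   # equals the population variance
        "sample_variance": sample_variance,
        "population_sd": math.sqrt(msd),
        "sample_sd": math.sqrt(sample_variance),
    }

# Hypothetical data, chosen so the arithmetic is easy to follow by hand.
example = [2, 4, 4, 4, 5, 5, 7, 9]
result = spread_measures(example)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

For this data the standard deviation comes out larger than MAD, because squaring gives the two extreme points (2 and 9) more weight.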

 

Q.2 (i) Two cards are drawn from an ordinary deck. What is the probability that:

(a) They are both aces, given red (b) A black 6 and a red ace

(c) They are both red jacks (d) A red 5 and a black queen

(ii) In a family of 8 children, what is the chance of:

(a) At least 1 boy? (b) At least 6 boys, given at least one girl?

(c) At most 2 boys? (d) At least 3 boys, given the youngest is a boy?

To calculate the probabilities, let's break down each scenario:

(a) They are both aces given red:

The condition "given red" is read here as "given that both cards are red". In an ordinary deck of 52 cards, 26 are red, and 2 of the 4 aces are red (hearts and diamonds). Restricting attention to the red cards and drawing without replacement, the probability that the first card is a red ace is 2/26, and the probability that the second card is the remaining red ace is 1/25. Multiplying: (2/26) * (1/25) = 1/325. Equivalently, P(both red aces) / P(both red) = (1/1326) / (325/1326) = 1/325. Without the condition, the probability of drawing both red aces is (2/52) * (1/51) = 1/1326.

(b) A black 6 and a red ace:

An ordinary deck has two black 6s (spades and clubs) and two red aces (hearts and diamonds). The two cards can be drawn in either order. The probability of a black 6 first and a red ace second is (2/52) * (2/51) = 4/2652, and the probability of a red ace first and a black 6 second is the same. Adding the two orders: 8/2652 = 2/663, or about 0.0030.

(c) They are both red jacks:

In an ordinary deck of cards, there are 52 cards. There are two red jacks in the deck. When drawing two cards without replacement, the probability of drawing a red jack on the first draw is 2/52. After removing one red jack from the deck, the probability of drawing the second red jack is 1/51 (since there is now one less card in the deck). To calculate the probability of both events happening, we multiply the probabilities: (2/52) * (1/51) = 1/1326.

(d) A red 5 and a black queen:

An ordinary deck has two red 5s (hearts and diamonds) and two black queens (spades and clubs). The two cards can be drawn in either order. The probability of a red 5 first and a black queen second is (2/52) * (2/51) = 4/2652, and the probability of a black queen first and a red 5 second is the same. Adding the two orders: 8/2652 = 2/663, or about 0.0030.
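Card probabilities of this kind can be checked by brute force, since there are only 1326 unordered two-card hands. The sketch below enumerates them in Python with exact fractions. The helper names (is_card, both, one_of_each, probability) are illustrative, and part (a) is read as "both aces, given that both cards are red", which is an interpretation of the question's wording.

```python
from fractions import Fraction
from itertools import combinations

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
RED_SUITS = ("hearts", "diamonds")
BLACK_SUITS = ("spades", "clubs")
DECK = [(rank, suit) for rank in RANKS for suit in RED_SUITS + BLACK_SUITS]

def is_card(rank, suits):
    """Predicate: the card has this rank and one of these suits."""
    return lambda card: card[0] == rank and card[1] in suits

def both(pred):
    """Event: both cards of the hand satisfy pred."""
    return lambda hand: pred(hand[0]) and pred(hand[1])

def one_of_each(pred_a, pred_b):
    """Event: one card satisfies pred_a and the other pred_b, in either order."""
    return lambda hand: (pred_a(hand[0]) and pred_b(hand[1])) or (
        pred_a(hand[1]) and pred_b(hand[0])
    )

def probability(event, given=None):
    """Exact probability of event over all 2-card hands, optionally conditional."""
    hands = list(combinations(DECK, 2))
    if given is not None:
        hands = [h for h in hands if given(h)]
    return Fraction(sum(1 for h in hands if event(h)), len(hands))

def is_red(card):
    return card[1] in RED_SUITS

def is_ace(card):
    return card[0] == "A"

p_a = probability(both(is_ace), given=both(is_red))
p_b = probability(one_of_each(is_card("6", BLACK_SUITS), is_card("A", RED_SUITS)))
p_c = probability(both(is_card("J", RED_SUITS)))
p_d = probability(one_of_each(is_card("5", RED_SUITS), is_card("Q", BLACK_SUITS)))
print(p_a, p_b, p_c, p_d)
```

Working with unordered hands avoids having to remember to count both draw orders in parts (b) and (d).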

(ii) In a family of 8 children:

(a) At least 1 boy:

To calculate the probability of having at least 1 boy, we need to consider the complementary probability of having all girls. The probability of having a girl is 1/2, so the probability of having all girls is (1/2)^8 = 1/256. Therefore, the probability of having at least 1 boy is 1 - 1/256 = 255/256.

(b) At least 6 boys, given at least one girl:

This is a conditional probability: P(at least 6 boys | at least one girl) = P(at least 6 boys and at least one girl) / P(at least one girl). With at least one girl among 8 children, "at least 6 boys" means exactly 6 or exactly 7 boys. The probability of exactly 6 boys is (8 choose 6) * (1/2)^8 = 28/256, and the probability of exactly 7 boys is (8 choose 7) * (1/2)^8 = 8/256, so the numerator is 36/256. The probability of at least one girl is 1 - (1/2)^8 = 255/256. Dividing: (36/256) / (255/256) = 36/255 = 12/85, or about 0.141.

(c) At most 2 boys:

To calculate the probability of having at most 2 boys, we need to consider all possible combinations of boys and girls with 0, 1, or 2 boys. The probability of having all girls is (1/2)^8 = 1/256. The probability of having 1 boy and 7 girls is 8 * (1/2)^8 = 8/256. The probability of having 2 boys and 6 girls is (8 choose 2) * (1/2)^8 = 28/256. Adding these probabilities together, we get 1/256 + 8/256 + 28/256 = 37/256.

(d) At least 3 boys, given the youngest is a boy:

Given that the youngest child is a boy, one boy is already accounted for, so at least 3 boys in total requires at least 2 boys among the remaining 7 children. Using the complement and the binomial formula, P(at least 2 boys out of 7) = 1 - P(0 boys) - P(1 boy) = 1 - (1/2)^7 - 7 * (1/2)^7 = 1 - 8/128 = 120/128 = 15/16.
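The four family probabilities can be computed directly from binomial counts. The following is a minimal Python sketch, assuming each child is independently a boy or a girl with probability 1/2; the function names p_boys and p_at_least are illustrative.

```python
from fractions import Fraction
from math import comb

def p_boys(k, n):
    """P(exactly k boys among n children), each child a boy with probability 1/2."""
    return Fraction(comb(n, k), 2 ** n)

def p_at_least(k, n):
    """P(at least k boys among n children)."""
    return sum(p_boys(j, n) for j in range(k, n + 1))

# (a) At least 1 boy among 8.
p_a = p_at_least(1, 8)

# (b) At least 6 boys given at least one girl:
#     numerator = exactly 6 or 7 boys, denominator = not all 8 are boys.
p_b = (p_boys(6, 8) + p_boys(7, 8)) / (1 - p_boys(8, 8))

# (c) At most 2 boys among 8.
p_c = sum(p_boys(j, 8) for j in range(3))

# (d) At least 3 boys given the youngest is a boy:
#     at least 2 boys are needed among the other 7 children.
p_d = p_at_least(2, 7)

print(p_a, p_b, p_c, p_d)
```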

Overall, the probabilities depend on the assumptions made and the specific context of the problem. The calculations provided here are based on the assumptions of an ordinary deck of cards and an equal probability of having a boy or a girl in the family.

 

Q.3 Describe the joint probability function with the help of a suitable example. Suppose X and Y have the following joint distribution:

 

           y = 0     y = 1     y = 2
x = 0       .7        .4        .6
x = 1       .1        .2        .3

 

(a) Find p(x) and p(y) then by verifying that p(x) * p(y) = p(x, y) confirm that X and Y are independent.

(b) What is σXY ?            

To describe the joint probability function, let's consider the example with random variables X and Y and their joint distribution as given:

           y = 0     y = 1     y = 2
x = 0       .7        .4        .6
x = 1       .1        .2        .3

(a) Finding p(x) and p(y) and verifying independence:

To find the marginal probabilities p(x) and p(y), we sum the joint probabilities across the corresponding row or column.

For p(x):

p(0) = .7 + .4 + .6 = 1.7

p(1) = .1 + .2 + .3 = 0.6

For p(y):

p(0) = .7 + .1 = 0.8

p(1) = .4 + .2 = 0.6

p(2) = .6 + .3 = 0.9

These sums cannot be valid marginal probabilities: p(x = 0) = 1.7 exceeds 1, and the six entries of the table total 2.3 instead of 1. The table as transcribed here is therefore not a valid joint distribution, and it presumably contains a transcription error.

X and Y are independent only if p(x) * p(y) = p(x, y) holds for every pair of values of x and y.

For x = 0 and y = 0:

p(x) * p(y) = p(0) * p(0) = 1.7 * 0.8 = 1.36

p(x, y) = 0.7

Since 1.36 is not equal to 0.7, the condition fails for this cell, so independence cannot be confirmed from the values as printed. With a correctly specified table, the same check is repeated for all six cells, and X and Y are independent only if it holds in every one of them.

(b) Calculating σXY (Covariance):

The covariance (σXY) measures the extent to which X and Y vary together. It is the probability-weighted sum of the products of the deviations of X and Y from their means:

σXY = ∑(x - μX)(y - μY)p(x, y)

The means are computed from the marginal distributions: μX = ∑ x p(x) and μY = ∑ y p(y).

If X and Y are independent, as part (a) asks us to confirm, then p(x, y) = p(x) * p(y) in every cell, and the double sum factors into [∑(x - μX)p(x)] * [∑(y - μY)p(y)]. Each factor is zero, because deviations from a mean always average to zero, so σXY = 0.

With the entries as printed in the table, the formula does not give a meaningful covariance, because the entries total 2.3 and are therefore not valid probabilities.

Therefore, under the independence stated in part (a), the covariance is σXY = 0.
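The marginal, independence, and covariance calculations can be sketched in Python as follows. The function names are illustrative, and independent_table is a hypothetical valid joint distribution built from assumed marginals, not the table from the question. The printed entries are included only to show that they do not sum to 1.

```python
from fractions import Fraction

def total(joint):
    """Sum of all cell probabilities; must equal 1 for a valid distribution."""
    return sum(joint.values())

def marginals(joint):
    """Return (p(x), p(y)) as dicts, summing the joint table over the other variable."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return px, py

def is_independent(joint):
    """True if p(x, y) = p(x) * p(y) in every cell of the table."""
    px, py = marginals(joint)
    return all(p == px[x] * py[y] for (x, y), p in joint.items())

def covariance(joint):
    """sigma_XY = sum over cells of (x - mu_X)(y - mu_Y) p(x, y)."""
    px, py = marginals(joint)
    mu_x = sum(x * p for x, p in px.items())
    mu_y = sum(y * p for y, p in py.items())
    return sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Hypothetical marginals (assumed for illustration) and the independent table they imply.
p_x = {0: Fraction(7, 10), 1: Fraction(3, 10)}
p_y = {0: Fraction(1, 5), 1: Fraction(1, 2), 2: Fraction(3, 10)}
independent_table = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}

# The entries as printed in the question, keyed by (x, y).
printed_table = {
    (0, 0): Fraction(7, 10), (0, 1): Fraction(4, 10), (0, 2): Fraction(6, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(2, 10), (1, 2): Fraction(3, 10),
}

print("printed total:", total(printed_table))
print("hypothetical total:", total(independent_table))
print("independent:", is_independent(independent_table))
print("covariance:", covariance(independent_table))
```

Exact fractions are used so that the independence check compares products cell by cell without floating-point rounding.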

Q.4 (a) Describe the central limit theorem with an example.

(b) The weights of packages filled by a machine are normally distributed about a mean of 25 ounces, with a standard deviation of one ounce. What is the probability that n packages from the machine will have an average weight of less than 24 ounces if n = 1, 4, 16, 64?

(a) The Central Limit Theorem (CLT) states that when a large number of independent, identically distributed random variables with finite variance are summed or averaged, the distribution of the sum or average is approximately normal, whatever the shape of the original distribution. The approximation improves as the sample size increases, and it holds even if the original variables themselves are not normally distributed.

The CLT has three main principles:

1. Independence: The random variables being averaged or summed should be independent of each other.

2. Sample Size: As the sample size increases, the distribution of the sample mean or sum approaches a normal distribution.

3. Finite Variance: The original random variables should have finite variance.

Example of the Central Limit Theorem:

Let's consider an example of rolling a fair six-sided die repeatedly and calculating the mean of each set of rolls. Each roll of the die is an independent random variable with a discrete uniform distribution. As we increase the number of rolls, the distribution of the sample means approaches a normal distribution.

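Suppose each set consists of n = 10 rolls, and we calculate the mean of each set. Repeating this process many times and recording the means lets us see the sampling distribution of the mean. It is centred on 3.5, the mean of a single roll, with a standard deviation of sqrt(35/12) / sqrt(10), about 0.54, and it is already close to bell-shaped even though the distribution of a single roll (uniform) is not normal. More repetitions only trace out this distribution more clearly; it is increasing the number of rolls per set, n, that brings the distribution of the mean closer to normal.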
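The die example can be simulated in a few lines. The following is a rough Python sketch: the function name sample_means, the seed, and the number of repetitions are arbitrary choices made for illustration.

```python
import math
import random
import statistics

def sample_means(rolls_per_set, repetitions, seed=0):
    """Simulate `repetitions` sets of fair die rolls and return the mean of each set."""
    rng = random.Random(seed)  # local generator so the run is reproducible
    means = []
    for _ in range(repetitions):
        rolls = [rng.randint(1, 6) for _ in range(rolls_per_set)]
        means.append(sum(rolls) / rolls_per_set)
    return means

n = 10
means = sample_means(rolls_per_set=n, repetitions=20000)

# Theory: a single roll has mean 3.5 and variance 35/12,
# so the mean of n rolls has standard deviation sqrt(35/12) / sqrt(n).
theoretical_sd = math.sqrt(35 / 12) / math.sqrt(n)

print("mean of sample means:", round(statistics.fmean(means), 3))
print("sd of sample means:  ", round(statistics.stdev(means), 3))
print("theoretical sd:      ", round(theoretical_sd, 3))
```

Plotting a histogram of means for n = 1, 2, 10 and 30 would show the shape moving from flat to bell-shaped as n grows.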

(b) Probability calculation for average weight:

Given that the weights of packages filled by a machine are normally distributed with a mean of 25 ounces and a standard deviation of 1 ounce, we can use the properties of the normal distribution to calculate the probabilities.

To calculate the probability that n packages from the machine will have an average weight of less than 24 ounces for different values of n, we can use the concept of the sampling distribution of the mean.

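The mean of the sampling distribution of the mean (μx̄) is equal to the population mean, which is 25 ounces in this case. The standard deviation of the sampling distribution of the mean (σx̄) is equal to the population standard deviation divided by the square root of the sample size, which is 1/sqrt(n) ounces. Because the individual weights are normally distributed, the sample mean is exactly normally distributed for every n, so the results below do not rely on the central limit theorem.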

For n = 1:

We have a single package, and the average weight of that package will be equal to the weight of that package. The probability that the weight of the package is less than 24 ounces can be calculated using the standard normal distribution (Z-score). We can convert the value 24 to a Z-score using the formula: Z = (x - μ) / σ, where x is the value, μ is the mean, and σ is the standard deviation.

Z = (24 - 25) / (1 / sqrt(1))

Z = -1

Looking up the Z-score of -1 in the standard normal distribution table, we find that the probability of having a Z-score less than -1 is approximately 0.1587.

For n = 4, 16, 64:

In these cases, we are considering the average weight of multiple packages. The probability that the average weight of the n packages is less than 24 ounces can also be calculated using the standard normal distribution.

Z = (24 - 25) / (1 / sqrt(n))

For n = 4:

Z = -2

The probability of having a Z-score less than -2 is approximately 0.0228.

For n = 16:

Z = -4

The probability of having a Z-score less than -4 is approximately 0.00003.

For n = 64:

Z = -8

The probability of having a Z-score less than -8 is about 6 × 10^-16, which is effectively 0.

Therefore, the probability that n packages from the machine will have an average weight of less than 24 ounces is approximately:

- For n = 1: 0.1587

- For n = 4: 0.0228

- For n = 16: 0.00003

- For n = 64: 0

Please note that these probabilities are approximate and have been rounded for simplicity.
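The table lookups can be reproduced with the standard library's statistics.NormalDist. The sketch below is one way to do it; the function name prob_mean_below is illustrative.

```python
import math
from statistics import NormalDist

MU = 25.0     # population mean weight, in ounces
SIGMA = 1.0   # population standard deviation, in ounces

def prob_mean_below(threshold, n):
    """P(average weight of n packages < threshold) for normally distributed weights."""
    sampling_dist = NormalDist(mu=MU, sigma=SIGMA / math.sqrt(n))
    return sampling_dist.cdf(threshold)

probabilities = {n: prob_mean_below(24.0, n) for n in (1, 4, 16, 64)}
for n, p in probabilities.items():
    z = (24.0 - MU) / (SIGMA / math.sqrt(n))
    print(f"n = {n:2d}  Z = {z:5.1f}  P = {p:.3g}")
```

The probability falls sharply with n because the standard deviation of the sample mean shrinks as 1/sqrt(n) while the gap of one ounce below the mean stays fixed.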

 

Q.5 On a certain Tuesday evening, a check was made of five different computer rooms in campus residence units. The number of students using computers in the five units was 100, 160, 340, 270, and 210, respectively.

(a) Find the average number of users per room.

(b) Find the variance of this sample distribution.

To calculate the average number of users per room, you need to sum up the number of users in all the rooms and divide it by the total number of rooms. In this case, there are five rooms.

(a) Average number of users per room:

Total number of users = 100 + 160 + 340 + 270 + 210 = 1080

Number of rooms = 5

Average number of users per room = Total number of users / Number of rooms

                                 = 1080 / 5

                                 = 216

Therefore, the average number of users per room is 216.

To find the variance of this sample distribution, you need to follow these steps:

1. Calculate the mean (average) of the sample distribution. In this case, we have already calculated it as 216.

2. Subtract the mean from each individual value and square the result.

3. Sum up all the squared differences obtained in step 2.

4. Divide the sum obtained in step 3 by the total number of samples minus 1 (in this case, 5 - 1 = 4).

(b) Variance of the sample distribution:

Room 1: (100 - 216)^2 = (-116)^2 = 13456

Room 2: (160 - 216)^2 = (-56)^2 = 3136

Room 3: (340 - 216)^2 = (124)^2 = 15376

Room 4: (270 - 216)^2 = (54)^2 = 2916

Room 5: (210 - 216)^2 = (-6)^2 = 36

Sum of squared differences = 13456 + 3136 + 15376 + 2916 + 36 = 34920

Variance = Sum of squared differences / (Number of samples - 1)

         = 34920 / 4

         = 8730

Therefore, the variance of this sample distribution is 8730, which corresponds to a standard deviation of about 93.4 users.

Please note that these calculations are based on the assumption that the given data represents a sample and not the entire population.
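The arithmetic can be checked with Python's statistics module, where variance uses the n - 1 divisor and pvariance uses n. This is a short sketch; the variable names are illustrative.

```python
import statistics

users = [100, 160, 340, 270, 210]  # students counted in each of the five rooms

mean_users = statistics.mean(users)
squared_deviations = [(x - mean_users) ** 2 for x in users]
sample_variance = statistics.variance(users)        # divides by n - 1
population_variance = statistics.pvariance(users)   # divides by n

print("mean:", mean_users)
print("squared deviations:", squared_deviations)
print("sum of squared deviations:", sum(squared_deviations))
print("sample variance:", sample_variance)
print("population variance:", population_variance)
```

The population variance is shown only for comparison; it would apply if the five rooms were the whole population of interest.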
