Statistics-I (394)
Q. 1: a) Define descriptive and inferential statistics and differentiate between them.
b) Define the following terms:
i) Population and sample ii) Parameter and statistic
iii) Quantitative variable iv) Qualitative variable
### a) Descriptive and Inferential Statistics:
#### Descriptive Statistics:
Descriptive statistics involves the organization, analysis, and presentation of data in order to summarize its main features. It condenses large amounts of data into a few meaningful figures. Common measures in descriptive statistics include measures of central tendency (mean, median, mode), measures of variability (range, variance, standard deviation), and measures of shape (skewness, kurtosis).
For example, if we have a dataset of the ages of a group of people, descriptive statistics would tell us the average age (mean), the most common age (mode), and how spread out the ages are (standard deviation).
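The age example can be sketched with Python's standard `statistics` module. The ten ages below are hypothetical values chosen only for illustration.

```python
import statistics

# Hypothetical ages (in years) of ten people; not taken from the assignment.
ages = [21, 22, 22, 23, 25, 25, 25, 28, 31, 38]

mean_age = statistics.mean(ages)      # arithmetic mean
median_age = statistics.median(ages)  # middle value of the sorted data
mode_age = statistics.mode(ages)      # most frequent value
spread = statistics.pstdev(ages)      # standard deviation, treating the list as the whole group

print(mean_age, median_age, mode_age, round(spread, 2))
```

Here the mean (26) sits above the median and mode (both 25) because the single age of 38 pulls it upward.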
#### Inferential Statistics:
Inferential statistics involves using data from a sample to
make inferences or draw conclusions about a population. It uses probability
theory to make predictions or generalizations about a larger group based on a
smaller subset of that group. Inferential statistics includes hypothesis
testing, regression analysis, and analysis of variance.
Continuing with the age example, inferential statistics
might involve using the ages of a sample of people to make predictions or
inferences about the ages of the entire population from which the sample was
drawn.
**Differentiation between Descriptive and Inferential Statistics:**
- **Purpose:**
  - Descriptive statistics aim to summarize and describe the main features of a dataset.
  - Inferential statistics make inferences and predictions about a population based on a sample of that population.
- **Example:**
  - Descriptive: Calculating the average age of a group of students.
  - Inferential: Using the average age of a sample to make predictions about the average age of all students in a school.
- **Data Representation:**
  - Descriptive statistics use charts, graphs, and summary measures.
  - Inferential statistics involve probability distributions and confidence intervals.
### Definitions:
#### i) Population and Sample:
- **Population:**
  - The population is the entire group that is the subject of the study.
  - Example: All students in a university.
- **Sample:**
  - A sample is a subset of the population.
  - Example: A group of 100 students selected from the entire university.
#### ii) Parameter and Statistic:
- **Parameter:**
  - A parameter is a numerical value that describes a characteristic of a population.
  - Example: The average income of all households in a city.
- **Statistic:**
  - A statistic is a numerical value that describes a characteristic of a sample.
  - Example: The average income of a sample of 100 households in a city.
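The distinction between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch: the population of 10,000 household incomes is generated artificially, and the figures 50,000 and 12,000 are assumptions made for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: incomes of 10,000 households (simulated values).
population = [random.gauss(50_000, 12_000) for _ in range(10_000)]

# A sample of 100 households drawn without replacement from that population.
sample = random.sample(population, 100)

parameter = statistics.mean(population)  # population mean: a parameter
statistic = statistics.mean(sample)      # sample mean: a statistic

print(round(parameter), round(statistic))
```

The parameter is fixed for a given population, while the statistic changes from sample to sample and serves as an estimate of the parameter.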
#### iii) Quantitative Variable:
- A quantitative variable is a type of variable that takes numerical values and represents some kind of count or measurement.
- Example: Height, weight, income.
#### iv) Qualitative Variable:
- A qualitative variable is a type of variable that represents categories or labels.
- Example: Gender, color, marital status.
Q. 2: a) Write down the important points for drawing graphs.
b) What is a frequency distribution? How is it constructed?
c) Give the merits and demerits of arithmetic mean and median. (6+7+7)
### a) Important Points for Drawing Graphs:
Drawing graphs is an essential aspect of data analysis, providing visual representation for better understanding. Here are key points for drawing graphs:
1. **Selecting the Right Type of Graph:**
   - Choose a graph type that suits the data and the message you want to convey. Common types include bar graphs, line graphs, scatter plots, and pie charts.
2. **Labeling Axes:**
   - Clearly label the x-axis and y-axis with appropriate variable names. Include units of measurement when applicable.
3. **Choosing Appropriate Scale:**
   - Select a suitable scale for each axis to ensure that the data fits well within the graph, avoiding crowding or excessive white space.
4. **Title and Legend:**
   - Provide a clear and concise title that summarizes the main point of the graph. Include a legend if the graph includes multiple data series.
5. **Color and Style:**
   - Use colors and styles thoughtfully to enhance clarity. Ensure that colors are distinguishable for those with color vision deficiencies.
6. **Data Accuracy:**
   - Double-check data points to ensure accuracy. Mistakes in data entry can lead to misleading graphs.
7. **Consistency:**
   - Maintain consistency in formatting throughout the graph, such as bar widths or line styles. This aids in clarity and interpretation.
8. **Highlighting Key Points:**
   - Emphasize important data points or trends using annotations, arrows, or other visual cues.
9. **Data Source:**
   - Include a note about the source of the data to establish credibility and transparency.
10. **Audience Consideration:**
    - Consider the audience when designing graphs. Ensure that the graph is understandable to both experts and non-experts.
### b) Frequency Distribution:
A frequency distribution is a table that displays the distribution of a set of data. It shows the number of observations falling into different intervals or categories. The construction involves several steps:
1. **Data Collection:**
   - Gather the raw data that you want to analyze.
2. **Determine the Number of Classes:**
   - Decide on the number of intervals or classes. Too few classes may oversimplify, while too many can obscure patterns.
3. **Calculate the Range:**
   - Find the range of the data (difference between the maximum and minimum values).
4. **Calculate Class Width:**
   - Determine the width of each class interval by dividing the range by the number of classes. Round up to a convenient value so that all data points are included.
5. **Set up the Classes:**
   - Establish the intervals using the class width. The classes should be mutually exclusive and exhaustive, covering the entire range of data.
6. **Tally and Count:**
   - Tally the number of observations falling into each class interval.
7. **Create Frequency Table:**
   - Construct a table with columns for classes and their respective frequencies.
8. **Calculate Cumulative Frequency:**
   - Optionally, add a column for cumulative frequency, which represents the total frequency up to and including a given class.
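The construction steps above can be sketched in Python. The raw scores, the choice of five classes, and all variable names are assumptions made for illustration; they are not part of the assignment.

```python
import math
from itertools import accumulate

# Step 1: hypothetical raw data (integer test scores).
data = [52, 67, 71, 48, 85, 90, 63, 77, 58, 69, 74, 81, 66, 59, 93, 70]

# Step 2: chosen number of classes.
num_classes = 5

# Step 3: range of the data.
low, high = min(data), max(data)
data_range = high - low

# Step 4: class width, rounded up.
width = math.ceil(data_range / num_classes)
# If the range divides evenly, the maximum would fall just outside the
# last class, so widen by one unit to keep the classes exhaustive.
if low + num_classes * width <= high:
    width += 1

# Step 5: mutually exclusive classes such as 48-57, 58-67, ...
labels = [f"{low + k * width}-{low + (k + 1) * width - 1}" for k in range(num_classes)]

# Step 6: tally each observation into its class.
freq = [0] * num_classes
for value in data:
    freq[(value - low) // width] += 1

# Steps 7 and 8: frequency table with cumulative frequencies.
cumulative = list(accumulate(freq))
for label, f, cf in zip(labels, freq, cumulative):
    print(f"{label:>7} {f:>3} {cf:>3}")
```

The last cumulative frequency always equals the number of observations, which is a quick check that no value was missed or counted twice.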
### c) Merits and Demerits of Arithmetic Mean and Median:
#### Arithmetic Mean:
**Merits:**
1. **Sensitive to all Values:**
   - The mean is based on every value in the dataset, so a change in any observation changes the mean.
2. **Balancing Property:**
   - The deviations above the mean exactly offset the deviations below it, so the sum of deviations from the mean is zero.
3. **Useful in Statistical Analysis:**
   - The mean lends itself to further algebraic treatment and is used in many statistical calculations.
**Demerits:**
1. **Affected by Extreme Values:**
   - Outliers or extreme values can significantly impact the mean, making it less representative of the central tendency.
2. **Not Appropriate for Skewed Distributions:**
   - In skewed distributions, the mean is pulled toward the longer tail and may not accurately reflect the central location.
#### Median:
**Merits:**
1. **Not Sensitive to Extreme Values:**
   - The median is not influenced by extreme values or outliers, making it a robust measure of central tendency.
2. **Appropriate for Skewed Distributions:**
   - It is suitable for describing the central tendency in skewed distributions.
3. **Simple to Understand:**
   - The median is easy to understand and calculate, and it can be used with ordinal as well as interval data.
**Demerits:**
1. **Less Sensitive to Small Changes:**
   - The median may not reflect small changes in the dataset, particularly when dealing with a large sample.
2. **Not Utilizing All Data Points:**
   - It does not use all the information in the dataset; it depends only on the middle value(s).
In conclusion, both the mean and median have their merits and demerits. The choice between them depends on the nature of the data and the specific goals of the analysis.
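A short sketch shows how differently the two averages react to one extreme value. The income figures (in thousands) are hypothetical.

```python
import statistics

# Hypothetical incomes in thousands; the value 400 is an added outlier.
incomes = [30, 32, 35, 36, 40]
with_outlier = incomes + [400]

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)  # mean and median of the original data
print(mean_after, median_after)    # the same measures after adding the outlier
```

Adding the outlier moves the mean from 34.6 to 95.5, while the median shifts only from 35 to 35.5.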
Q. 3: a) Define Histogram. Draw a Histogram for the following frequency distribution:
X 32 37 42 47 52 57 62 67
f 3 17 28 47 54 31 14 4
b) Define measures of location. Explain properties of good average.
c) Compute the Mean and mode for the following data: (15+5)
Classes 86-90 91-95 96-100 101-105 106-110 111-115
f 6 4 10 6 3 1
### a) Histogram:
**Definition:**
A histogram is a graphical representation of the distribution of a dataset. It consists of a series of adjacent bars, each drawn over a range of values called a class interval. For classes of equal width, the height of each bar corresponds to the frequency or relative frequency of the values within that interval.
**Drawing a Histogram for the Given Frequency Distribution:**
| X | 32 | 37 | 42 | 47 | 52 | 57 | 62 | 67 |
|------|----|----|----|----|----|----|----|----|
| f | 3 | 17 | 28 | 47 | 54 | 31 | 14 | 4 |
1. **Identify Class Intervals:**
   - The X values are equally spaced class marks (midpoints) with a common difference of 5, so the class boundaries are X ± 2.5: 29.5 to 34.5, 34.5 to 39.5, and so on up to 64.5 to 69.5.
2. **Draw Axes:**
   - Draw horizontal and vertical axes. The horizontal axis represents the class boundaries, and the vertical axis represents frequency.
3. **Draw Bars:**
   - For each class interval, draw a bar with a height corresponding to the frequency of that interval. The bars touch each other because the class boundaries are continuous.
![Histogram](https://i.imgur.com/vFC4WGJ.png)
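As a rough check on the drawing, the class boundaries and bar heights can be generated in Python. This sketch assumes the X values are class marks (midpoints) of equal-width classes, and it prints a text histogram instead of using a plotting library.

```python
# Class marks and frequencies from the question.
marks = [32, 37, 42, 47, 52, 57, 62, 67]
freqs = [3, 17, 28, 47, 54, 31, 14, 4]

# Assumption: equal-width classes, so the width is the gap between marks.
width = marks[1] - marks[0]

# Each class runs from (mark - width/2) to (mark + width/2).
boundaries = [(m - width / 2, m + width / 2) for m in marks]

# One row per class; the number of '#' symbols is the bar height.
for (lower, upper), f in zip(boundaries, freqs):
    print(f"{lower:>5.1f} to {upper:<5.1f} | {'#' * f} {f}")
```

The tallest bar belongs to the class 49.5 to 54.5 (frequency 54), and the bar heights sum to the total frequency of 198.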
### b) Measures of Location and Properties of Good Average:
**Measures of Location:**
Measures of location (averages) are single values that describe the centre of a dataset, that is, the value around which the observations tend to cluster. Common measures include:
1. **Mean (Arithmetic Average):**
   - The sum of all values divided by the number of values.
2. **Median:**
   - The middle value in a dataset when it is arranged in ascending or descending order.
3. **Mode:**
   - The value that occurs most frequently in a dataset.
**Properties of Good Average:**
1. **Uniqueness:**
   - The average should be rigidly defined so that it gives a single, unambiguous value representing the entire dataset.
2. **Sensitivity to Changes:**
   - The average should be based on all observations, so that it reflects shifts in central tendency.
3. **Algebraic Treatment:**
   - The average should be capable of further mathematical treatment. For example, the mean of a combined dataset can be obtained from the means and sizes of its parts.
4. **Non-Bias:**
   - The average should not be systematically too high or too low; it should accurately represent the data.
5. **Ease of Computation:**
   - The average should be easy to compute and understand for practical use.
### c) Mean and Mode Calculation:
Given Data:
| Classes | 86-90 | 91-95 | 96-100 | 101-105 | 106-110 | 111-115 |
|---------|-------|-------|--------|---------|---------|---------|
| f | 6 | 4 | 10 | 6 | 3 | 1 |
**Mean Calculation:**
\[ \text{Mean} = \frac{\sum (f \times \text{Midpoint})}{\sum f} \]
\[ \text{Midpoint} = \frac{\text{Lower Bound} + \text{Upper Bound}}{2} \]
\[ \text{Mean} = \frac{(6 \times 88) + (4 \times 93) + (10 \times 98) + (6 \times 103) + (3 \times 108) + (1 \times 113)}{6+4+10+6+3+1} \]
\[ \text{Mean} = \frac{528 + 372 + 980 + 618 + 324 + 113}{30} \]
\[ \text{Mean} = \frac{2935}{30} \]
\[ \text{Mean} \approx 97.83 \]
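**Mode Calculation:**
The class with the highest frequency is the modal class. Here, the modal class is \(96-100\) with frequency \(f_m = 10\). For grouped data, the mode is located within the modal class by interpolation:
\[ \text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h \]
where \(l = 95.5\) is the lower class boundary of the modal class, \(f_1 = 4\) and \(f_2 = 6\) are the frequencies of the classes before and after it, and \(h = 5\) is the class width.
\[ \text{Mode} = 95.5 + \frac{10 - 4}{(10 - 4) + (10 - 6)} \times 5 = 95.5 + 3 = 98.5 \]
In summary, the mean for the given data is approximately \(97.83\), and the mode is \(98.5\) (modal class \(96-100\)).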
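The grouped-data arithmetic can be cross-checked with a short script. It is a sketch that assumes the usual interpolation formula for the mode within the modal class and class boundaries half a unit beyond the stated limits; the variable names are illustrative.

```python
# Class limits and frequencies from the question.
classes = [(86, 90), (91, 95), (96, 100), (101, 105), (106, 110), (111, 115)]
freqs = [6, 4, 10, 6, 3, 1]

# Mean from class midpoints.
midpoints = [(lower + upper) / 2 for lower, upper in classes]
n = sum(freqs)
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

# Modal class: the class with the highest frequency.
k = freqs.index(max(freqs))
h = classes[1][0] - classes[0][0]    # class width (5)
lower_boundary = classes[k][0] - 0.5  # 95.5 for the class 96-100
f_m = freqs[k]
f_prev = freqs[k - 1] if k > 0 else 0
f_next = freqs[k + 1] if k < len(freqs) - 1 else 0

# Interpolated mode within the modal class.
mode = lower_boundary + (f_m - f_prev) / ((f_m - f_prev) + (f_m - f_next)) * h

print(round(mean, 2), mode)
```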
Q. 4. a) Explain the difference between absolute dispersion and relative dispersion.
b) Compute median and mean deviation from median for the data given below:
X 6 8 10 12 14 16 18 20 22
f 5 10 18 20 22 14 7 3 1
c) What are moments about mean and about an arbitrary value? Give the relation between them. (8+6+6)
### a) Absolute Dispersion vs. Relative Dispersion:
**Absolute Dispersion:**
Absolute dispersion measures the spread or variability of a dataset in its original units. It provides information about how much individual data points differ from the central tendency. Common measures of absolute dispersion include range, mean deviation, variance, and standard deviation.
- **Range:** The difference between the maximum and minimum values.
- **Mean Deviation:** The average of the absolute differences between each data point and the mean (or the median).
- **Variance:** The average of the squared differences between each data point and the mean.
- **Standard Deviation:** The square root of the variance.

**Relative Dispersion:**
Relative dispersion, on the other hand, expresses the spread of data in terms of a ratio or percentage relative to a central value. It allows for comparisons between datasets with different units or scales. The coefficient of variation (CV) is a common measure of relative dispersion.
- **Coefficient of Variation (CV):** The ratio of the standard deviation to the mean, expressed as a percentage.

**Difference:**
- **Focus:**
  - Absolute dispersion focuses on the spread of data in its original units.
  - Relative dispersion focuses on the spread of data relative to a central value, allowing for comparison between datasets.
- **Units:**
  - Absolute dispersion is expressed in the same units as the original data.
  - Relative dispersion is expressed as a ratio or percentage, making it unitless and suitable for comparing datasets with different scales.
- **Use Cases:**
  - Absolute dispersion is useful for understanding the variability in the original data.
  - Relative dispersion is useful when comparing the variability of datasets with different means or scales.
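The contrast can be seen numerically in a small sketch. The heights and weights below are hypothetical, chosen so that both datasets have the same standard deviation but different means.

```python
import statistics

# Hypothetical measurements for five people.
heights_cm = [150, 160, 165, 170, 180]
weights_kg = [50, 60, 65, 70, 80]

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean (unitless)."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

sd_h = statistics.pstdev(heights_cm)  # absolute dispersion, in cm
sd_w = statistics.pstdev(weights_kg)  # absolute dispersion, in kg
cv_h = coefficient_of_variation(heights_cm)  # relative dispersion, in percent
cv_w = coefficient_of_variation(weights_kg)

print(sd_h, sd_w, round(cv_h, 2), round(cv_w, 2))
```

Both standard deviations equal 10 in their own units, yet the coefficient of variation is about 6.06% for heights and 15.38% for weights, so the weights are relatively more variable.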
### b) Median and Mean Deviation from Median:
Given Data:
| X | 6 | 8 | 10 | 12 | 14 | 16 | 18 | 20 | 22 |
|---|---|---|----|----|----|----|----|----|----|
| f | 5 | 10 | 18 | 20 | 22 | 14 | 7 | 3 | 1 |
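**Median Calculation:**
Each value of X occurs f times, so the median must be located using cumulative frequencies (c.f.), not the list of distinct X values.
| X | 6 | 8 | 10 | 12 | 14 | 16 | 18 | 20 | 22 |
|---|---|---|----|----|----|----|----|----|----|
| f | 5 | 10 | 18 | 20 | 22 | 14 | 7 | 3 | 1 |
| c.f. | 5 | 15 | 33 | 53 | 75 | 89 | 96 | 99 | 100 |
- The total frequency is \(N = \sum f = 100\), which is even, so the median is the average of the 50th and 51st ordered observations.
- The cumulative frequency first reaches 50 at \(X = 12\) (c.f. 53, after 33 for \(X = 10\)), so both observations equal 12.
- The median is therefore \(12\).

**Mean Deviation from Median Calculation:**
\[ \text{Mean Deviation from Median} = \frac{\sum f_i |X_i - \text{Median}|}{\sum f_i} \]
\[ = \frac{(5)(6) + (10)(4) + (18)(2) + (20)(0) + (22)(2) + (14)(4) + (7)(6) + (3)(8) + (1)(10)}{100} \]
\[ = \frac{30 + 40 + 36 + 0 + 44 + 56 + 42 + 24 + 10}{100} = \frac{282}{100} \]
\[ \text{Mean Deviation from Median} = 2.82 \]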
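A short script gives an independent check of the median and the mean deviation for this frequency table. It weights every X value by its frequency; the helper name `value_at` is an illustrative choice.

```python
# Values and frequencies from the question.
x = [6, 8, 10, 12, 14, 16, 18, 20, 22]
f = [5, 10, 18, 20, 22, 14, 7, 3, 1]
n = sum(f)

def value_at(position):
    """Return the value at a 1-based position in the ordered data."""
    running = 0
    for xi, fi in zip(x, f):
        running += fi  # cumulative frequency up to xi
        if position <= running:
            return xi
    raise ValueError("position exceeds the total frequency")

# Median: average of the two middle observations when n is even.
if n % 2 == 0:
    median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2
else:
    median = value_at((n + 1) // 2)

# Mean deviation from the median, weighting each deviation by its frequency.
mean_deviation = sum(fi * abs(xi - median) for xi, fi in zip(x, f)) / n

print(n, median, mean_deviation)
```

With the frequencies taken into account, the script reports a median of 12 and a mean deviation from the median of 2.82.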
### c) Moments about Mean and Arbitrary Value:
**Moments about Mean:**
Moments about the mean (central moments) are obtained by raising the deviation of each data point from the mean to a certain power and averaging. The \(r\)-th moment about the mean is denoted by \(\mu_r\) and is calculated as:
\[ \mu_r = \frac{\sum f_i (X_i - \bar{X})^r}{N} \]
where \(r\) is the order of the moment, \(X_i\) is each data point, \(\bar{X}\) is the mean, \(f_i\) is the frequency of each data point, and \(N = \sum f_i\) is the total frequency. In particular, \(\mu_0 = 1\), \(\mu_1 = 0\), and \(\mu_2\) is the variance.

**Moments about an Arbitrary Value:**
Moments about an arbitrary value are obtained by raising the deviation of each data point from the chosen value to a certain power and averaging. The \(r\)-th moment about an arbitrary value \(a\) is denoted by \(\mu'_r\) and is calculated as:
\[ \mu'_r = \frac{\sum f_i (X_i - a)^r}{N} \]
so that \(\mu'_0 = 1\) and \(\mu'_1 = \bar{X} - a\).

**Relation between Moments about Mean and Arbitrary Value:**
Writing \(X_i - \bar{X} = (X_i - a) - \mu'_1\) and expanding by the binomial theorem gives the moments about the mean in terms of the moments about \(a\):
\[ \mu_r = \mu'_r - \binom{r}{1} \mu'_1 \mu'_{r-1} + \binom{r}{2} (\mu'_1)^2 \mu'_{r-2} - \ldots + (-1)^r (\mu'_1)^r \]
For the first four moments this gives:
\[ \mu_1 = 0 \]
\[ \mu_2 = \mu'_2 - (\mu'_1)^2 \]
\[ \mu_3 = \mu'_3 - 3 \mu'_2 \mu'_1 + 2 (\mu'_1)^3 \]
\[ \mu_4 = \mu'_4 - 4 \mu'_3 \mu'_1 + 6 \mu'_2 (\mu'_1)^2 - 3 (\mu'_1)^4 \]
Conversely, the moments about \(a\) can be recovered from the moments about the mean:
\[ \mu'_r = \mu_r + \binom{r}{1} (\bar{X} - a) \mu_{r-1} + \binom{r}{2} (\bar{X} - a)^2 \mu_{r-2} + \ldots + (\bar{X} - a)^r \]
These relations make it possible to compute moments about a convenient arbitrary value first and then convert them into moments about the mean.
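The binomial relation between the two kinds of moments can be verified numerically. This sketch reuses the frequency table from part (b) and takes 12 as the arbitrary value; both choices are made only for illustration.

```python
from math import comb

# Frequency table reused from part (b) of this question.
x = [6, 8, 10, 12, 14, 16, 18, 20, 22]
f = [5, 10, 18, 20, 22, 14, 7, 3, 1]
n = sum(f)

def moment(r, origin):
    """r-th moment of the frequency distribution about the given origin."""
    return sum(fi * (xi - origin) ** r for xi, fi in zip(x, f)) / n

mean = moment(1, 0)  # the first moment about zero is the mean
a = 12               # arbitrary value, chosen for illustration
d = mean - a         # equals the first moment about a

about_mean = [moment(r, mean) for r in range(5)]  # orders 0 to 4
about_a = [moment(r, a) for r in range(5)]

# Rebuild each moment about a from the moments about the mean by expanding
# (X - a)^r = ((X - mean) + (mean - a))^r with the binomial theorem.
rebuilt = [
    sum(comb(r, j) * d ** j * about_mean[r - j] for j in range(r + 1))
    for r in range(5)
]

# Second-order special case: variance = second moment about a minus d squared.
variance_from_a = about_a[2] - about_a[1] ** 2

print(mean, about_mean[2], variance_from_a)
```

For this data the mean is 12.7, and both routes give a variance of 12.19, so a convenient origin can be used for the arithmetic without affecting the final moments about the mean.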
Q. 5. a) Define weighted and unweighted index numbers and explain why weighted index numbers are preferred over unweighted index numbers.
b) Find chain index numbers (using G.M. to average the relatives) for the following data of prices, taking 1970 as the base year. (8+12)
| Commodities | 1970 | 1971 | 1972 | 1973 | 1974 |
|-------------|------|------|------|------|------|
| A | 40 | 43 | 45 | 42 | 50 |
| B | 160 | 162 | 165 | 161 | 168 |
| C | 20 | 29 | 52 | 23 | 27 |
| D | 240 | 245 | 247 | 250 | 255 |
### a) Weighted and Unweighted Index Numbers:
**Definition:**
1. **Unweighted Index Number:**
   - An unweighted index number does not take into account the relative importance of different items in a group; every item is treated as equally important. It can be computed as a simple average of the price relatives of the individual items or, as shown here, as a simple aggregative index:
   \[ \text{Unweighted Index} = \left( \frac{\text{Sum of Current Year Prices}}{\text{Sum of Base Year Prices}} \right) \times 100 \]
2. **Weighted Index Number:**
   - A weighted index number considers the importance or weight of each item in the group. It reflects the significance of each item in the overall index. The weights are often based on the quantities consumed or the share of each item in total expenditure.
   \[ \text{Weighted Index} = \left( \frac{\sum (W_i \times P_{i, t})}{\sum (W_i \times P_{i, 0})} \right) \times 100 \]
   where \(W_i\) is the weight of the i-th item, \(P_{i, t}\) is the price of the i-th item in the current year, and \(P_{i, 0}\) is the price of the i-th item in the base year.
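The two formulas can be compared on a small example. The prices and weights below are hypothetical and serve only to show how weighting changes the result.

```python
# Hypothetical prices of three items in the base and current year.
base_prices = [10, 20, 50]
current_prices = [12, 22, 75]

# Hypothetical weights (for example, quantities consumed).
weights = [5, 3, 2]

# Simple aggregative (unweighted) index.
unweighted = sum(current_prices) / sum(base_prices) * 100

# Weighted aggregative index.
weighted = (
    sum(w * p for w, p in zip(weights, current_prices))
    / sum(w * p for w, p in zip(weights, base_prices))
    * 100
)

print(round(unweighted, 2), round(weighted, 2))
```

The unweighted index (136.25) is dominated by the expensive third item, whose price rose sharply. Once its low weight is taken into account, the weighted index is lower (about 131.43).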
**Why Weighted Index Numbers are Preferred:**
1. **Reflecting Importance:**
   - Weighted index numbers reflect the relative importance of different items. Items with higher weights have a more significant impact on the overall index.
2. **Accurate Representation:**
   - In many cases, not all items in a group have the same economic significance. Weighted index numbers provide a more accurate representation of the true changes in the overall level.
3. **Dynamic Nature:**
   - Weighted indices can adapt to changes in the structure of the economy or the consumption pattern by adjusting the weights.
4. **Avoiding Misleading Conclusions:**
   - Unweighted indices may provide misleading conclusions, especially when items with different economic importance experience significant price changes.
5. **Policy Decision Support:**
   - Weighted indices are more useful for policymakers as they offer a more nuanced view of price changes, allowing for better-informed decisions.
### b) Chain Index Numbers:
Given Data:
\[ \begin{array}{lccccc}
\text{Commodities} & 1970 & 1971 & 1972 & 1973 & 1974 \\
\hline
A & 40 & 43 & 45 & 42 & 50 \\
B & 160 & 162 & 165 & 161 & 168 \\
C & 20 & 29 & 52 & 23 & 27 \\
D & 240 & 245 & 247 & 250 & 255 \\
\end{array} \]
**Chain Index Numbers Calculation using Geometric Mean to Average the Relatives:**
1. **Calculate Link Relatives:**
   - For each commodity, express each year's price as a percentage of the previous year's price.
   \[ L_{i,t} = \frac{P_{i,t}}{P_{i,t-1}} \times 100 \]
2. **Calculate Geometric Mean (GM):**
   - For each year, average the link relatives of the four commodities using the geometric mean.
   \[ GM_t = \left( \prod_{i=1}^{4} L_{i,t} \right)^{\frac{1}{4}} \]
3. **Calculate Chain Index Numbers:**
   - Chain each year's averaged link relative to the previous year's chain index, starting from 100 in the base year.
   \[ C_{1970} = 100, \qquad C_t = \frac{GM_t \times C_{t-1}}{100} \]

**Results:**
| Year | Link relative A | Link relative B | Link relative C | Link relative D | G.M. of link relatives | Chain index |
|------|-----------------|-----------------|-----------------|-----------------|------------------------|-------------|
| 1970 | 100 | 100 | 100 | 100 | 100 | 100 |
| 1971 | 107.50 | 101.25 | 145.00 | 102.08 | 112.66 | 112.66 |
| 1972 | 104.65 | 101.85 | 179.31 | 100.82 | 117.82 | 132.74 |
| 1973 | 93.33 | 97.58 | 44.23 | 101.21 | 79.91 | 106.07 |
| 1974 | 119.05 | 104.35 | 117.39 | 102.00 | 110.44 | 117.14 |

For example, \( C_{1972} = \frac{117.82 \times 112.66}{100} \approx 132.74 \).
These chain index numbers show the general movement of prices relative to the base year (1970). Because the geometric mean is used to average the relatives, each chain index equals the geometric mean of the fixed-base price relatives for that year, which serves as a check: for 1974, \( (125 \times 105 \times 135 \times 106.25)^{1/4} \approx 117.14 \).
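A sketch of the chain-index computation for the price data in this question is given here. It assumes the usual procedure of forming link relatives (each year's price relative to the previous year's), averaging them across commodities with the geometric mean, and chaining the averages from a base of 100; the variable names are illustrative.

```python
from math import prod

years = [1970, 1971, 1972, 1973, 1974]
prices = {
    "A": [40, 43, 45, 42, 50],
    "B": [160, 162, 165, 161, 168],
    "C": [20, 29, 52, 23, 27],
    "D": [240, 245, 247, 250, 255],
}
n_items = len(prices)

# Geometric mean of the link relatives (as ratios) for each year after the base.
link_gm = []
for t in range(1, len(years)):
    relatives = [p[t] / p[t - 1] for p in prices.values()]
    link_gm.append(prod(relatives) ** (1 / n_items))

# Chain the yearly averages, starting from 100 in the base year.
chain = [100.0]
for g in link_gm:
    chain.append(chain[-1] * g)

for year, index in zip(years, chain):
    print(year, round(index, 2))
```

The script gives chain indices of about 112.66, 132.74, 106.07 and 117.14 for 1971 to 1974. The sharp rise and fall around 1972 is driven almost entirely by commodity C, whose price moves from 29 to 52 and back to 23.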