Introduction to Business Statistics (1350)
Q. 1 a) What do you understand by the term Statistics? Give its chief characteristics.
Statistics is the science of collecting, analyzing, presenting, and
interpreting data. Governmental needs for census data as well as information
about a variety of economic activities provided much of the early impetus for
the field of statistics. Currently the need to turn the large amounts of data
available in many applied fields into useful information has stimulated both theoretical
and practical developments in statistics.
Data are the facts and figures that are collected, analyzed, and
summarized for presentation and interpretation. Data may be classified as
either quantitative or qualitative. Quantitative data measure either how much
or how many of something, and qualitative data provide labels, or names, for
categories of like items. For example, suppose that a particular study is
interested in characteristics such as age, gender, marital status, and annual
income for a sample of 100 individuals. These characteristics would be called
the variables of the study, and data values for each of the variables would be
associated with each individual. Thus, the data values of 28, male, single, and
$30,000 would be recorded for a 28-year-old single male with an annual income
of $30,000. With 100 individuals and 4 variables, the data set would have 100 ×
4 = 400 items. In this example, age and annual income are quantitative
variables; the corresponding data values indicate how many years and how much
money for each individual. Gender and marital status are qualitative variables.
The labels male and female provide the qualitative data for gender, and the
labels single, married, divorced, and widowed indicate marital status.
b) Differentiate between descriptive and inferential statistics.
Definition of Descriptive Statistics
Descriptive Statistics refers to a discipline that quantitatively
describes the important characteristics of the dataset. For the purpose of
describing properties, it uses measures of central tendency, i.e. mean, median,
mode and the measures of dispersion i.e. range, standard deviation, quartile
deviation and variance, etc.
The researcher summarises the data with numerical and graphical tools
such as charts, tables, and graphs, so that the data are represented
accurately. Explanatory text accompanies the diagrams to say what they
represent.
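As a small illustration of the descriptive measures named above, the following sketch computes them with Python's standard `statistics` module. The seven data values are hypothetical and chosen only for the example; the dispersion measures use the population versions (dividing by N), which is an assumption about the intended definition.

```python
import statistics

# Hypothetical data set, used only for illustration.
data = [4, 8, 15, 16, 23, 42, 8]

# Measures of central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion (population versions, dividing by N)
data_range = max(data) - min(data)
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

print("mean:", round(mean, 2))
print("median:", median)
print("mode:", mode)
print("range:", data_range)
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```

For sample data, `statistics.variance` and `statistics.stdev` (dividing by N - 1) could be used instead.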
Definition of Inferential Statistics
Inferential Statistics is all about generalising from the sample to
the population, i.e. the results of the analysis of the sample are extended
to the larger population from which the sample is taken. It is a convenient
way to draw conclusions about the population when it is not possible to query
each and every member of the universe. The sample chosen should be
representative of the entire population; therefore, it should contain the
important features of the population.
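The idea of generalising from a sample can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the population is simulated (10,000 values around 50), the sample size of 200 is arbitrary, and the random seed is fixed only so the run is repeatable.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed, only for repeatability

# Hypothetical population: 10,000 simulated measurements.
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw a simple random sample without replacement.
sample = rng.sample(population, 200)
sample_mean = statistics.mean(sample)

# The standard error indicates how far the sample mean
# typically lies from the population mean.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print("population mean:", round(population_mean, 2))
print("sample mean:", round(sample_mean, 2))
print("standard error:", round(standard_error, 2))
```

In practice the population mean is unknown; it is computed here only to show that the sample mean lands close to it.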
c) Define a variable. Differentiate between a discrete and a continuous variable.
A variable is a measurable quantity whose value changes from one
observation to another. It is of two types: discrete or continuous. A discrete
variable takes only a countable number of distinct values, while a continuous
variable can take any value within a given range.
Definition of Discrete Variable
A discrete variable is a type of statistical variable that can
assume only a finite or countably infinite number of distinct values.
Its values are separate and indivisible: no values can exist in between two
adjacent permitted values, i.e. it does not attain all the values within the
limits of the variable. The number of children in a family is an example: it
can be 0, 1, 2, and so on, but never 1.5. Categorical variables, whose
categories are likewise separate, are often treated alongside discrete
variables. Hence, if you are able to count the set of possible values, then the
variable is said to be discrete.
Definition of Continuous Variable
A continuous variable, as the name suggests, is a random variable that
assumes all the possible values in a continuum. Simply put, it can take any
value within the given range. So, if a variable can take an infinite and
uncountable set of values, then the variable is referred to as a continuous
variable.
A continuous variable is one that is defined over an interval of
values, meaning that it can assume any value in between the minimum and
maximum value. Height, weight, and temperature are typical examples, and the
interval over which each is defined may differ from variable to variable.
Q. 2 a) What do you understand by classification and tabulation? Discuss their importance in a statistical analysis.
Definition of Classification
Classification refers to a process wherein data are arranged, based
on the characteristic under consideration, into classes or groups according to
the resemblance of observations. Classification puts the data in a condensed
form: it removes unnecessary details, which makes the data easier to comprehend.
Data collected for the first time are raw data, arranged in a
haphazard manner that does not provide a clear picture. The
classification of data reduces the large volume of raw data into homogeneous
groups, i.e. data having common characteristics or nature are placed in one
group, and thus the whole data set is divided into a number of groups. There are
four types of classification:
Qualitative Classification or Ordinal Classification
Quantitative Classification
Chronological or Temporal Classification
Geographical or Spatial Classification
Definition of Tabulation
Tabulation refers to a logical data presentation, wherein raw data
are summarized and displayed in a compact form, i.e. in statistical tables. In
other words, it is a systematic arrangement of data in columns and rows that
presents the data in a concise and attractive way. The following guidelines
should be observed for tabulation.
A serial number should be allotted to the table, in addition to a
self-explanatory title.
The statistical table is required to be divided into four parts,
i.e. Box Head, Stub, Caption and Body. The complete upper part of the table
that contains columns and sub-columns, along with the caption, is the Box Head.
The left part of the table, giving the description of rows, is called the Stub.
The part of the table that contains numerical figures and other content is its
Body.
The length and width of the table should be well balanced.
Presentation of data should be such that it takes less time and
labor to make comparisons between various figures.
Footnotes, explaining the source of data or any other thing, are to
be presented at the bottom of the table.
b) Write down the important points for drawing graphs.
Essential Elements of Good Graphs:
1. A title which describes the experiment. Graphs are conventionally titled
Y-AXIS VARIABLE VS. X-AXIS VARIABLE. Note: This almost always means that the
title will be DEPENDENT VARIABLE VS. INDEPENDENT VARIABLE.
2. The graph should fill the space allotted for the graph. The graph should be
as large as the paper will allow. In order to do this, the graph must be
properly scaled. The scale for each axis of the graph should always begin at
zero. Each square on a given axis must represent the same amount. Scale each
axis so as to take up a maximum amount of the space available while still
maintaining divisions which will make plotting the graph as easy as possible.
Increments on an axis of 1, 2, or 5 are easy to use when plotting points. For
larger numbers 10, 20, or 50 or possibly 100, 200, or 500 might work, and so
on. For smaller numbers 0.1, 0.2, or 0.5 might work, or maybe 0.01, 0.02, or
0.05. A good way to choose the scale for an axis is to identify the largest
data point that will be plotted on that axis. Then count the number of squares
on your graph paper that are available for plotting the variable on that axis.
Divide the maximum data value by the number of squares. This will give you the
smallest value that each square could possibly have as its increment. Since the
result of this division will most likely not be a convenient number, you should
then round up to the nearest convenient value. Once you have chosen a scale,
you do not have to label each square with its value. Label enough values on the
axis to make it clear what scale you are using on the axis.
Scaling Example: In a given experiment in which the Current is being measured,
the largest current that was measured was 0.42 A. The graph paper being used
has 25 squares that could be used for the Current axis. The minimum scaling
increment for this graph paper would then be 0.42 A divided by 25 squares, or
0.0168 A/square. Since 0.0168 is not a convenient number to use for the
increment (it would be very difficult to plot such a graph), you should round
the value up to the next higher convenient number. This would be 0.02 in this
case. You do not need to label each and every square. You could, in this
example, label every fifth square as 0.10, 0.20, and so on.
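The scaling rule can be written as a short calculation. The sketch below is one possible way to do it, assuming the "convenient" increments are 1, 2, or 5 times a power of ten; the function name `convenient_increment` is made up for this example.

```python
import math

def convenient_increment(max_value, squares):
    """Round max_value / squares up to the next 1-2-5 style increment."""
    minimum = max_value / squares            # smallest usable increment
    exponent = math.floor(math.log10(minimum))
    # Try 1, 2, 5 at this power of ten, then at the next one.
    for exp in (exponent, exponent + 1):
        for step in (1, 2, 5):
            candidate = step * 10 ** exp
            if candidate >= minimum:
                return candidate

# The scaling example: 0.42 A spread over 25 squares.
increment = convenient_increment(0.42, 25)
print(round(0.42 / 25, 4))   # minimum increment, 0.0168 A per square
print(increment)             # rounded up to 0.02 A per square
```

The function assumes a positive `max_value`; it does not handle values that sit exactly on a rounding boundary with any special tolerance.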
3. Each axis should be labeled with the quantity being measured and the units
of measurement. The independent variable is usually plotted on the x-axis and
the dependent variable is usually plotted on the y-axis.
4. Each data point should be plotted in the proper position. Use a tiny dot to
mark your point, and circle the dot with a point protector so that it doesn’t
get covered by your line of best fit.
5. A line of best fit. This line should show the overall tendency (or trend) of
your data. If the trend is linear, you should draw a straight line which shows
that trend using a straight edge. If the trend is a curve, you should sketch a
curve which is your best guess as to the tendency of the data. This line
(whether straight or curved) does not have to go through all of the data points
and it may, in some cases, not go through any of them. NEVER connect your data
points dot to dot.
6. If you are plotting the graph by hand, you will choose two points for all
linear graphs from which to calculate the slope of the line of best fit. These
points should not be data points unless a data point happens to fall perfectly
on the line of best fit. Pick two points which are directly on your line of
best fit and which are easy to read from the graph. Mark the points you have
chosen with a +.
7. Do not do other work in the space of your graph such as the slope
calculation or other parts of the mathematical analysis.
8. If your graph does not yield a straight line, you will be expected to
manipulate one (or more) of the axes of your graph, replot the manipulated
data, and continue doing this until a straight line results. In general it will
probably not take more than three graphs to yield a straight line.
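When the graph is not drawn by hand, the line of best fit and its slope can be computed instead of judged by eye. The sketch below uses ordinary least squares, which is a swapped-in technique and not the hand method described above; the function name and the five data points are hypothetical.

```python
def best_fit_line(xs, ys):
    """Least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x and of cross-deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical measurements: independent variable xs, dependent variable ys.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = best_fit_line(xs, ys)
print("slope:", round(slope, 2))
print("intercept:", round(intercept, 2))
```

As with the hand-drawn line, the fitted line need not pass through any of the data points.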
Summary--Characteristics of Good Graphs
1. It is plotted on a grid (graph paper).
2. The axes are highlighted (darker than the rest of the grid lines) and are
drawn with a straight edge.
3. Both axes are labeled with the variable name and its units. Note that we do
not label them x or y!
4. The independent variable is plotted on the horizontal (x) axis.
5. The dependent variable is plotted on the vertical (y) axis.
6. The data points have point protectors.
7. A line of best fit is drawn which shows the trend of the data. The line of
best fit may have some points above it, some below it, and some on it. If the
trend of the data is linear, the line of best fit is drawn with a straight
edge. If the trend of the data is curved, a smooth curve should be drawn.
8. The graph is clearly titled using the convention dependent variable vs.
independent variable.
9. The axes are properly scaled so that the graph fits the space, the grids are
consistently scaled, and all of the data fits on the graph. The graph should be
as large as possible on the paper.
10. The slope calculation points are clearly marked with a (+) on the line of
best fit.
c) What is a frequency distribution? How is it constructed? (8+6+6)
The frequency of a value is the number of times it occurs in a
dataset. A frequency distribution is the pattern of frequencies of a variable.
It’s the number of times each possible value of a variable occurs in a dataset.
Types of frequency distributions
There are four types of frequency distributions:
Ungrouped frequency distributions: The number of observations of
each value of a variable.
You can use this type of frequency distribution for categorical
variables.
Grouped frequency distributions: The number of observations of each
class interval of a variable. Class intervals are ordered groupings of a
variable’s values.
You can use this type of frequency distribution for quantitative
variables.
Relative frequency distributions: The proportion of observations of
each value or class interval of a variable.
You can use this type of frequency distribution for any type of
variable when you’re more interested in comparing frequencies than the actual
number of observations.
Cumulative frequency distributions: The sum of the frequencies less
than or equal to each value or class interval of a variable.
You can use this type of frequency distribution for ordinal or
quantitative variables when you want to understand how often observations fall
below certain values.
How to make a frequency table
Frequency distributions are often displayed using frequency tables.
A frequency table is an effective way to summarize or organize a dataset. It’s
usually composed of two columns:
The values or class intervals
Their frequencies
The method for making a frequency table differs between the four
types of frequency distributions. You can build one by hand or use
software such as Excel, SPSS, or R.
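As a rough sketch of how such a table can be built programmatically, the following Python code produces the ungrouped, relative, and cumulative frequencies in one pass. The function name `frequency_table` and the ten scores are hypothetical and serve only as an illustration.

```python
from collections import Counter

def frequency_table(values):
    """Rows of (value, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((value, counts[value], counts[value] / total, cumulative))
    return rows

# Hypothetical raw data: ten scores on a 1-3 scale.
scores = [3, 1, 2, 3, 3, 2, 1, 3, 2, 3]

table = frequency_table(scores)
print("value  f  rel.f  cum.f")
for value, freq, rel, cum in table:
    print(f"{value:>5}  {freq}  {rel:>5.2f}  {cum:>5}")
```

Sorting the values before accumulating is what makes the cumulative column meaningful, so this layout suits ordinal or quantitative data.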
Q. 3 a) Represent the following data by a bar diagram. (8+12)
Classes | 86-90 | 91-95 | 96-100 | 101-105 | 106-110 | 111-115
f | 6 | 4 | 10 | 6 | 3 | 1
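A bar diagram for this table can be sketched in plain text, one bar per class with its length equal to the frequency. This is only an illustrative rendering (bars drawn horizontally with `#` characters); the function name `bar_diagram` is made up for the example, and a hand-drawn or software-drawn chart would normally have vertical bars.

```python
classes = ["86-90", "91-95", "96-100", "101-105", "106-110", "111-115"]
frequencies = [6, 4, 10, 6, 3, 1]

def bar_diagram(labels, counts, symbol="#"):
    """Return one text line per class; bar length equals the frequency."""
    width = max(len(label) for label in labels)
    return [
        f"{label:>{width}} | {symbol * count} {count}"
        for label, count in zip(labels, counts)
    ]

lines = bar_diagram(classes, frequencies)
for line in lines:
    print(line)
```

The tallest bar belongs to the class 96-100, which has the highest frequency (10).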
b) Define Histogram. Draw a histogram for the following frequency distribution:
X | 32 | 37 | 42 | 47 | 52 | 57 | 62 | 67
f | 3 | 17 | 28 | 47 | 54 | 31 | 14 | 4
A histogram is a bar graph-like representation of data that buckets
a range of classes into columns along the horizontal x-axis. The vertical
y-axis represents the number count or percentage of occurrences in the data for
each column. Columns can be used to visualize patterns of data distributions.
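To draw the histogram for the table above, the class boundaries are needed first. The sketch below assumes that the X values are class midpoints with equal spacing, so each class extends half a class width on either side of its midpoint; it then prints a text histogram turned on its side. The function name `class_boundaries` is hypothetical.

```python
midpoints = [32, 37, 42, 47, 52, 57, 62, 67]
frequencies = [3, 17, 28, 47, 54, 31, 14, 4]

def class_boundaries(mids):
    """Lower and upper boundary of each class, assuming equal spacing."""
    width = mids[1] - mids[0]
    return [(m - width / 2, m + width / 2) for m in mids]

boundaries = class_boundaries(midpoints)

# One row per class; each '#' stands for one observation.
for (low, high), freq in zip(boundaries, frequencies):
    print(f"{low:>5} - {high:<5} | {'#' * freq} {freq}")
```

Because each class ends exactly where the next begins (29.5-34.5, 34.5-39.5, and so on), the bars of the histogram touch one another, unlike those of a bar diagram.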
Q. 4 a) What do you understand by weighted mean?
In what circumstances is it preferred to ordinary mean and why?
The weighted mean is a type of mean that is calculated by
multiplying the weight (or probability) associated with a particular event or
outcome with its associated quantitative outcome and then summing all the
products together.
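In some cases, you might want some numbers to carry more weight than
others, for example when observations differ in importance or represent groups
of different sizes. In that case, you’ll want to find the weighted mean. To
find the weighted mean:
Multiply each number in your data set by its weight.
Add the results up.
Divide by the sum of the weights (this step changes nothing when the weights
already add up to 1).
For the numbers 1, 3, 5, 7, and 10 with equal weights (1/5 for each
number), the math to find the weighted mean would be:
1(1/5) + 3(1/5) + 5(1/5) + 7(1/5) + 10(1/5) = 5.2,
which equals the ordinary mean because every number carries the same weight.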
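The calculation can be sketched in Python as follows. The function name `weighted_mean` and the second, unequal set of weights are assumptions added for illustration; dividing by the sum of the weights lets the same function handle weights that do not add up to 1.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

values = [1, 3, 5, 7, 10]

# Equal weights of 1/5 reproduce the ordinary mean.
equal = weighted_mean(values, [1 / 5] * 5)

# Hypothetical unequal weights: the last value counts six times as much.
unequal = weighted_mean(values, [1, 1, 1, 1, 6])

print(round(equal, 2))
print(round(unequal, 2))
```

With equal weights the result is 5.2; with the last value weighted six times as heavily, the mean is pulled up to 7.6.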
b) Discuss the merits and demerits of mean and mode.
Merits of Mean:
1- The arithmetic mean is rigidly defined by an algebraic formula.
2- It is easy to calculate and simple to understand.
3- It is based on all observations and can be regarded as
representative of the given data.
4- It is capable of being treated mathematically and hence it is
widely used in statistical analysis.
5- The arithmetic mean can be computed even if the detailed distribution
is not known, provided the sum of the observations and the number of
observations are known.
6- It is least affected by fluctuations of sampling.
Demerits of Arithmetic Mean:
1- It can be determined neither by inspection nor by graphical
location.
2- The arithmetic mean cannot be computed for qualitative data, such as data
on intelligence, honesty, and smoking habit.
3- It is strongly affected by extreme observations and hence does
not adequately represent data containing some extreme values.
4- The arithmetic mean cannot be computed when class intervals have open
ends.
Merits of Mode:
(1) Simple and popular: Mode is a very simple measure of central
tendency. Sometimes, just a look at the series is enough to locate the modal
value. Because of its simplicity, it is a very popular measure of central
tendency.
(2) Less effect of marginal values: Compared to the mean, mode is
less affected by marginal values in the series. Mode is determined only by the
value with the highest frequency.
(3) Graphic presentation: Mode can be located graphically, with the help of a
histogram.
(4) Best representative: Mode is that value which occurs most
frequently in the series. Accordingly, mode is the most typical value of
the series.
(5) No need of knowing all the items or frequencies: The
calculation of mode does not require knowledge of all the items and frequencies
of a distribution. In a simple series, it is enough if one knows the items with
the highest frequencies in the distribution.
Demerits of Mode:
Following are the various demerits of mode:
(1) Uncertain and vague: Mode is an uncertain and vague measure
of central tendency.
(2) Not capable of algebraic treatment: Unlike mean, mode is not
capable of further algebraic treatment.
(3) Difficult: When the frequencies of all items are identical, it is
difficult to identify the modal value.
(4) Complex procedure of grouping: Calculation of mode involves a
cumbersome procedure of grouping the data. If the extent of grouping changes,
there will be a change in the modal value.
(5) Ignores extreme marginal frequencies: It ignores extreme
marginal frequencies. To that extent the modal value is not a representative
value of all the items in a series.
c) Explain the factors which we consider in selection of suitable measure of central tendency. (7+8+5)
Mean is generally considered the best measure of central tendency
and the most frequently used one. However, there are some situations where the
other measures of central tendency are preferred.
Median is preferred to mean when
There are few extreme scores in the distribution.
Some scores have undetermined values.
There is an open ended distribution.
Data are measured in an ordinal scale.
Mode is the preferred measure when data are measured in a nominal
scale. Geometric mean is the preferred measure of central tendency when data
are measured in a logarithmic scale.
Q. 5 a) Compute the mean, median and mode for the following data:
Classes | 86-90 | 91-95 | 96-100 | 101-105 | 106-110 | 111-115
f | 6 | 4 | 10 | 6 | 3 | 1
Let N denote the total number of observations. The working below
illustrates each measure on a set of 11 ungrouped observations rather than
on the grouped table above.
Here, N = 11.
The given observations are: 89, 71, 64, 90, 94, 88, 77, 90, 65, 83, 89
Calculating Mean:
Mean = (Sum of all observations)/N
Mean = (89+71+64+90+94+88+77+90+65+83+89)/11
Mean = 900/11 = 81.81818
Calculating Median:
Arrange the data in ascending order:
64 65 71 77 83 88 89 89 90 90 94
Median = {(N+1)/2}th observation (from the observations arranged in ascending
order)
Here, Median = {(11+1)/2}th observation
Median = 6th observation
Median = 88
Calculating Mode:
Mode = the most repeated observation in the data set,
or the observation having the highest frequency
observation | frequency
89 | 2
71 | 1
64 | 1
90 | 2
94 | 1
88 | 1
77 | 1
65 | 1
83 | 1
From the above frequency table we see that the observations 89 and 90 have
frequency 2 each, which is the highest frequency.
Hence, the data set is bimodal (i.e. has two modes), with modes 89 and 90.
Mode = 89 and 90
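The table given in the question is a grouped frequency distribution, so its mean, median and mode are estimated with the usual grouped-data formulas. The sketch below is one way to carry out that calculation in Python. It assumes class boundaries lie 0.5 below and above the stated class limits (85.5-90.5, 90.5-95.5, and so on), which gives a class width of 5; the function names are made up for the example.

```python
lower_limits = [86, 91, 96, 101, 106, 111]
upper_limits = [90, 95, 100, 105, 110, 115]
frequencies = [6, 4, 10, 6, 3, 1]

n = sum(frequencies)                               # total observations
width = lower_limits[1] - lower_limits[0]          # class width
lower_boundaries = [lo - 0.5 for lo in lower_limits]

def grouped_mean():
    # Mean = sum(f * x) / N, with x the class midpoint.
    midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]
    return sum(f * x for f, x in zip(frequencies, midpoints)) / n

def grouped_median():
    # Median = l + (h / f) * (N/2 - C) for the class containing N/2.
    cumulative = 0
    for boundary, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= n / 2:
            return boundary + (n / 2 - cumulative) / f * width
        cumulative += f

def grouped_mode():
    # Mode = l + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * h for the modal class.
    i = frequencies.index(max(frequencies))
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    return lower_boundaries[i] + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * width

print("N =", n)
print("mean =", round(grouped_mean(), 2))
print("median =", round(grouped_median(), 2))
print("mode =", round(grouped_mode(), 2))
```

Under these assumptions, N = 30, the sum of f times x is 2935, and the results are a mean of about 97.83, a median of 98.0, and a mode of 98.5, all falling in the class 96-100.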
b) Define Geometric mean and its properties. (15+5)
In mathematics and statistics, a whole data set can be summarised
with the help of measures of central tendency. The most important measures of
central tendency are the mean, median, and mode. Among these, the mean gives an
overall idea of the data by defining the average of the numbers. The
different types of mean are Arithmetic Mean (AM), Geometric Mean (GM) and
Harmonic Mean (HM).
In Mathematics, the Geometric Mean (GM) is the average value or
mean which signifies the central tendency of the set of numbers by finding the
product of their values. Basically, we multiply the numbers altogether and take
the nth root of the multiplied numbers, where n is the total number of data
values. For example: for a given set of two numbers such as 3 and 1, the
geometric mean is equal to √(3×1) = √3 ≈ 1.732.
In other words, the geometric mean is defined as the nth root of
the product of n numbers. The geometric mean is different from
the arithmetic mean: in the arithmetic mean, we add the data values and
then divide the sum by the total number of values, whereas in the geometric
mean, we multiply the given data values and then take the root whose index is
the total number of data values. For example, if we have two data values, we
take the square root; if we have three, the cube root; if we
have four, the 4th root; and so on.
Some of the important properties of the G.M are:
For positive values, the G.M of a data set is always less than or equal to
the arithmetic mean of the data set; the two are equal only when all the values
are identical.
If each object in the data set is substituted by the G.M, then the
product of the objects remains unchanged.
The G.M of the ratios of the corresponding observations in two
series is equal to the ratio of their geometric means.
The G.M of the products of the corresponding items in two series
is equal to the product of their geometric means.
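These properties can be checked numerically. The sketch below defines the geometric mean as the nth root of the product and verifies each property on two small series; the function name `geometric_mean` and the series `a` and `b` are hypothetical, and all values are assumed to be positive.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive numbers."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs positive values")
    return math.prod(values) ** (1 / len(values))

# Hypothetical series used to check the properties.
a = [2, 8]
b = [1, 4]
gm_a = geometric_mean(a)   # square root of 16
gm_b = geometric_mean(b)   # square root of 4

# G.M is less than or equal to the arithmetic mean.
print(gm_a <= sum(a) / len(a))

# Replacing every item by the G.M leaves the product unchanged.
print(math.isclose(gm_a ** len(a), math.prod(a)))

# G.M of the ratios equals the ratio of the G.M.s.
ratios = [x / y for x, y in zip(a, b)]
print(math.isclose(geometric_mean(ratios), gm_a / gm_b))

# G.M of the products equals the product of the G.M.s.
products = [x * y for x, y in zip(a, b)]
print(math.isclose(geometric_mean(products), gm_a * gm_b))
```

Each check prints True for these series. Python 3.8 and later also provide `statistics.geometric_mean`, which could replace the hand-written function.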