Friday, October 14

Introduction to Business Statistics (1350) - Autumn 2022 - Assignment 1

Q. 1     a)         What do you understand by the term of Statistics? Give its chief characteristics.

Statistics is the science of collecting, analyzing, presenting, and interpreting data. Governmental needs for census data as well as information about a variety of economic activities provided much of the early impetus for the field of statistics. Currently the need to turn the large amounts of data available in many applied fields into useful information has stimulated both theoretical and practical developments in statistics.


Data are the facts and figures that are collected, analyzed, and summarized for presentation and interpretation. Data may be classified as either quantitative or qualitative. Quantitative data measure either how much or how many of something, and qualitative data provide labels, or names, for categories of like items. For example, suppose that a particular study is interested in characteristics such as age, gender, marital status, and annual income for a sample of 100 individuals. These characteristics would be called the variables of the study, and data values for each of the variables would be associated with each individual. Thus, the data values of 28, male, single, and $30,000 would be recorded for a 28-year-old single male with an annual income of $30,000. With 100 individuals and 4 variables, the data set would have 100 × 4 = 400 items. In this example, age and annual income are quantitative variables; the corresponding data values indicate how many years and how much money for each individual. Gender and marital status are qualitative variables. The labels male and female provide the qualitative data for gender, and the labels single, married, divorced, and widowed indicate marital status.

b)             Differentiate between descriptive and inferential statistics.

Definition of Descriptive Statistics

Descriptive Statistics refers to a discipline that quantitatively describes the important characteristics of the dataset. For the purpose of describing properties, it uses measures of central tendency, i.e. mean, median, mode and the measures of dispersion i.e. range, standard deviation, quartile deviation and variance, etc.

The data is summarised by the researcher, in a useful way, with the help of numerical and graphical tools such as charts, tables, and graphs, to represent data in an accurate way. Moreover, the text is presented in support of the diagrams, to explain what they represent.

Definition of Inferential Statistics

Inferential Statistics is all about generalising from the sample to the population, i.e. the results of the analysis of the sample can be generalised to the larger population from which the sample is taken. It is a convenient way to draw conclusions about the population when it is not possible to query each and every member of the universe. The sample chosen should be representative of the entire population; that is, it should reflect the important features of the population.

c)         Define a variable. Differentiate between a discrete and a continuous variable.

A variable is a measurable quantity whose value changes from one observation to another. It is of two types, discrete or continuous. A discrete variable takes only a countable set of distinct values, while a continuous variable can take any value within a given range.

Definition of Discrete Variable

A discrete variable is a type of statistical variable that can assume only a fixed number of distinct, separate values.

Categorical variables are a common example, because they have separate, indivisible categories. No values can exist in between two adjacent permitted values, i.e. the variable does not attain all the values within its limits. So, the number of permitted values that it can assume is either finite or countably infinite. Hence if you are able to count the set of values, for example the number of children in a family, then the variable is said to be discrete.

Definition of Continuous Variable

A continuous variable, as the name suggests, is a variable that can assume all the possible values in a continuum. Simply put, it can take any value within the given range. So, if a variable can take an infinite and uncountable set of values, then the variable is referred to as a continuous variable.

A continuous variable is one that is defined over an interval of values, meaning that it can assume any value between the minimum and maximum value, for example height or weight. The interval over which it is defined may vary from one variable to another.

Q. 2     a)         What do you understand by classification and tabulation? Discuss their importance in a statistical analysis.

Definition of Classification

Classification refers to a process wherein data is arranged into classes or groups, based on the characteristic under consideration and according to the resemblance of observations. Classification puts the data in a condensed form: it removes unnecessary details, which makes the data easier to comprehend.

Data collected for the first time is raw data, so it is arranged in a haphazard manner that does not provide a clear picture. The classification of data reduces the large volume of raw data into homogeneous groups, i.e. data having common characteristics or nature are placed in one group, and thus the whole data set is divided into a number of groups. There are four types of classification:

Qualitative Classification or Ordinal Classification

Quantitative Classification

Chronological or Temporal Classification

Geographical or Spatial Classification

Definition of Tabulation

Tabulation refers to a logical data presentation, wherein raw data is summarized and displayed in a compact form, i.e. in statistical tables. In other words, it is a systematic arrangement of data in columns and rows that represents data in a concise and attractive way. The following guidelines should be followed for tabulation.

A serial number should be allotted to the table, in addition to the self explanatory title.

The statistical table is required to be divided into four parts, i.e. Box Head, Stub, Caption and Body. The complete upper part of the table that contains columns and sub-columns, along with the caption (the column headings), is the Box Head. The left part of the table, giving the description of rows, is called the Stub. The part of the table that contains numerical figures and other content is its Body.

Length and Width of the table should be perfectly balanced.

Presentation of data should be such that it takes less time and labor to make comparison between various figures.

Footnotes, explaining the source of data or any other thing, are to be presented at the bottom of the table.

b)        Write down the important points for drawing graphs.   

Essential Elements of Good Graphs:

1.         A title which describes the experiment. Graphs are conventionally titled Y-AXIS VARIABLE VS. X-AXIS VARIABLE.  Note:  This almost always means that the title will be DEPENDENT VARIABLE VS. INDEPENDENT VARIABLE.

2.         The graph should fill the space allotted for the graph.  The graph should be as large as the paper will allow.  In order to do this, the graph must be properly scaled.  The scale for each axis of the graph should always begin at zero.  Each square on a given axis must represent the same amount.  Scale each axis so as to take up a maximum amount of the space available while still maintaining divisions which will make plotting the graph as easy as possible.  Increments on an axis of 1, 2, or 5 are easy to use when plotting points.  For larger numbers 10, 20 or 50 or possibly 100, 200, or 500 might work and so on.  For smaller numbers 0.1, 0.2, or 0.5 might work or maybe 0.01, 0.02, or 0.05.  A good way to choose the scale for an axis is to identify the largest data point that will be plotted on that axis.  Then count the number of squares on your graph paper that are available for plotting the variable on that axis.  Divide the maximum data value by the number of squares.  This will give you the smallest value that each square could possibly have as its increment. Since the result of this division will most likely not be a convenient number, you should then round up to the nearest convenient value.  Once you have chosen a scale, you do not have to label each square with its value.  Label enough values on the axis to make it clear what scale you are using on the axis.

Scaling Example:  In a given experiment in which the Current is being measured, the largest current that was measured was 0.42 A.  The graph paper being used has 25 squares that could be used for the Current axis.  The minimum scaling increment for this graph paper would then be 0.42 A divided by 25 squares or 0.0168 A/Square. Since 0.0168 is not a convenient number to use for the increment (it would be very difficult to plot such a graph), you should round the value up to the next higher convenient number.  This would be 0.02 in this case. You do not need to label each and every square.  You could, in this example, label every fifth square as 0.10, 0.20, and so on.
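The rounding step in the scaling example can be sketched in Python. This is only an illustration: the function name nice_increment and the choice of 1, 2 and 5 as the only "convenient" steps are assumptions made here, following the increments suggested in point 2.

```python
import math

def nice_increment(max_value, squares):
    """Smallest 1-2-5 style increment that fits max_value into `squares` squares.

    Hypothetical helper: assumes max_value > 0 and squares > 0.
    """
    raw = max_value / squares                 # minimum increment per square
    base = 10 ** math.floor(math.log10(raw))  # power of ten at or below raw
    for step in (1, 2, 5, 10):
        candidate = step * base
        # Small tolerance so a raw value that is already convenient is kept.
        if candidate >= raw * (1 - 1e-12):
            return candidate
    return 10 * base                          # not reached; kept as a safe fallback

# Current axis from the example: largest reading 0.42 A, 25 squares available.
increment = nice_increment(0.42, 25)
print(f"raw increment: {0.42 / 25:.4f} A/square, chosen increment: {increment:.2f} A/square")
```

With 0.02 A per square, every fifth square falls on a multiple of 0.10 A, which is why labelling every fifth square works well in the example.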

3.         Each axis should be labeled with the quantity being measured and the units of measurement.  The independent variable is usually plotted on the x-axis and the dependent variable is usually plotted on the y-axis.

4.         Each data point should be plotted in the proper position.  Use a tiny dot to mark your point, and circle the dot with a point protector so that it doesn’t get covered by your line of best fit.

5.         A line of best fit.  This line should show the overall tendency (or trend) of your data.  If the trend is linear, you should draw a straight line which shows that trend using a straight edge.  If the trend is a curve, you should sketch a curve which is your best guess as to the tendency of the data.  This line (whether straight or curved) does not have to go through all of the data points and it may, in some cases, not go through any of them.  NEVER connect your data points dot to dot. 

6.         If you are plotting the graph by hand, you will choose two points for all linear graphs from which to calculate the slope of the line of best fit.  These points should not be data points unless a data point happens to fall perfectly on the line of best fit.  Pick two points which are directly on your line of best fit and which are easy to read from the graph. Mark the points you have chosen with a +.

7.         Do not do other work in the space of your graph such as the slope calculation or other parts of the mathematical analysis.

8.         If your graph does not yield a straight line, you will be expected to manipulate one (or more) of the axes of your graph, replot the manipulated data, and continue doing this until a straight line results. In general it will probably not take more than three graphs to yield a straight line.

Summary--Characteristics of Good Graphs

1.         It is plotted on a grid (graph paper)

2.         The axes are highlighted (darker than the rest of the grid lines) and are drawn with a straight edge.

3.         Both axes are labeled with the variable name and its units.  Note that we do not label them x or y!

4.         The independent variable is plotted on the horizontal (x) axis.

5.         The dependent variable is plotted on the vertical (y) axis.

6.         The data points have point protectors

7.         A line of best fit is drawn which shows the trend of the data. The line of best fit may have some points above it, some below it, and some on it.  If the trend of the data is linear, the line of best fit is drawn with a straight edge. If the trend of the data is curved, a smooth curve should be drawn.

8.         The graph is clearly titled using the convention dependent variable vs. independent variable.

9.         The axes are properly scaled so that the graph fits the space, the grids are consistently scaled, and all of the data fits on the graph.  The graph should be as large as possible on the paper.

10.       The slope calculation points are clearly marked with a (+) on the line of best fit.

c)         What is a frequency distribution? How is it constructed?           (8+6+6)

The frequency of a value is the number of times it occurs in a dataset. A frequency distribution is the pattern of frequencies of a variable. It’s the number of times each possible value of a variable occurs in a dataset.

Types of frequency distributions

There are four types of frequency distributions:

Ungrouped frequency distributions: The number of observations of each value of a variable.

You can use this type of frequency distribution for categorical variables.

Grouped frequency distributions: The number of observations of each class interval of a variable. Class intervals are ordered groupings of a variable’s values.

You can use this type of frequency distribution for quantitative variables.

Relative frequency distributions: The proportion of observations of each value or class interval of a variable.

You can use this type of frequency distribution for any type of variable when you’re more interested in comparing frequencies than the actual number of observations.

Cumulative frequency distributions: The sum of the frequencies less than or equal to each value or class interval of a variable.

You can use this type of frequency distribution for ordinal or quantitative variables when you want to understand how often observations fall below certain values.

How to make a frequency table

Frequency distributions are often displayed using frequency tables. A frequency table is an effective way to summarize or organize a dataset. It’s usually composed of two columns:

The values or class intervals

Their frequencies

The method for making a frequency table differs between the four types of frequency distributions. Software such as Excel, SPSS, or R can also be used to make a frequency table.
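As a rough illustration of how such a table is constructed, the sketch below builds the ungrouped, relative and cumulative frequencies for one variable in Python. The data values and the function name frequency_table are made up for this example and do not come from the assignment.

```python
from collections import Counter
from itertools import accumulate

def frequency_table(values):
    """Return rows of (value, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)                 # ungrouped frequencies
    ordered = sorted(counts)                 # values in ascending order
    freqs = [counts[v] for v in ordered]
    total = sum(freqs)
    relative = [f / total for f in freqs]    # proportion of observations
    cumulative = list(accumulate(freqs))     # running total of frequencies
    return list(zip(ordered, freqs, relative, cumulative))

# Hypothetical data: number of items bought by 10 customers.
items_bought = [2, 3, 3, 1, 2, 3, 4, 2, 3, 1]

print(f"{'value':>5} {'f':>3} {'rel. f':>7} {'cum. f':>7}")
for value, f, rel, cum in frequency_table(items_bought):
    print(f"{value:>5} {f:>3} {rel:>7.2f} {cum:>7}")
```

For a grouped frequency distribution the same idea applies once each observation has been assigned to its class interval.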

           

Q. 3     a)         Represent the following data by a bar diagram.   (8+12)

       

Classes   86-90   91-95   96-100   101-105   106-110   111-115
f         6       4       10       6         3         1
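A bar diagram for the data above draws one bar per class, with the height (or length) of each bar proportional to its frequency. As a minimal text-only sketch, assuming horizontal bars and a hypothetical helper named bar_rows, the diagram could be produced like this:

```python
classes = ["86-90", "91-95", "96-100", "101-105", "106-110", "111-115"]
freqs = [6, 4, 10, 6, 3, 1]

def bar_rows(labels, counts, symbol="#"):
    """Build one text row per class; the bar length equals the frequency."""
    width = max(len(label) for label in labels)   # align the class labels
    return [
        f"{label:<{width}} | {symbol * count} {count}"
        for label, count in zip(labels, counts)
    ]

for row in bar_rows(classes, freqs):
    print(row)
```

On graph paper the same bars would be drawn vertically, with equal widths and equal gaps between them.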

 

b)        Define Histogram. Draw a histogram for the following frequency distribution:

X   32   37   42   47   52   57   62   67
f   3    17   28   47   54   31   14   4

 

A histogram is a bar graph-like representation of data that buckets a range of classes into columns along the horizontal x-axis. The vertical y-axis represents the number count or percentage of occurrences in the data for each column. Columns can be used to visualize patterns of data distributions.
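To draw the histogram, each X value needs a class interval. The sketch below assumes that the X values are class midpoints of equally wide classes (they are 5 apart), so each bar runs from midpoint - 2.5 to midpoint + 2.5. That reading of the table, and the helper name class_boundaries, are assumptions rather than something stated in the question.

```python
midpoints = [32, 37, 42, 47, 52, 57, 62, 67]
freqs = [3, 17, 28, 47, 54, 31, 14, 4]

def class_boundaries(mids):
    """Return (lower, upper) class boundaries for equally spaced midpoints."""
    width = mids[1] - mids[0]
    if any(b - a != width for a, b in zip(mids, mids[1:])):
        raise ValueError("midpoints are not equally spaced")
    half = width / 2
    return [(m - half, m + half) for m in mids]

# Each row gives the base of one histogram bar and its height.
for (low, high), f in zip(class_boundaries(midpoints), freqs):
    print(f"{low:.1f} to {high:.1f}: height {f}")
```

Because the boundaries of neighbouring classes coincide, the bars of the histogram touch each other, unlike the bars of a bar diagram.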

Q. 4     a)         What do you understand by weighted mean? In what circumstances is it preferred to ordinary mean and why?           

The weighted mean is a type of mean that is calculated by multiplying the weight (or probability) associated with a particular event or outcome with its associated quantitative outcome and then summing all the products together.

In some cases, you might want a number to have more weight. In that case, you’ll want to find the weighted mean. To find the weighted mean:

Multiply the numbers in your data set by the weights.

Add the results up.

For example, for the numbers 1, 3, 5, 7 and 10 with equal weights (1/5 for each number), the math to find the weighted mean would be:

1(1/5) + 3(1/5) + 5(1/5) + 7(1/5) + 10(1/5) = 5.2.

With equal weights the weighted mean coincides with the ordinary mean, 26/5 = 5.2.
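The calculation can be sketched in Python as follows. The function below divides by the sum of the weights, so it also works when the weights do not add up to 1; this generalisation, the function name, and the second set of weights are assumptions added for illustration.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

values = [1, 3, 5, 7, 10]

# Equal weights of 1/5 reproduce the ordinary mean of the five numbers.
equal = weighted_mean(values, [1 / 5] * 5)

# Hypothetical unequal weights: the last value counts four times as much.
unequal = weighted_mean(values, [1, 1, 1, 1, 4])

print(round(equal, 4), round(unequal, 4))
```

The second result is pulled towards 10 because that value carries the largest weight, which is the situation in which the weighted mean is preferred to the ordinary mean.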

b)        Discuss the merits and demerits of mean and mode.

Merits of Mean:

1- The arithmetic mean is rigidly defined by an algebraic formula.

2- It is easy to calculate and simple to understand.

3- It is based on all observations and can be regarded as representative of the given data.

4- It is capable of being treated mathematically and hence it is widely used in statistical analysis.

5- The arithmetic mean can be computed even if the detailed distribution is not known, provided the sum of the observations and the number of observations are known.

6- It is least affected by fluctuations of sampling.

 

Demerits of Arithmetic Mean:

1- It can be determined neither by inspection nor by graphical location.

2- The arithmetic mean cannot be computed for qualitative data such as data on intelligence, honesty and smoking habit.

3- It is affected too much by extreme observations and hence does not adequately represent data containing some extreme values.

4- The arithmetic mean cannot be computed when class intervals have open ends.

 

Merits of Mode:

(1) Simple and popular: - Mode is a very simple measure of central tendency. Sometimes a glance at the series is enough to locate the modal value. Because of its simplicity, it is a very popular measure of central tendency.

(2) Less effect of marginal values: - Compared to the mean, the mode is less affected by marginal values in the series. Mode is determined only by the value with the highest frequency.

 (3) Graphic presentation:- Mode can be located graphically, with the help of histogram.

(4) Best representative: - Mode is that value which occurs most frequently in the series. Accordingly, mode is the best representative value of the series.

(5) No need of knowing all the items or frequencies: - The calculation of mode does not require knowledge of all the items and frequencies of a distribution. In simple series, it is enough if one knows the items with highest frequencies in the distribution.

 

Demerits of mode:

Following are the various demerits of mode:

(1) Uncertain and vague: - Mode is an uncertain and vague measure of the central tendency.

(2) Not capable of algebraic treatment: - Unlike mean, mode is not capable of further algebraic treatment.

(3) Difficult: - When the frequencies of all items are identical, it is difficult to identify the modal value.

(4) Complex procedure of grouping:- Calculation of mode involves a cumbersome procedure of grouping the data. If the extent of grouping changes, the modal value changes too.

(5) Ignores extreme marginal frequencies:- It ignores extreme marginal frequencies. To that extent the modal value is not representative of all the items in a series.

c)         Explain the factors which we consider in selection of suitable measure of central tendency.       (7+8+5)

Mean is generally considered the best measure of central tendency and the most frequently used one. However, there are some situations where the other measures of central tendency are preferred.

Median is preferred to mean when

There are few extreme scores in the distribution.

Some scores have undetermined values.

There is an open ended distribution.

Data are measured in an ordinal scale.

Mode is the preferred measure when data are measured in a nominal scale. Geometric mean is the preferred measure of central tendency when data are measured in a logarithmic scale.      

Q. 5     a)         Compute the Mean, median and mode for the following data;

Classes   86-90   91-95   96-100   101-105   106-110   111-115
f         6       4       10       6         3         1
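For the grouped data above, the usual grouped-data formulas can be sketched in Python: mean = Σfx / Σf using class midpoints x, median = l + (N/2 - c)/f × h, and mode = l + (f1 - f0)/((f1 - f0) + (f1 - f2)) × h. The sketch assumes integer class limits with a gap of 1 between classes, so that the class boundaries lie 0.5 below and above the limits; the function names are illustrative only.

```python
classes = [(86, 90), (91, 95), (96, 100), (101, 105), (106, 110), (111, 115)]
freqs = [6, 4, 10, 6, 3, 1]

def grouped_mean(classes, freqs):
    """Mean from class midpoints: sum(f * x) / sum(f)."""
    n = sum(freqs)
    return sum(f * (lo + hi) / 2 for (lo, hi), f in zip(classes, freqs)) / n

def grouped_median(classes, freqs, gap=1):
    """Median = l + (N/2 - c) / f * h, using lower class boundary l = lo - gap/2."""
    n = sum(freqs)
    width = classes[0][1] - classes[0][0] + gap   # class width h (assumed equal)
    cumulative = 0                                # c: cumulative frequency before the class
    for (lo, hi), f in zip(classes, freqs):
        if cumulative + f >= n / 2:               # first class reaching N/2
            return (lo - gap / 2) + (n / 2 - cumulative) / f * width
        cumulative += f

def grouped_mode(classes, freqs, gap=1):
    """Mode = l + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * h for the modal class."""
    width = classes[0][1] - classes[0][0] + gap
    i = freqs.index(max(freqs))                   # modal class: highest frequency
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
    denominator = (f1 - f0) + (f1 - f2)
    if denominator == 0:
        raise ValueError("mode is not defined when neighbouring frequencies are equal")
    return (classes[i][0] - gap / 2) + (f1 - f0) / denominator * width

print(round(grouped_mean(classes, freqs), 2))   # 2935 / 30
print(grouped_median(classes, freqs))           # 95.5 + (15 - 10) / 10 * 5
print(grouped_mode(classes, freqs))             # 95.5 + 6 / (6 + 4) * 5
```

Under these assumptions N = 30, the median class and the modal class are both 96-100, and the results are a mean of about 97.83, a median of 98.0 and a mode of 98.5.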

 

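Let N denote the total number of observations.

The working below uses an ungrouped data set of 11 observations rather than the frequency table above, so here N = 11.

The given observations are: 89, 71, 64, 90, 94, 88, 77, 90, 65, 83, 89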

Calculating Mean:

Mean=(Sum of all observations)/N

Mean=(89+71+64+90+94+88+77+90+65+83+89)/11

Mean=81.81818

Calculating Median

Arrange the data in ascending order:
64 65 71 77 83 88 89 89 90 90 94
Median={(N+1)/2}th observation (from the observations arranged in ascending order)

Here, Median={(11+1)/2}th observation

Median=6th observation

Median= 88

Calculating Mode

Mode= most repeated observation in the data set or the observation having highest frequency

observation   frequency
89            2
71            1
64            1
90            2
94            1
88            1
77            1
65            1
83            1

From the above frequency table we see that the observations 89 and 90 each have frequency 2, which is the highest frequency.

Hence, the data set is bimodal (i.e. has two modes) with modes 89 and 90.

Mode= 89 and 90
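The three results for the 11 observations can be checked with Python's standard statistics module. This is only a verification sketch; the variable names are chosen here for illustration.

```python
import statistics

observations = [89, 71, 64, 90, 94, 88, 77, 90, 65, 83, 89]

mean = statistics.mean(observations)        # sum of observations divided by N
median = statistics.median(observations)    # middle value of the sorted data
modes = statistics.multimode(observations)  # every value with the highest frequency

print(round(mean, 5), median, sorted(modes))
```

statistics.multimode is used instead of statistics.mode because it returns all modes, which matters for a bimodal data set like this one.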

 

b)        Define Geometric mean and its properties. (15+5)

In mathematics and statistics, a data set can be summarised with the help of measures of central tendency. The most important measures of central tendency are the mean, median and mode. Among these, the mean gives an overall idea of the data, since it defines the average of the numbers. The different types of mean are Arithmetic Mean (AM), Geometric Mean (GM) and Harmonic Mean (HM).

In Mathematics, the Geometric Mean (GM) is the average value or mean which signifies the central tendency of the set of numbers by finding the product of their values. Basically, we multiply the numbers altogether and take the nth root of the multiplied numbers, where n is the total number of data values. For example: for a given set of two numbers such as 3 and 1, the geometric mean is equal to √(3×1) = √3 = 1.732.

 

In other words, the geometric mean is defined as the nth root of the product of n numbers. The geometric mean is different from the arithmetic mean: in the arithmetic mean, we add the data values and then divide the sum by the total number of values, whereas in the geometric mean, we multiply the given data values and then take the root whose index is the total number of data values. For example, if we have two data values, we take the square root; if we have three, the cube root; if we have four, the 4th root, and so on.

Some of the important properties of the G.M are:

 

The G.M for a given data set of positive values is always less than or equal to the arithmetic mean for the data set; the two are equal only when all the values are equal.

If each object in the data set is substituted by the G.M, then the product of the objects remains unchanged.

The G.M of the ratios of the corresponding observations in two series is equal to the ratio of their geometric means.

The G.M of the products of the corresponding items in two series is equal to the product of their geometric means.
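The definition and the properties above can be checked numerically with a short Python sketch. The helper name geometric_mean and the two series x and y are hypothetical; the values 3 and 1 are the ones used in the example earlier in this answer.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive numbers, computed via logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Example from the text: the G.M of 3 and 1 is the square root of 3.
gm = geometric_mean([3, 1])
am = (3 + 1) / 2
print(round(gm, 3), am)                 # the G.M is below the A.M here

# Replacing every value by the G.M leaves the product unchanged (3 * 1 = 3).
print(round(gm * gm, 6))

# Hypothetical series to illustrate the ratio and product properties.
x = [2, 8, 4]
y = [1, 2, 4]
ratio_gm = geometric_mean([a / b for a, b in zip(x, y)])
product_gm = geometric_mean([a * b for a, b in zip(x, y)])
print(math.isclose(ratio_gm, geometric_mean(x) / geometric_mean(y)))
print(math.isclose(product_gm, geometric_mean(x) * geometric_mean(y)))
```

Averaging the logarithms is mathematically the same as taking the nth root of the product, and it avoids very large intermediate products for long series.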
