Salary Analysis

Our goal is to analyze salaries in a company based on a survey conducted among its employees. 378 employees took part in the survey, which asked them about the following:

· Age (in years)
· Gender
· Job seniority (in years)
· Department (office, production, sales)
· Salary

We want to describe the distribution of these variables and check which of them have the greatest impact on salaries.

Gender

Let’s see what the gender frequency distribution looks like in this company:

Gender   Count
Man      240
Woman    138

More than half of the company’s employees (about 63%) are men. The company manufactures a product and employs many people who do physical work, hence the large number of men.

Department

As mentioned at the beginning, our company has three departments: office, production and sales. Let’s see how many people work in each of them.

Department   Count
Office       24
Production   274
Sales        80

As expected, the vast majority of employees work in production: about 72 percent of all employees.
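
The shares quoted for the gender and department counts can be recomputed directly from the frequency tables. The snippet below is a minimal sketch; the helper name `shares` and the variable names are illustrative and not part of the original analysis.

```python
# Counts as reported in the frequency tables of this analysis.
gender_counts = {"Man": 240, "Woman": 138}
department_counts = {"Office": 24, "Production": 274, "Sales": 80}


def shares(counts):
    """Return each category's share of the total, in percent (1 decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


gender_shares = shares(gender_counts)
department_shares = shares(department_counts)

# Print one line per category, e.g. "Production: 72.5%".
for name, pct in {**gender_shares, **department_shares}.items():
    print(f"{name}: {pct}%")
```

Both tables add up to the 378 respondents, which is a quick consistency check on the counts.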

Age

The next variable covered by the survey was the employee’s age. Let’s compute the basic statistics for this variable:

Statistic                               Value

Number                                  378

Central tendency
  Mean                                  40.1
  Median                                40
  Mode                                  31

Measures of position
  Minimum                               23
  Maximum                               62
  First quartile                        33
  Third quartile                        47

Measures of variability
  Range                                 39
  Quartile deviation, (Q3 - Q1) / 2     7
  Variance                              77.97
  Standard deviation                    8.83
  Coefficient of variation              0.22
  Quartile coefficient of dispersion    0.18
  Skewness                              0.27
  Kurtosis                              -0.65

The average employee age was 40.1 years. Half of the employees were 40 years old or younger, and the most common age among the participants was 31. The youngest employee was 23 years old and the oldest was 62. The youngest quarter of employees were aged 33 or under, and the oldest quarter were aged 47 or over.

Observed ages deviate from the mean by 8.83 years on average, which is about 22% of the mean.
Skewness is close to zero, so the age distribution can be considered approximately symmetrical.
Negative kurtosis indicates a flatter (platykurtic) distribution: values are less concentrated around the mean than in a normal distribution.
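
Several of the dispersion measures for age can be recomputed from the other values in the table. The sketch below assumes the usual textbook definitions, which is an assumption about how the table was produced; the variable names are illustrative. Note that the value 7 in the table equals half of Q3 - Q1 (the quartile deviation), while the full interquartile range is 14.

```python
# Summary values for age as reported in the table.
minimum, maximum = 23, 62
q1, q3 = 33, 47
mean, std_dev = 40.1, 8.83

value_range = maximum - minimum                # maximum minus minimum
iqr = q3 - q1                                  # full interquartile range
quartile_deviation = iqr / 2                   # half of the interquartile range
coeff_variation = std_dev / mean               # standard deviation relative to the mean
quartile_coeff_dispersion = iqr / (q3 + q1)    # (Q3 - Q1) / (Q3 + Q1)
variance = std_dev ** 2                        # square of the standard deviation

print(value_range, iqr, quartile_deviation)
print(round(coeff_variation, 2), round(quartile_coeff_dispersion, 3), round(variance, 2))
```

The results agree with the table after rounding: a range of 39, a coefficient of variation of about 0.22 and a quartile coefficient of dispersion of 0.175, reported as 0.18.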

Let’s look at the box-plot of age:

As mentioned earlier, the distribution of age can be considered approximately symmetrical about the mean.

Let’s look at the histogram of age:

The histogram shows a slight right-sided asymmetry in the age distribution.

Job seniority

Let us now analyze the employees’ seniority:

Statistic                               Value

Number                                  378

Central tendency
  Mean                                  11.2
  Median                                11
  Mode                                  9

Measures of position
  Minimum                               0
  Maximum                               36
  First quartile                        6
  Third quartile                        15

Measures of variability
  Range                                 36
  Quartile deviation, (Q3 - Q1) / 2     5
  Variance                              50.52
  Standard deviation                    7.11
  Coefficient of variation              0.63
  Quartile coefficient of dispersion    0.43
  Skewness                              0.73
  Kurtosis                              0.48

The average job seniority was 11.2 years. Half of the employees have worked in the company for 11 years or more, and the most common seniority is 9 years. The minimum seniority is 0 years, while the longest-serving employee has worked in the company for 36 years. A quarter of employees have worked in the company for 6 years or less, and a quarter for 15 years or more.

Observed values deviate from the mean by 7.11 years on average, which is about 63% of the mean, so the variability of seniority is high.
The skewness is positive, so the distribution of seniority is positively skewed: the mass of the distribution is concentrated on the left of the figure.
Positive kurtosis indicates that the distribution is leptokurtic: it is more concentrated around the mean than a normal distribution.

Box-plot:

Histogram:

As we can see, the seniority distribution at the company is right-skewed, so most employees have relatively short seniority.
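
To make the reading of skewness and kurtosis more concrete, here is a minimal sketch of moment-based skewness and excess kurtosis. The function name and the toy values are hypothetical and are not the survey data. The analysis does not say which estimator was used (spreadsheet tools often apply a small-sample correction), so this is one common definition rather than necessarily the one behind the table.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based (population) skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution
    return skewness, excess_kurtosis


# Hypothetical seniority values in years (NOT the survey data):
# many short tenures and one very long one.
toy_seniority = [0, 1, 2, 2, 3, 3, 4, 5, 8, 20]
toy_skew, toy_kurt = skewness_and_excess_kurtosis(toy_seniority)

# A perfectly symmetric sample, for comparison.
sym_skew, sym_kurt = skewness_and_excess_kurtosis([1, 2, 3, 4, 5])

print(f"toy sample: skewness={toy_skew:.2f}, excess kurtosis={toy_kurt:.2f}")
print(f"symmetric sample: skewness={sym_skew:.2f}, excess kurtosis={sym_kurt:.2f}")
```

The toy sample has a long right tail, so both measures come out positive, while the symmetric sample has zero skewness and negative excess kurtosis.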

Salary

Let us now analyze the employees’ salaries:

Statistic                               Value

Number                                  378

Central tendency
  Mean                                  3634.1
  Median                                3300
  Mode                                  3400

Measures of position
  Minimum                               1700
  Maximum                               19500
  First quartile                        2700
  Third quartile                        3950

Measures of variability
  Range                                 17800
  Quartile deviation, (Q3 - Q1) / 2     625
  Variance                              3146299.10
  Standard deviation                    1773.78
  Coefficient of variation              0.49
  Quartile coefficient of dispersion    0.19
  Skewness                              4.07
  Kurtosis                              25.60

The average salary in this company was 3634.1 PLN. Half of the employees earn 3300 PLN or less, and the most common salary is 3400 PLN. The lowest-paid employee earns 1700 PLN, the highest-paid 19500 PLN. A quarter of employees earn 2700 PLN or less, and the best-paid quarter earn 3950 PLN or more.

Observed salaries deviate from the mean by 1773.78 PLN on average, which is about 49% of the mean, so salaries are moderately varied.

The skewness is strongly positive, so the distribution of salaries is positively skewed: employees with below-average earnings prevail.

Strong positive kurtosis indicates that the distribution is leptokurtic: it is significantly more concentrated around the mean than a normal distribution.

Box-plot:

Histogram:

The charts confirm what we wrote above: a strong right-sided asymmetry of the distribution.

Correlation and regression

The only quantitative variables in the study were age, seniority and salary. We want to check whether we can estimate salaries from age or seniority. For this purpose, let’s compute the Pearson correlation coefficient between salary and each of age and seniority:

Pearson correlation coefficient with earnings

Age      Job seniority
0.405    0.413
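
For reference, the Pearson coefficient can be computed as in the minimal sketch below. The function name and the toy data are hypothetical; the survey's raw data are not reproduced here, so the example only illustrates the formula and does not reproduce the values 0.405 and 0.413.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical toy data (NOT the survey data): seniority in years, salary in PLN.
toy_seniority = [0, 2, 5, 9, 12, 20]
toy_salary = [2500, 2400, 3100, 3300, 3000, 4800]

print(round(pearson_r(toy_seniority, toy_salary), 3))
```

A value near +1 or -1 means a nearly linear relationship, and a value near 0 means no linear relationship.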

We see that age and seniority explain the variability of earnings to a similar extent. So let’s build a linear regression model in which the dependent variable is salary (Y) and the independent variable is seniority (X).

The relationship between salaries and seniority can be described by the formula:

Y = 2473.3 + 103.19 · X

So, if seniority increases by one year, the salary increases by 103.19 PLN on average.
A new employee will earn 2473.3 PLN on average.
For example, a person with 10 years of seniority should earn 3505.2 PLN on average.
Seniority explains about 17% of the variability of salaries, which is a weak dependency.
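
The worked figures above follow directly from the intercept (2473.3 PLN) and slope (103.19 PLN per year) quoted in the text. The sketch below reproduces them; the function and constant names are illustrative, and the 17% is taken to be the square of the seniority correlation (0.413), which holds for a simple linear regression with one predictor.

```python
INTERCEPT = 2473.3   # PLN, expected salary at 0 years of seniority
SLOPE = 103.19       # PLN per additional year of seniority
R_SENIORITY = 0.413  # Pearson correlation between salary and seniority


def predict_salary(seniority_years):
    """Expected salary (PLN) under the fitted line Y = INTERCEPT + SLOPE * X."""
    return INTERCEPT + SLOPE * seniority_years


# In simple linear regression, R^2 equals the squared correlation coefficient.
r_squared = R_SENIORITY ** 2

print(round(predict_salary(0), 1))   # new employee
print(round(predict_salary(10), 1))  # 10 years of seniority
print(round(r_squared, 2))           # share of salary variability explained
```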

Estimation

We will perform point and interval estimation for the salaries of employees.

Average

· Point estimation

x̄ = 3634.1 ± 91.2 PLN (sample mean ± standard error)

· Interval estimation, level of confidence = 0.95

P(3455.3 < µ < 3812.9) = 0.95

Therefore, with 95% confidence, the average salary in this company lies in the range from 3455.3 to 3812.9 PLN.

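
The point estimate's ± 91.2 and the interval P(3455.3 < µ < 3812.9) = 0.95 can be reproduced from the sample size, mean and standard deviation. The sketch below assumes a normal-approximation (z) interval, which is reasonable for n = 378 but is an assumption about the method used; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 378               # number of respondents
sample_mean = 3634.1  # PLN
sample_sd = 1773.78   # PLN
confidence = 0.95

# Standard error of the mean.
standard_error = sample_sd / math.sqrt(n)

# Two-sided critical value of the standard normal distribution (about 1.96).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * standard_error
lower, upper = sample_mean - margin, sample_mean + margin

print(f"standard error: {standard_error:.1f}")
print(f"95% interval: {lower:.1f} to {upper:.1f}")
```
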
Standard deviation

· Point estimation

s = 1773.78 PLN

· Interval estimation, level of confidence = 0.95

P(1655.71 < σ < 1910.13) = 0.95

Therefore, with 95% confidence, the standard deviation of salaries in this company lies in the range from 1655.71 to 1910.13 PLN.

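
The analysis does not state how the interval for the standard deviation was obtained. As a rough cross-check, the sketch below uses a common large-sample normal approximation, s / (1 ± z / sqrt(2n)). This is an assumed method, not necessarily the one used above, so it matches the reported bounds only approximately; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 378              # number of respondents
sample_sd = 1773.78  # PLN
confidence = 0.95

# Two-sided critical value of the standard normal distribution (about 1.96).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Large-sample approximation: s / (1 + z / sqrt(2n)) < sigma < s / (1 - z / sqrt(2n)).
relative_margin = z / math.sqrt(2 * n)
sd_lower = sample_sd / (1 + relative_margin)
sd_upper = sample_sd / (1 - relative_margin)

print(f"approximate 95% interval for sigma: {sd_lower:.2f} to {sd_upper:.2f}")
```

The approximation lands within a fraction of a PLN of the reported bounds of 1655.71 and 1910.13.
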
Hypothesis

Let’s check whether women and men earn the same amount in this company. We test H0: µ1 = µ2 against H1: µ1 ≠ µ2, where:

µ1: mean salary of women
µ2: mean salary of men

                          Woman salaries   Man salaries
Mean                      3712.68          3588.96
Standard deviation        1832.00          1741.70
Number of observations    138              240

Z                         0.644
p-value                   0.520

Therefore, at the significance level of 0.05, there are no grounds to claim that men and women earn different amounts on average.
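
The Z statistic and p-value can be reproduced from the summary figures for the two groups. The sketch below assumes a two-sided, two-sample z-test with unpooled variances, which is consistent with the reported values; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics as reported for each group.
mean_w, sd_w, n_w = 3712.68, 1832.00, 138  # women
mean_m, sd_m, n_m = 3588.96, 1741.70, 240  # men

# Standard error of the difference between two independent sample means.
se_diff = math.sqrt(sd_w ** 2 / n_w + sd_m ** 2 / n_m)

z = (mean_w - mean_m) / se_diff

# Two-sided p-value under the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
reject_h0 = p_value < alpha

print(f"Z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Because the p-value is far above 0.05, the null hypothesis of equal mean salaries is not rejected.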