hackerwhale/Employee-Attrition-Analysis-using-R

Employee Attrition Analysis using R

Objective

The main objective of this project was to analyze employee attrition patterns within an organization using data analytics techniques in R. The goal was to understand employee behavior, identify key factors influencing retention and termination, and extract meaningful insights that can support HR decision-making and business planning.


Project Overview

This project is based on a single dataset: employee_attrition.csv, which contains employee-related information such as demographics, job roles, departments, store locations, length of service, employment status, and termination details.

Using this dataset, I performed a complete data analysis workflow including data cleaning, preprocessing, exploratory data analysis (EDA), and visualization using R programming.


What I Did

  • Imported the employee attrition dataset into R for analysis
  • Cleaned and structured the dataset by removing unnecessary columns
  • Renamed variables for better readability and consistency
  • Converted raw data types into appropriate formats (factors and dates)
  • Explored the dataset to understand structure, size, and variables
  • Performed statistical summarization to identify trends and patterns
  • Created multiple visualizations to analyze relationships between variables

How I Did It (Methodology)

The analysis was conducted in R using the tidyverse, chiefly the dplyr, ggplot2, and readr packages.

1. Data Import

The dataset was loaded using read_csv() and verified using View() and str() functions.
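
As a minimal sketch of this import step: the rows and column names below are hypothetical stand-ins written to a temporary file, not the real schema of employee_attrition.csv, so the example can run without the dataset.

```r
library(readr)

# Hypothetical sample rows; the column names are assumptions for illustration.
sample_path <- tempfile(fileext = ".csv")
writeLines(c(
  "EmployeeID,department_name,length_of_service,STATUS",
  "1001,Meats,12,ACTIVE",
  "1002,Produce,3,TERMINATED",
  "1003,Executive,25,ACTIVE"
), sample_path)

# read_csv() guesses a type for each column and returns a tibble.
employees <- read_csv(sample_path, show_col_types = FALSE)

# str() prints each column with its type and first values.
str(employees)

# View() opens a spreadsheet-style viewer, so it is only called interactively.
if (interactive()) View(employees)
```

With the real data, the same call would presumably be read_csv("employee_attrition.csv"), with the path adjusted to wherever the file is stored.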

2. Data Cleaning

  • Removed irrelevant columns that were not needed for analysis
  • Renamed key variables for clarity
  • Ensured dataset consistency for analysis
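
The cleaning steps above might look roughly like the following. Which columns were dropped and how variables were renamed is not stated here, so every column name in this sketch is an assumption.

```r
library(dplyr)

# Hypothetical raw data; the real column names may differ.
employees <- tibble(
  EmployeeID = c(1001, 1002, 1003),
  gender_short = c("F", "M", "F"),
  gender_full = c("Female", "Male", "Female"),
  department_name = c("Meats", "Produce", "Executive"),
  STATUS = c("ACTIVE", "TERMINATED", "ACTIVE")
)

cleaned <- employees %>%
  # Drop a column that duplicates information held elsewhere.
  select(-gender_short) %>%
  # rename() takes new_name = old_name pairs and keeps column order.
  rename(
    employee_id = EmployeeID,
    gender = gender_full,
    department = department_name,
    status = STATUS
  )

names(cleaned)
```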

3. Data Preprocessing

  • Converted categorical variables into factors
  • Standardized date formats for consistency
  • Prepared data for grouping and visualization
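
One way to express these conversions with dplyr is sketched below. The column names and the month/day/year date format are assumptions made for illustration, so the format string should be checked against the actual file.

```r
library(dplyr)

# Hypothetical data with text columns that still need conversion.
employees <- tibble(
  department = c("Meats", "Produce", "Meats"),
  status = c("ACTIVE", "TERMINATED", "ACTIVE"),
  hire_date = c("8/28/1989", "3/1/2010", "11/15/2001")
)

prepared <- employees %>%
  mutate(
    # Factors give grouping and plotting functions a fixed set of levels.
    across(c(department, status), as.factor),
    # Parse month/day/year strings into Date values.
    hire_date = as.Date(hire_date, format = "%m/%d/%Y")
  )

str(prepared)
```

Storing dates as Date values rather than text lets later steps sort by date or extract the year without string manipulation.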

4. Exploratory Data Analysis (EDA)

  • Used summary(), nrow(), ncol(), and names() to understand dataset structure
  • Identified patterns in employee distribution, tenure, and termination behavior
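
A small sketch of this exploration step follows. The grouped summary is one plausible way to surface tenure and termination patterns, not necessarily the one used in the project, and the data and column names are hypothetical.

```r
library(dplyr)

# Hypothetical data; values are illustrative only.
employees <- tibble(
  department = c("Meats", "Meats", "Produce", "Produce", "Executive"),
  status = c("ACTIVE", "TERMINATED", "TERMINATED", "ACTIVE", "ACTIVE"),
  length_of_service = c(12, 4, 3, 7, 25)
)

# Basic structure checks: row count, column count, column names.
nrow(employees)
ncol(employees)
names(employees)
summary(employees)

# Average tenure and share of terminated records per department.
tenure_by_department <- employees %>%
  group_by(department) %>%
  summarise(
    headcount = n(),
    mean_service = mean(length_of_service),
    terminated_share = mean(status == "TERMINATED"),
    .groups = "drop"
  ) %>%
  arrange(desc(mean_service))

tenure_by_department
```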

5. Data Visualization

Created visual insights using ggplot2, including:

  • Heatmaps for average length of service
  • Boxplots for tenure distribution across departments and age groups
  • Bar charts for employee status by city and year
  • Scatter plots for the relationship between service length and termination type
  • Stacked bar charts for department-wise attrition patterns
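
Two of the chart types listed above can be sketched in ggplot2 as follows: a boxplot of tenure by department, and a heatmap of average length of service built with geom_tile(). The data, column names, and age bands are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Hypothetical data; values are illustrative only.
employees <- tibble(
  department = rep(c("Meats", "Produce", "Executive"), each = 4),
  age_group = rep(c("20-39", "40-59"), times = 6),
  length_of_service = c(2, 10, 4, 14, 1, 8, 3, 12, 9, 22, 11, 25)
)

# Boxplot: distribution of tenure within each department.
tenure_boxplot <- ggplot(employees, aes(x = department, y = length_of_service)) +
  geom_boxplot() +
  labs(x = "Department", y = "Length of service")

# Heatmap: average tenure for each department and age group pair.
avg_service <- employees %>%
  group_by(department, age_group) %>%
  summarise(mean_service = mean(length_of_service), .groups = "drop")

tenure_heatmap <- ggplot(avg_service, aes(x = age_group, y = department, fill = mean_service)) +
  geom_tile() +
  labs(x = "Age group", y = "Department", fill = "Mean service")

# Printing a plot object draws it, for example print(tenure_heatmap).
```

Aggregating before plotting keeps the heatmap to one tile per department and age group pair, which is what geom_tile() expects.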

Thought Process

My approach was to first understand the dataset structure before performing any analysis. I focused on cleaning and transforming the data into a usable format because accurate analysis depends heavily on data quality.

After preprocessing, I explored the data to identify meaningful relationships between employee attributes such as age, department, job role, and termination type. The goal was not just to create charts, but to understand underlying workforce patterns.

Finally, I visualized the data in a way that makes complex relationships easier to interpret, especially for HR and business decision-making purposes.


Key Insights

  • Employee tenure varies significantly across departments and job roles
  • Senior and executive roles generally show higher retention
  • Sales and operational roles show higher turnover rates
  • Retirement is strongly associated with higher age groups
  • Certain departments experience more terminations than others, indicating retention challenges

Tools & Technologies

  • R Programming
  • dplyr
  • ggplot2
  • readr
  • tidyverse (the collection that bundles dplyr, ggplot2, and readr)

Project Summary

This project demonstrates a complete data analytics workflow using R on an HR dataset. It includes data cleaning, transformation, exploratory analysis, and visualization to extract actionable insights about employee attrition and workforce behavior.

The analysis helps identify patterns in employee retention and termination, which can be used to improve HR strategies and organizational decision-making.

About

This project demonstrates an end-to-end Human Resources (HR) analytics workflow using R. The objective was to analyze workforce data and uncover insights related to employee retention, attrition, tenure, organizational structure, and termination patterns. The dataset contains over 49,000 employee records.
