ChatGPT for HR Analytics - Employee Turnover

Tatenda Emma Matika

Jan 31, 2023 • 8 min read

Image from https://www.aihr.com/blog/high-turnover-meaning-rates/

ChatGPT, developed by OpenAI, is a cutting-edge language model that uses advanced deep-learning algorithms to generate human-like responses to various questions and prompts. With the ability to understand and process large amounts of data, ChatGPT is now being utilized to analyze complex business problems and provide valuable insights. One area where ChatGPT can provide valuable insights is in HR Analytics.

Employee turnover is a key metric in HR Analytics which refers to the rate at which employees leave an organization and are replaced by new hires. High turnover can be an indicator of low job satisfaction, poor company culture, low compensation, or other factors that are causing employees to leave.

Employee turnover is a critical issue that can impact an organization's productivity, morale, and bottom line. To effectively address turnover, organizations must have a clear understanding of its causes and patterns. In this article, we explore how organizations can use ChatGPT to analyze employee turnover data and gain insights into its drivers.

Method

The first step in using ChatGPT for employee turnover analysis is to gather relevant data. This may include information on the number of employees who have left the organization, the reasons for their departure, and other data points such as their job function, length of service, and demographic information. This data can then be preprocessed to ensure that it is in a format that can be easily processed by the ChatGPT API.

Once the data has been preprocessed, the next step is to use the OpenAI API to send text inputs to ChatGPT and receive corresponding text outputs. We will explore 3 ways to do this:

Using the ChatGPT web interface
Generating code from the ChatGPT interface and running it ourselves
Setting up an API endpoint and writing code to interact with the ChatGPT model

For options 1 and 2, the ChatGPT web interface is all that is needed. For option 3, organizations will need to obtain an API key from OpenAI and set up the API endpoint to interact with the ChatGPT model.

Once the API endpoint is set up, HTTP requests can be sent to the API to send text inputs and receive text outputs. For example, a request for a text input like "What are the most common reasons for employee turnover?" could be sent to the API, and a text output like "The most common reasons for employee turnover are poor work environment, low pay, and lack of career advancement opportunities" can be received. But remember, this is only possible if this information is in the dataset.

Getting data

The data used for this article was downloaded from an Employee Turnover project on Kaggle. The data is in CSV format. If the data is in a different format, it is best to convert the dataset into a text format such as a CSV file which can be viewed as a plain text file. This is because at the moment, ChatGPT works with text input only.

Example 1: Using the web interface

We can open the CSV file using a text editor of our choice. In this case, we can use Notepad.

In the actual dataset, the 1st column is named stag, the 2nd one is named event and the 7th column is named traffic. This is confusing for ChatGPT unless we tell it what the column names mean. To simplify this, we rename the columns to names which better represent what the column is about.

We copy the dataset and paste it into the ChatGPT chat interface. For this example, we copy only the first 100 rows.

Important: There is a limit to the size of the input one can enter. So, this method is only useful for small datasets but not recommended for bigger datasets. You can test this on your own to see how much of your dataset can be input.

We paste the dataset, including a prompt of what we want ChatGPT to do with the dataset.

We get the following result:

We can see that only 18 rows were taken in as input, which is only a small subset of our dataset. Nonetheless, for the purpose of demonstration, let us proceed as if the data was limited to 18 rows. The result we obtained is not enough for our analysis. We need more insights, so we add another prompt.

Yay! Now we have the results that we want. This is enough to write a short summary and present in a meeting at the office. We can then use the results of the analysis to brainstorm and come up with ways to understand for example why turnover is higher for the HR department and how we can address that.

The method in this example is not recommended, but it helps to come up with questions that could be of interest in our analysis. It also helps us to structure our report. We can see that our results have been grouped into 4 categories which is very useful.

One thing to note is that it is not a good idea to accept everything that ChatGPT says without verifying it. Always try to verify the output so that there are no mistakes that could cost your organization or your job.

Example 2: Generating code from the ChatGPT interface

From the 1st example, we can see that there is a limit on the size of input to the chat interface. This is not a problem. We can use the chat to give ChatGPT an idea of what the dataset looks like, the analysis we want to do and prompt it to write code in our preferred programming language, which in this case is Python.

We have to be careful about how we prompt or ask for the code. In this case, we pasted the dataset and prompted for Python code without adding the dataset description.

The following is the response given:

So in addition to the dataset, we add the dataset description. (Small tip: This is from the output from Example 1.)

And yay! We get some code to do our analysis. The following is the code:

import pandas as pd
import matplotlib.pyplot as plt

# load the dataset into a pandas dataframe
df = pd.read_csv("employee_data.csv")

# create a histogram to visualize the distribution of employee ages
plt.hist(df['age'], bins=10, color='purple')
plt.xlabel('Age')
plt.ylabel('Count')
plt.title('Age Distribution of Employees')
plt.show()

# create a bar chart to visualize the gender distribution of employees
gender_counts = df['gender'].value_counts()
gender_counts.plot(kind='bar', color='green')
plt.xlabel('Gender')
plt.ylabel('Count')
plt.title('Gender Distribution of Employees')
plt.show()

# create a bar chart to visualize the employee turnover rate
turnover_counts = df['turnover'].value_counts()
turnover_counts.plot(kind='bar', color='red')
plt.xlabel('Turnover')
plt.ylabel('Count')
plt.title('Employee Turnover Rate')
plt.show()

# create a bar chart to visualize the distribution of recruitment routes
recruitment_counts = df['recruitment_route'].value_counts()
recruitment_counts.plot(kind='bar', color='blue')
plt.xlabel('Recruitment Route')
plt.ylabel('Count')
plt.title('Distribution of Recruitment Routes')
plt.show()

# create a bar chart to visualize the distribution of coach assignments
coach_counts = df['coach'].value_counts()
coach_counts.plot(kind='bar', color='yellow')
plt.xlabel('Coach')
plt.ylabel('Count')
plt.title('Distribution of Coach Assignments')
plt.show()

# create a bar chart to visualize the distribution of head gender
head_gender_counts = df['head_gender'].value_counts()
head_gender_counts.plot(kind='bar', color='orange')
plt.xlabel('Head Gender')
plt.ylabel('Count')
plt.title('Distribution of Head Gender')
plt.show()

The code works and here are some of the resulting charts:

The code can always be improved to make better charts by adding more prompts. But for the initial analysis, it does a pretty good job.

Example 3: Setting up an API endpoint

In order to set up an API endpoint, we have to login to https://beta.openai.com/overview and go to account information, then view API keys. Once we have the API keys, we can use them in our code. ChatGPT can be prompted to generate code to perform analysis using the dataset and ChatGPT API.

Prompt: Can you generate Python code to input a given dataset and a question and use the ChatGPT API to do employee turnover analysis.

Result:

Note: You will need to replace "YOUR_API_KEY_HERE" with your actual OpenAI API key.

Of course this means that the dataset has to be input as text, just as in the first example, and there is a limit on the input size. The advantage of this is that we can customize it so that we present the input and output in the way we want. For example, we can use this to develop a web application that takes in text prompts and outputs the results of analysis that we want. We could even go further to allow input of a dataset in CSV format, then we extract the text and use it as input such that we are not limited to the same dataset. The application could also be used for other datasets, not just for employee turnover analysis.

OpenAI has a quickstart tutorial on how to build a web application in a few steps using Flask and OpenAI API for various tasks. You can also check out their examples page to see other tasks that can be done using their API.

Using the API is not entirely free. There is a pricing system and it is worth checking that out so that we use the options that work for us.

Conclusion

Of course ChatGPT does not provide the perfect analysis but it helps to make work easier. It helps to understand the data at hand and how to obtain value from it. Employee turnover analysis is a complex task that requires a comprehensive approach to solve. However, by using ChatGPT to process and analyze data on employee departures, organizations can gain valuable insights into the drivers of turnover and take steps to address its root causes. With its ability to process data and generate human-like text outputs, ChatGPT is a powerful tool for organizations looking to reduce turnover and improve employee satisfaction and engagement.

Did you enjoy the article and want to learn more? Here are some of the resources that I used:

ChatGPT. Available at https://chat.openai.com/chat/

Employee Turnover on Kaggle. Link -> https://www.kaggle.com/datasets/davinwijaya/employee-turnover

What does high turnover mean? Turnover rates, jobs, and causes. Available at https://www.aihr.com/blog/high-turnover-meaning-rates/