It’s very difficult to do people analytics without data. Finding and extracting workforce data is often the first and most common challenge that people analytics teams encounter. In this blog post, I’ll share tips I’ve learned about data extraction for HR teams, the common challenges involved, and best practices for overcoming them. By applying these tips, HR teams can extract data more effectively and efficiently to drive business value and insights.
What is Data Extraction?
Data extraction is the process of pulling data out of one or more sources so that it can be put into a usable format for further analysis or processing. It is the "E" in "ETL" (extract, transform, load). In the context of HR, data extraction is an essential process for collecting and organizing data related to the workforce, such as core HRIS records, employee demographics, performance data, and engagement data. By extracting this data, HR teams can more effectively analyse and utilise it to make informed decisions and drive business value. The sources involved may include databases, spreadsheets, and HR systems.
This is the first in a series we're writing on the people data platform.
Here are 5 Tips to Ensure HR Data Extraction Success
1. Prioritize and Align Extracted Data with the Needs of the Business
First and foremost, it is important for people analytics teams to prioritize what data they go after based on the needs and challenges of the business. If the business is experiencing high attrition, start with the HRIS data and build an analysis on termination trends. However, if the business is concerned about understanding remote work, the starting point for data extraction may need to be the survey system to get insights on employee voice back to leadership teams.
Delivering against critical business needs adds value to the company, builds trust, and creates the buy-in needed for future projects. There’s a time and a place to pursue novel data to generate insights that the business is not expecting, but without a foundation of trust and a history of delivering against core business concerns, that can be a difficult road. When you’re building your data extraction roadmap, start with the data where you can get to value quickly.
2. Be Thoughtful About What You Extract
Workforce data is inherently different from other company data because underneath each data point is a coworker with a livelihood, career, friends and family, and personal details. It is critical that people analytics teams be careful about what they extract and thoughtful about use cases for the data. It’s an ethical responsibility to make sure the data is private, secured, and safe in storage as well as in the extraction tools and pipelines that get the data into storage.
Beyond the ethical considerations, there are now hard legal requirements related to the extraction and storage of workforce data. Depending on the nature of the data and where you operate, you may be required to comply with CPRA (California), SOX, HIPAA, and GDPR, to name a few. Of note, GDPR’s reach is not limited to companies based in the EU: it depends on where the employer is established and where the individuals are located, not on their citizenship. So if you employ people in the EU, or are considering hiring there, GDPR requirements are critical when it comes to data extraction.
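One way to put "be thoughtful about what you extract" into practice is to filter records before they ever reach storage. The sketch below is only an illustration: the field names, the `ALLOWED_FIELDS` allowlist, and the `minimize_record` helper are all hypothetical, and a keyed hash is just one of several possible ways to pseudonymise an identifier. It is not a statement of what any particular regulation requires.

```python
import hashlib
import hmac

# Hypothetical allowlist: only the fields the current analysis actually needs.
ALLOWED_FIELDS = {"employee_id", "department", "hire_date", "termination_date"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash so records can still be joined."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def minimize_record(record: dict, secret_key: bytes) -> dict:
    """Drop every field that is not allowlisted and pseudonymize the employee ID."""
    kept = {key: value for key, value in record.items() if key in ALLOWED_FIELDS}
    if "employee_id" in kept:
        kept["employee_id"] = pseudonymize(str(kept["employee_id"]), secret_key)
    return kept


# Made-up example record; home_address is not needed, so it is never stored.
raw = {
    "employee_id": "E1",
    "department": "Sales",
    "home_address": "1 Main St",
    "hire_date": "2020-01-01",
}
safe = minimize_record(raw, secret_key=b"replace-with-a-managed-secret")
```

Because the same key always produces the same hash, records from different extracts can still be joined on the pseudonymised ID without the raw identifier leaving the source system.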
3. Build the Business Case to Pull More
It can be difficult to convince IT teams or central data engineering functions to support HR data extraction. So when you do get someone to assist, there can be a certain anxiety around the idea of “what if I need more”. This can cause a team to over-extract data or pull too much of it too soon.
The feeling is understandable. I’ve been there. But as I’ve said before, the people analytics flywheel is a phenomenon that can be realised if you focus on prioritized business problems. This gives you the chance to revisit the data extraction conversation down the road should you need more. Your future arguments for data extraction will be stronger if business needs continue to be the rationale for additional requests for data extraction support.
4. Automate Your Extractions
A native report is a report that comes pre-packaged with your HR system. While native reports are helpful for early data extraction wins, they can be difficult to scale and standardise. Native reports tend to have the following limitations:
- They usually contain just a subset of the data within the system and are typically pulled through a graphical user interface, which makes them rigid and difficult to repeat.
- They are prone to time out if you pull too much data or pull too frequently.
- They may end up looking different depending on which user pulled them due to filters, permission settings, and the effective date range for the data pulled. (HR never closes the books!)
Over time, you’ll need to move away from native reports and to an API or another method to extract the data from the system. An API gives you access to the full data set, lets you pull data more frequently, and introduces standardisation and repeatability by leveraging data extraction tools and relying less on GUIs. API-based extractions never get bored, can be logged and audited, and can run on their own. Automation changes repetitive and high-variance tasks into trusted processes.
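To make the idea of a logged, repeatable extraction concrete, here is a minimal sketch of a paginated pull. Everything in it is an assumption for illustration: `extract_all` and `fake_fetch_page` are hypothetical names, the offset-and-limit paging scheme is only one common pattern, and the stand-in fetch function returns made-up records rather than calling any real HR system. A real version would replace it with an authenticated request to whatever endpoint your system exposes.

```python
import logging
from typing import Callable, Iterator

logger = logging.getLogger("hr_extract")


def extract_all(
    fetch_page: Callable[[int, int], list], page_size: int = 100
) -> Iterator[dict]:
    """Yield every record, requesting pages until a short or empty page arrives."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        # Each request is logged, so the run can be audited afterwards.
        logger.info("fetched %d records at offset %d", len(page), offset)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size


def fake_fetch_page(offset: int, limit: int) -> list:
    """Stand-in for a real API call; serves 250 made-up worker records."""
    workers = [{"employee_id": f"E{i}"} for i in range(250)]
    return workers[offset:offset + limit]


records = list(extract_all(fake_fetch_page, page_size=100))
```

Passing the fetch function in as a parameter keeps the paging and logging logic the same regardless of which system sits behind it, which is what makes the process repeatable.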
5. Extract for Data Science, not just Reporting
Meaningful analysis requires more data, and often different data, than snapshot extraction methods like native reports can provide. Snapshot extraction can handle basics such as current headcount reporting, but it cannot reconstruct what the company looked like on a given day in the past. When you extract your HR data, make sure that you extract what you need for data science and not just for your reporting needs.
Data science applications require wider data sets and more features. The time component is the most important part of HR data science. An employee might touch 10 different HR systems while joining a company, so the data in each system needs to be joined to the same employee record in a harmonized and sequential order. Make sure that the data in each system is captured at the time of the action, with a timestamp. This creates a “transaction-level” record.
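As a rough illustration of why timestamped, transaction-level records matter, the sketch below merges events from two systems into one ordered timeline and replays it to answer "how many people were employed on a given day". The event layout, the system names, and the `build_timeline` and `headcount_as_of` helpers are all hypothetical, and the data is made up.

```python
from datetime import date

# Hypothetical transaction-level events pulled from two systems.
# Each tuple: (source_system, employee_id, effective_date, action)
hris_events = [
    ("hris", "E1", date(2022, 1, 10), "hire"),
    ("hris", "E2", date(2022, 3, 1), "hire"),
    ("hris", "E1", date(2022, 6, 30), "terminate"),
    ("hris", "E3", date(2022, 7, 15), "hire"),
]
talent_events = [
    ("talent", "E2", date(2022, 9, 1), "promote"),
]


def build_timeline(*event_lists):
    """Merge events from every system into one list ordered by effective date."""
    merged = [event for events in event_lists for event in events]
    return sorted(merged, key=lambda event: event[2])


def headcount_as_of(timeline, as_of):
    """Replay a date-ordered timeline and count who is employed at the end of as_of."""
    active = set()
    for _source, employee_id, effective_date, action in timeline:
        if effective_date > as_of:
            break  # the timeline is sorted, so later events cannot apply
        if action == "hire":
            active.add(employee_id)
        elif action == "terminate":
            active.discard(employee_id)
    return len(active)


timeline = build_timeline(hris_events, talent_events)
mid_year = headcount_as_of(timeline, date(2022, 7, 1))
```

A snapshot report only stores the final state, so it cannot answer the same question for an earlier date; the event history can be replayed to any day you choose.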
Without those transaction records, you can end up with messy data. Examples include data that shows someone being promoted before they were hired, or terminated before a transfer. HR is also notorious for back-dating work. Transaction-level records can prevent issues arising from those behaviors. Finally, make sure the extraction includes the fields your data science work actually needs.
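The kinds of sequencing problems described above can be surfaced with a simple check once events carry effective dates. This is a minimal sketch under stated assumptions: the event tuples, the action names, and the `find_sequence_issues` helper are hypothetical, and the rule used here (any non-hire event for someone who is not currently employed is suspicious) is a simplification that a real pipeline would likely need to refine.

```python
from datetime import date


def find_sequence_issues(events):
    """Flag non-hire events that occur while the employee is not employed.

    Events are (employee_id, effective_date, action) tuples and are replayed
    in effective-date order.
    """
    issues = []
    employed = {}
    for employee_id, effective_date, action in sorted(events, key=lambda e: e[1]):
        if action == "hire":
            employed[employee_id] = True
            continue
        if not employed.get(employee_id, False):
            issues.append((employee_id, effective_date, action))
        if action == "terminate":
            employed[employee_id] = False
    return issues


# Made-up sample: E1 is promoted before being hired, E2 is transferred after
# being terminated, and E3 has a clean history.
sample_events = [
    ("E1", date(2021, 5, 1), "promote"),
    ("E1", date(2021, 6, 1), "hire"),
    ("E2", date(2020, 1, 1), "hire"),
    ("E2", date(2021, 3, 1), "terminate"),
    ("E2", date(2021, 4, 1), "transfer"),
    ("E3", date(2020, 2, 1), "hire"),
    ("E3", date(2020, 9, 1), "promote"),
]
issues = find_sequence_issues(sample_events)
```

Flagged rows are candidates for review rather than automatic deletion, since a back-dated correction may be the legitimate record.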
Prioritise Data Extraction, But Be Aware of the Nuances
Are you ready to explore how to extract HR data at your company? Data extraction is an essential part of conducting people analytics. It is important for people analytics teams to prioritize their data extractions based on the needs and challenges of the business, be thoughtful about which data points are extracted, consider automating their data extractions, and be careful about the nuances of the data they extract.