Beginner’s Guide: Python for Analytics | Matplotlib


Beginner’s Guide to Using Python with HR Data | Exploration Series

Part One – Matplotlib

In this first tutorial series, I’m exploring the IBM HR Attrition and Performance data set. This is a great data set for demonstrating what’s possible with machine learning and other data science techniques.

In this next walkthrough, we’ll begin to ‘see’ our data through the use of visualization packages. In R there are 3 common plotting tools, and other packages extend these main items. In Python, there is Matplotlib, and most other packages build on this foundation. So, the decision of where to start with Python plotting is an easy one – let’s get going.

Series Outline

0: basic operations & summary statistics

1: matplotlib

2: pandas visualization

3: seaborn

4: plotly

5: series summary

1: matplotlib

 

view on github



5 Reasons Not to Use Excel for People Analytics


Chief Financial Officers are now demanding their teams stop using Excel. While your C-level executives may not be demanding this of you, there are very good reasons to consider alternatives. If Finance is ready to abandon Excel, HR should certainly make the jump. Seriously, have you ever seen what a Financial Analyst builds in Excel? It’s like the car accident that people just can’t stop staring at.

Here are my Top 5 reasons to use something other than Excel for your Data Analysis work.

 

1 Excel doesn’t do Big Data

Excel tops out at 1,048,576 rows. I believe that the majority of HR departments do not have Big Data… yet. To HR generally, ~1 million rows may feel like huge data, but it does not meet today’s definition of Big Data. In fact, that’s nowhere close.

Excel supports 16,384 columns in a worksheet. Personally, I’ve never seen any data nearly as wide as 16,000+ columns – and I never, ever want to.

I’m willing to wager a large sum that your HR data is not going to come in a wide format, but rather a long one. When your data is in a long format, even the HR data of a mid-sized organization will surpass the ~1 million row limit.

Headcount is a simple example. Let’s consider a few reasonable scenarios and see when we max out Excel.

  • Assume you have 40,000 active employees. If you have 25 years of history, you’ll have hit your limit.
  • Assume you have 10,000 employees, but you want to look at this on a monthly basis. You’ll only get 8 years’ worth of data in Excel.
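
The arithmetic behind these two scenarios can be sketched in a few lines of Python. This is a rough illustration only: it assumes one row per employee per snapshot and a single header row, and the function name is my own invention for the example.

```python
# Rough sketch: how many full years of headcount snapshots fit in one Excel sheet.
# Assumes one row per employee per snapshot and one header row (illustrative only).
EXCEL_MAX_ROWS = 1_048_576  # row limit of a modern Excel worksheet


def full_years_in_excel(employees, snapshots_per_year, header_rows=1):
    """Return the number of complete years of snapshots that fit in a worksheet."""
    rows_per_year = employees * snapshots_per_year
    return (EXCEL_MAX_ROWS - header_rows) // rows_per_year


# 40,000 employees with one snapshot per year
annual_years = full_years_in_excel(40_000, 1)
# 10,000 employees with one snapshot per month
monthly_years = full_years_in_excel(10_000, 12)

print(f"Annual snapshots, 40,000 employees: {annual_years} full years")
print(f"Monthly snapshots, 10,000 employees: {monthly_years} full years")
```

Under these assumptions the monthly case fits 8 full years and the annual case lands in the mid-twenties, in line with the rough figures above.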

Yes, you could of course pre-process some of the information. You could have your HCM aggregate and deliver the data. This is certainly reasonable, and even advisable in certain situations. But when you want to slice your information multiple ways – by gender, department, job level – each of those is a separate request for data. Most data analysis and visualization tools work best with granular data, from which you control the various aggregations yourself. I’ve never found a case where I wasn’t better off with more granular information. Oh, except for when using Excel…

2 I don’t like Excel graphing.

Honestly, I hate Excel graphs. This is my least favorite part of using the software. I feel like a data visualization failure when I try to make a decent graph. I can perform advanced table calculations in Tableau, build interactive Python and R visualizations, and write complex database queries; yet I can’t manage a decent bar graph in Excel. That’s only a slight exaggeration.

Granted, I’ve never put in the time to really master Excel graphing. But I’ve no motivation to. It’s complex, limited, and I’ve already found many better options. Why torture myself further? I’ve seen the light, and it’s glorious outside of Excel.

3 endless calculating


'Calculating 4 processors...'

Oh. my. gosh.

The amount of time I’ve suffered through Excel crunching data. Literally crunching data; leaving my work laptop sounding like it’s grinding something internally. And all I did was add a formula and apply it to the colu… *computer promptly stops responding*.

That’s all it takes to lose your Tuesday afternoon to a seemingly endless cycle of calculations. There are websites and blogs dedicated to speeding it up. I say it’s faster to not use Excel at all.

4 repeatability

It’s nearly time for the big presentation… just one final tweak … and, No!, No, no, no; nooooooooo! Yes, Excel has crashed again. You’re left scrambling to recover your workbook.

Sheets get deleted. Formulas are altered. New data is added.

Furthermore, for those among us who love to build reports and dashboards in Excel – just watch when their manager asks for the most minor of cosmetic layout alterations. Their face says it all: “You just added 8 hours of unmerging, moving, and resizing 4,000 cells because of your request.”

5 accuracy

  • $6 billion. That’s the amount of money JPMorgan Chase lost in 2012, in large part due to Excel errors.
  • 88%. That’s the share of spreadsheets found to contain human errors. Nearly 9 out of 10.

Those numbers likely speak for themselves. Excel has a feature called ‘paste as values’. I use it when I want to avoid the dreaded ‘Calculating…’ The downside – there’s absolutely zero evidence of the work. You could record macros, but good luck making quick changes to a macro. If you can do that, I imagine that you’re already writing code elsewhere as well.

 

Alternatives

There are countless alternatives. Your choice depends on what you aim to accomplish, what you may already know, and what you can afford.

Open-Source Languages:

Other options:

Each of these has its pros and cons. Open-source languages have endless possibilities, but you’ve got to learn to code. Tools such as Tableau and QlikView can cost thousands per license.

Results matter most

I’ll be honest: you can’t, and probably shouldn’t, avoid Excel entirely. There’s a right tool for every job. There are jobs that Excel is great for, maybe even perfect for.

I hope you’ll check out some of these, keeping an open and curious mind. Check out some of my Tutorials too; I hope to convince you through examples more than my words.

There’s also this: the best tool is the one you use.

photo credits
stop: Photo by Bethany Legg on Unsplash

midnight clock: Photo by Loic Djim on Unsplash

dog: Photo by Matthew Henry on Unsplash

Beginner’s Guide: Python for Analytics | The Basics


 

Beginner’s Guide to Using Python with HR Data | Exploration Series

Part Zero – The Basics

In this first tutorial series, I’m exploring the IBM HR Attrition and Performance data set. This is a great data set for demonstrating what’s possible with machine learning and other data science techniques.

I’ll be back with tutorial posts that walk through how to apply more advanced techniques to generate predictive and prescriptive insights from the data. But that’d be jumping ahead. First, the basics. Exploratory Data Analysis, or EDA.

It’s often tempting to jump right in and try to find the most advanced insight possible. When I’m in the process of learning something new, it’s my first instinct to begin applying it straight away, skipping the basics. Eventually, I’ll stumble; and it’s always something I could have avoided by simply spending a little bit of time really understanding the data I have.

To properly analyze data, you must understand it. Is it complete (or are values missing)? Are there errors (values outside normal bounds that need checking)? And generally, what information is contained within the data? Depending on where the request is coming from in a work context, you may not control the data, so what you get is what you have. It’s often much easier when you’ve pulled your own data; it’s just not always possible, or even smart, to do so.
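
As a minimal sketch of those first two questions, the snippet below counts missing values and flags values outside expected bounds using only the standard library. The records, column names, and bounds are made up for illustration and are not taken from the IBM data set; in practice a data analysis package would do this in fewer lines.

```python
# Minimal sketch of two first-pass data checks: missing values and out-of-range values.
# The records, column names, and bounds below are made up for illustration;
# they are not taken from the IBM data set.
records = [
    {"EmployeeID": 1, "Age": 34, "MonthlyIncome": 5200},
    {"EmployeeID": 2, "Age": None, "MonthlyIncome": 4100},   # missing age
    {"EmployeeID": 3, "Age": 212, "MonthlyIncome": 6100},    # likely a typo
    {"EmployeeID": 4, "Age": 45, "MonthlyIncome": None},     # missing income
]

# Plausible bounds per column (assumptions; set these from what you know of the business)
bounds = {"Age": (16, 100), "MonthlyIncome": (0, 100_000)}


def count_missing(rows):
    """Count missing (None) values per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


def find_out_of_bounds(rows, limits):
    """Return (EmployeeID, column, value) for each value outside its expected range."""
    flagged = []
    for row in rows:
        for column, (low, high) in limits.items():
            value = row[column]
            if value is not None and not (low <= value <= high):
                flagged.append((row["EmployeeID"], column, value))
    return flagged


missing = count_missing(records)
outliers = find_out_of_bounds(records, bounds)
print("Missing values per column:", missing)
print("Values outside expected bounds:", outliers)
```

A flagged value is a prompt to ask a question, not proof of an error; whether an out-of-range value is wrong depends on the business context.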

Always begin with an exploration of your data. In this tutorial, I’m digging out my current favorite tool – Python. If you’ve never programmed, if Excel still frightens you a bit, or you’re firmly in the R camp – read on; this series will show the possibilities while exploring 5 different packages and interpreting and understanding data.

Series Outline

0: basic operations & summary statistics

1: matplotlib

2: pandas visualization

3: seaborn

4: plotly

5: series summary

0: basic operations & generating summary statistics

 

view on github

 

Cost of Employee Turnover Calculator


The Real Cost of Employee Turnover

Years ago, when I first started reporting on HR metrics, the first item was naturally employee turnover. Curious about the real impact of turnover, I began searching for figures. What I found certainly surprised me.

More surprising than the numbers was what happened when I casually shared them with some colleagues. If I gave a range of ‘between 10 and 250 percent of annual salary’, I quickly learned the takeaway was ‘250%’. They took the high end of the range and applied it to every position, from the CEO down to the shop worker, and 250% of annual salary became the figure folks in certain departments would trot out when it served their needs.

Quickly, I learned I had to be much clearer in the cost projection. What I never found back then was a simple, flexible tool for calculating the cost of turnover. You could of course build your own, but why bother if someone has already done the work!
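
To see why the blanket 250% figure misleads, here is a small sketch that compares it with a per-role estimate. Only the 10 to 250 percent range comes from the figures above; the roles, salaries, exit counts, and per-role percentages are hypothetical numbers chosen for illustration, and this is not how the calculator itself works.

```python
# Sketch: blanket "250% of salary" vs. a per-role turnover cost estimate.
# All roles, salaries, exit counts, and per-role percentages are hypothetical.
exits = [
    # (role, annual salary, number of exits, replacement cost as % of salary)
    ("Shop worker", 30_000, 40, 10),
    ("Analyst", 60_000, 10, 50),
    ("Executive", 250_000, 1, 250),
]

BLANKET_PCT = 250  # the high end of the range applied to everyone


def turnover_cost(rows, pct=None):
    """Total replacement cost; uses each row's own percentage unless a blanket one is given."""
    total = 0
    for _role, salary, count, role_pct in rows:
        total += salary * count * (role_pct if pct is None else pct)
    return total / 100  # convert from percent


per_role_total = turnover_cost(exits)
blanket_total = turnover_cost(exits, pct=BLANKET_PCT)

print(f"Per-role estimate: ${per_role_total:,.0f}")
print(f"Blanket 250% estimate: ${blanket_total:,.0f}")
```

With these made-up numbers, the blanket figure comes out at nearly five times the per-role estimate, which is the kind of gap that makes a clear cost projection worth the effort.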

Once you find your number – check out Why Use People Analytics for beginning thoughts on how to apply People Analytics, and make a real impact reducing cost – by reducing employee turnover.

 

Why Use HR People Analytics?


The Benefits of People Analytics

People spend the majority of their time, and life, working. People Analytics provides the opportunity to make work better. Laszlo Bock, former SVP of People Operations at Google, makes this point time and again in his book Work Rules! It is estimated that in the average lifetime, each person will spend 90,000 hours at work. Those of us serving and working in HR capacities have the awesome responsibility of making the most of our ‘human resources’ – people. Doing this well serves the individual and the organization. These are not, and do not need to be, mutually exclusive.

How HR Can Use Data

Data has always been used in HR to make decisions. Has your organization hired someone without interviewing them? What about reviewing their resume, references, and background checks? Not likely.

Using data to inform and guide decisions is the purpose of People Analytics. HR professionals are using information more than they give themselves credit for, more than they are comfortable with. Traditionally, HR prefers intuition to evidence.

CEOs no longer want an HR department sitting around waiting to enforce policy. As business becomes ever more competitive, HR must contribute meaningfully to the bottom line. To do so, it has to bring data to its decision-making.

Can your organization ask and answer the following:

  • How effective was the last training program?
  • Which employees deliver the greatest revenue?
  • Who is going to leave your company next?
  • Who are the important connections in your organization?

You may have heard that humans only use 10% of their brains (spoiler-alert: not likely true), but think about your HR data – how much of this information is being utilized?

The Growth of People Analytics

What if…

    • you could know the Top Talent associates in your organization who are most likely to leave in the next 12 months?
    • you could boost productivity of your entire workforce, increase engagement, lower healthcare costs, improve the ability of managers to lead effectively…

Google Trends data for “People Analytics” and “HR Analytics”

Until now, it has been fine to let the Googles of the world run their advanced analytics functions with massive budgets and multiple PhDs. You may have a year or two to continue to bury your head in the sand and hope this “fad” passes. Or, you could not do that.

People Are Expensive

“Our people are our most valuable asset.” – every leader ever.

Human Resources is about the effective deployment of resources. It only makes sense to use data to help determine the optimal ways to use an organization’s resources.

Employee Turnover

The cost of turnover is unbelievable. Large organizations lose millions and millions of dollars annually as a direct result of employee turnover.

The numbers speak for themselves. I’ve created a calculator you can use to see the bottom-line impact you can make from your seat in HR.

Imagine taking that number to your manager, along with a plan for how to reduce that expense.

For more advanced and granular control, use our Advanced Turnover Cost Calculator. Both are based on the calculation set forth by SHRM.

Moneyball for Human Resources

The C-Suite is most interested in activities and investments that drive the bottom-line. The promise of People Analytics is what CEOs and CFOs have been hoping HR can deliver for their organizations. HR can elevate above compliance and policy management.

People Analytics is not a panacea. It is data-driven decision making. Leaders of every business make informed decisions. HR can now make better-informed decisions and drive bottom-line revenue.

 

 

Photo by Clark Tibbs on Unsplash

What is HR People Analytics?


Getting Started is Hard

You’ve heard of People Analytics and you’ve an idea, generally, of what it is. So now you’re out there searching for more about it, how to do it, and looking for ideas of where to start. I had the same troubles. Coming from a medium-sized company with a rapidly maturing reporting function, but no advanced analytics function at all, I had plenty of energy but was short on know-how. This serves to document my own learning while aiming to help – and even learn from – others.

Defining People Analytics

Spend enough time looking and you’ll find as many definitions and approaches as sites you’ve visited. My experiences have led me to believe that how you apply data to your people decisions is People Analytics.

Over time, your skills and your function will advance. As the months pass, you’ll evolve your definition. For now, use what feels right to you. It’ll change; be good with that. Here’s one I like to use:

People are hard. Data can help. Data can make things better for people. Let’s make things better.

 

People Analytics Ain’t New

If we weren’t talking about something in the domain of HR I’d have to exclaim that we’ve been duped with a fresh coat of paint and a new name straight out of Silicon Valley. But really, this has been going on for a long time, just out of the spotlight that’s now been shone this direction.

I love history. Here are some fascinating examples proving People Analytics is something that’s been used for decades, with some new packaging and much fancier tools.

Frederick Taylor

In The Principles of Scientific Management, Taylor writes about his study and formation of what he referred to as “Scientific Management”, largely developed during his time at the Bethlehem Steel Company. Taylor devised scientific methods for selection, training, and development of workers. His analysis concluded that when the right workers were selected, trained, and paid a wage premium of up to 60% over the local market average, they would “become more sober, and work more steadily.” Further, he found that when paid more than a 60% premium, the workers’ output became “irregular”. Source: The Principles of Scientific Management

Walter Dill Scott

During World War I, selection and placement of servicemen was critical to the success of the war effort, yet it proved difficult and inefficient. Scott devised an evaluation to rate the potential success of each service member. When this evaluation was rejected, Scott took up a challenge to run the test against already successful officers. When his results matched what commanding officers already knew, the tests were quickly implemented throughout the Army. These tests were modified from the tests he wrote about in Increasing Human Efficiency in Business, where he applied them first to job applicants. His study extended to assessments to evaluate candidates for promotion and to match skill sets with unique positions that required them. He was awarded the Distinguished Service Medal for his contributions.

Google

Google rightfully receives and deserves much of the credit for the modern movement to bring data into HR. They just weren’t first.

Way back in 2007 Google was beginning to apply what their business centered around, data, to their people. One of the first projects was to figure out why new mothers left at nearly twice the average rate. Since then, Google has applied People Analytics to problems including diversity, hiring, leadership, workplace design, and retention.

Reading Work Rules! by Laszlo Bock was, for me, like reading Dan Brown’s The Da Vinci Code – a guilty pleasure (I finally admit publicly) that I just couldn’t put down. I was hooked, inspired, and eager for the weekend to hurry up and end so that I could get to work and get started. (Hey, boss!)

Present Day

Where Google was in 2007 is still where some companies would like to be 10 years later. Google was a pioneer in the modern age; they’ve shone a light on what’s possible, and they aren’t shy in sharing. To this day, I continue to re-read some of what Google has done. That team has been amazing, and true to their mission, they have made access to this information universal; truly a first considering this is HR.

It’s amazing to think of how the scale of data has grown from where Frederick Taylor started. Think of the data that organizations are sitting on. Mountains! It’s likely messy, and it may not be all the data you’ll ever need (it’s not – you’ll get there), but it’s more than you can handle right now.

Your organization’s HRIS, ATS, LMS and every other acronym/system have years or decades of information. Performance reviews – check. 9 boxes or whatever you’re calling that – check. On and on. You’ve got many datas.

Start including social networks, email, chat, personal trackers, phones… As you start to think of how to rein in, and ultimately process, all of this, you’ll feel like I do about it: a bit unsure, but damn excited to try.

Data-driven decisions

Ask yourself:

  • Would you hire someone without interviewing them?
  • Would you make them an offer without bench-marking compensation for their market and experience?
  • Would a Marketing department blindly spend ad dollars without first determining who/where their customer is?

Not likely. Honestly, I hope not. Just don’t do that.

There are reasons HR hasn’t implemented a data-driven approach. Notice I didn’t say good reasons, mostly just reasons (alright – there are a few good reasons, I’ll come to this). 

To start, these aren’t skills you find in an HR department. HR professionals are very talented, and necessary – I want to be clear about that. But they rely on soft skills. Data is viewed as hard numbers. In school I always preferred math to English. With math, I knew whether I got the answer; I was right or I wasn’t. English just left me confused. Yet, professionally I found my way into HR. Fortunately, math is coming to my rescue.

For the “old guard” in HR departments, data-driven decision-making is scary. There’s a fear among some that the machines are coming to take their jobs, and that the algorithm makes the final decision. While I generally leave the definition of People Analytics to you, my recommendation is that it is more than an algorithm applied to people. Algorithms will discriminate, they are biased, and they can be wrong. Well, the math can be wrong, but algorithms don’t get culture. Maybe they will and maybe we’ll be living on Mars. Anything is possible; maybe just not likely.

The truly cool part about what analytics provides is an evidence-based approach to the intuition that HR has historically provided. They can and do work well together. It’s a marriage. Like any marriage it’ll take some effort. You’ll have to find how to make it work. You’ll have to listen and grow together. Good things are worth the effort. Read Why Use People Analytics for more.

“Most of the world will make decisions by either guessing or using their gut. They will be either lucky or wrong.” – Suhail Doshi

People Analytics is a Journey

There are many approaches to People Analytics, and varied potential applications. There is no standardized function, no road map, and no “one size fits all” approach. Getting started can often be the most difficult part of the journey; it’ll feel like you’re pushing uphill for the first bit. Akin to the launch of the space shuttle, an incredible amount of thrust can be required to put in place an analytics-based approach within HR. This is becoming less the norm, but still an honest assessment of what may lie ahead.

It’s a journey. Every journey begins with the first step. Do not worry about what others say or think. Some will say reporting isn’t analytics. I somewhat agree, but I much more disagree. If it’s the first time you’re seeing some information, and that information provides solid evidence to support a decision, who cares if it took a PhD or Excel? If your people are better off, you’ve delivered.

Your first effort in People Analytics may be cobbling together headcount and attrition across your organization, broken down by department or levels. The more advanced and mature practitioners are well beyond this, but everyone began with their first step, the first project that led to their first insight. It may be as straightforward as showing at what point in the job-level hierarchy the greatest percentage of associates exit. Is it your managers, or your 4th-year associates who haven’t been promoted? That’s insight. The journey has begun.
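
A first project like that can be very small. Here is a minimal sketch that computes the exit rate at each job level from a granular employee list; the job levels and records are invented for illustration, and a real analysis would pull them from your HRIS.

```python
# Sketch: exit rate by job level from a granular employee list.
# The job levels and records are invented for illustration.
from collections import Counter

employees = [
    # (job level, left the company?)
    ("Associate", True), ("Associate", False), ("Associate", False), ("Associate", True),
    ("Manager", False), ("Manager", True), ("Manager", False), ("Manager", False),
    ("Director", False), ("Director", False),
]

headcount = Counter(level for level, _ in employees)
exits = Counter(level for level, left in employees if left)

# Share of each level's headcount that exited (Counter returns 0 for levels with no exits)
exit_rate = {level: exits[level] / headcount[level] for level in headcount}
highest = max(exit_rate, key=exit_rate.get)

for level, rate in exit_rate.items():
    print(f"{level}: {rate:.0%} exited")
print("Highest exit rate:", highest)
```

Comparing rates rather than raw exit counts matters here, because the largest job level will usually have the most exits simply by being the largest.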

Now is the Time

People Analytics – the name might be new, but the concept is not. You could argue the stakes have never been higher – “the war for talent” is a constant focus and challenge for HR. Despite the stakes being higher than ever, the barrier to entry is far lower than it was when Google booted up their People Operations team over 10 years ago. The tools, techniques, and support for People Analytics have grown tremendously since that time. Whether you’re just getting started, or continuing to mature in your People Analytics journey, the time has never been better. The spotlight in HR is on you.

Oh, and it’s a helluvalot of fun… 

 

 

Photo by Samuel Zeller on Unsplash