Series Archives | Data + People

Beginner’s Guide: Python for Analytics | Seaborn

Apr 19, 2018 | Data Visualization, People Analytics, Tutorials

Beginner’s Guide to Using Python with HR Data | Exploration Series

Part Three – Seaborn

In this first tutorial series, I’m exploring the IBM HR Attrition and Performance data set. It is a great data set for demonstrating what machine learning and other data science techniques can do.

Now we’ll move on to using Seaborn for our visualizations. The benefit of Seaborn is that it further abstracts the complex, underlying calls needed to visualize your data, so you can focus on your analysis task instead of on how to implement each plot. It also provides built-in functionality, such as statistical plots grouped by category, that would be very complex to implement without it.
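As a rough illustration of that abstraction, here is a minimal sketch of a grouped box plot in a single Seaborn call. The rows are made up for illustration and are not taken from the IBM data set; the column names `Attrition` and `MonthlyIncome` are assumed to match the ones used in the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs without a display
import pandas as pd
import seaborn as sns

# Made-up rows for illustration only; column names are assumptions.
df = pd.DataFrame({
    "Attrition": ["Yes", "No", "Yes", "No", "No", "No"],
    "MonthlyIncome": [5993, 5130, 2090, 2909, 3468, 3068],
})

# One call groups the rows by Attrition, computes the box statistics,
# and draws one box per group.
ax = sns.boxplot(data=df, x="Attrition", y="MonthlyIncome")
ax.set_title("Monthly income by attrition (illustrative data)")
```

Doing the same with plain Matplotlib would mean splitting the data by group and passing each group to the plotting call yourself, which is the kind of work Seaborn handles for you.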

Series Outline

0: basic operations & summary statistics

1: matplotlib

2: pandas visualization

3: seaborn

4: plotly

5: series summary

3: Seaborn

view on github

ibm_hr_data_w_python_3.ipynb


Credits
Photo by Randall Ruiz on Unsplash

Beginner’s Guide: Python for Analytics | Pandas

Apr 9, 2018 | Data Visualization, People Analytics, Tutorials

Beginner’s Guide to Using Python with HR Data | Exploration Series

Part Two – Pandas

In this first tutorial series, I’m exploring the IBM HR Attrition and Performance data set. It is a great data set for demonstrating what machine learning and other data science techniques can do.

Next, we’ll take a look at the power of Pandas to plot our data. As a budding data [analyst/scientist/enthusiast], I’ve found that Pandas has become my most common import and tool. Plotting directly from Pandas objects makes it very easy to stay in the flow of analyzing data. Let’s get going.
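To give a feel for plotting directly from a Pandas object, here is a minimal sketch. The rows are made up for illustration, and the `Department` column name is an assumption about the data set rather than something taken from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs without a display
import pandas as pd

# Made-up rows for illustration only; the column name is an assumption.
df = pd.DataFrame({
    "Department": [
        "Sales", "Research & Development", "Sales",
        "Human Resources", "Research & Development", "Research & Development",
    ],
})

# Count employees per department, then plot straight from the resulting Series.
dept_counts = df["Department"].value_counts()
ax = dept_counts.plot(kind="bar", title="Employees per department (illustrative data)")
ax.set_ylabel("Employees")
```

The `.plot` call returns a Matplotlib axes object, so anything Pandas does not expose directly can still be adjusted afterwards.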

2: Pandas

view on github

ibm_hr_data_w_python_2.ipynb


Beginner’s Guide: Python for Analytics | Matplotlib

Apr 7, 2018 | Data Visualization, People Analytics, Tutorials

Beginner’s Guide to Using Python with HR Data | Exploration Series

Part One – Matplotlib

In this first tutorial series, I’m exploring the IBM HR Attrition and Performance data set. It is a great data set for demonstrating what machine learning and other data science techniques can do.

In this next walkthrough, we’ll begin to ‘see’ our data through the use of visualization packages. In R there are three common plotting tools, and other packages extend them. In Python, there is Matplotlib, and most other plotting packages build on this foundation. So, the decision of where to start with Python plotting is an easy one – let’s get going.
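As a starting point, here is a minimal sketch of a Matplotlib histogram. The ages are made-up values for illustration and are not taken from the IBM data set.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs without a display
import matplotlib.pyplot as plt

# Made-up ages for illustration only.
ages = [41, 49, 37, 33, 27, 32, 59, 30, 38, 36]

# Create a figure with one axes, then draw a histogram with five bins.
fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(ages, bins=5)
ax.set_xlabel("Age")
ax.set_ylabel("Number of employees")
ax.set_title("Age distribution (illustrative data)")
```

In a notebook the figure displays inline; in a script you would call `plt.show()` or `fig.savefig(...)` to see it.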

1: matplotlib

view on github

ibm_hr_data_w_python_1.ipynb


Beginner’s Guide: Python for Analytics | The Basics

Mar 14, 2018 | People Analytics, Tutorials

Beginner’s Guide to Using Python with HR Data | Exploration Series

Part Zero – The Basics

In this first tutorial series, I’m exploring the IBM HR Attrition and Performance data set. It is a great data set for demonstrating what machine learning and other data science techniques can do.

I’ll be back with tutorial posts that walk through how to apply more advanced techniques to generate predictive and prescriptive insights from the data. But that’d be jumping ahead. First, the basics: Exploratory Data Analysis, or EDA.

It’s often tempting to jump right in and try to find the most advanced insight possible. When I’m in the process of learning something new, it’s my first instinct to begin applying it straight away, skipping the basics. Eventually, I’ll stumble; and it’s always something I could have avoided by simply spending a little bit of time really understanding the data I have.

To properly analyze data, you must understand it. Is it complete, or are there missing values? Are there errors, such as values outside normal bounds? And, more generally, what information does the data contain? Depending on where the request comes from in a work context, you may not control the data, so what you get is what you have. It’s often much easier when you’ve pulled your own data, but that’s not always possible, or even smart to do.
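Those three questions map onto a few Pandas calls. Here is a minimal sketch, using made-up rows with deliberate problems planted in them; the column names and the 18 to 70 age range are assumptions chosen for illustration, not values from the data set.

```python
import pandas as pd

# Made-up rows for illustration only, with one missing age, one missing
# attrition flag, and one implausible age planted on purpose.
df = pd.DataFrame({
    "Age": [41, 49, 37, None, 230],
    "MonthlyIncome": [5993, 5130, 2090, 2909, 3468],
    "Attrition": ["Yes", "No", "Yes", "No", None],
})

# Completeness: count missing values per column.
missing = df.isna().sum()

# Errors: flag rows whose age falls outside an assumed plausible range.
out_of_bounds = df[(df["Age"] < 18) | (df["Age"] > 70)]

# Content: summary statistics for every column, numeric or not.
summary = df.describe(include="all")

print(missing)
print(out_of_bounds)
print(summary)
```

What counts as "out of bounds" depends on the data, so a range like this is something to agree with whoever owns the data rather than a fixed rule.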

Always begin with an exploration of your data. In this tutorial, I’m digging out my current favorite tool – Python. If you’ve never programmed, if Excel still frightens you a bit, or if you’re firmly in the R camp, read on. This series shows what’s possible while exploring 5 different packages and working to interpret and understand data.
