Welcome!

Eclipse Authors: Pat Romanski, Elizabeth White, Liz McMillan, David H Deans, JP Morgenthal

Blog Feed Post

R Helps With Employee Churn

by Joseph Rickert Pasha Roberts, Chief Scientist at Talent Analytics, is writing a series of articles on Employee Churn for the Predictive Analytics Times that comprise a really instructive and valuable example of using R to do some basic predictive modeling. So far, Pasha has published Employee Churn 201 in which he makes a case for the importance of modeling employee churn, and Employee Churn 202 where he builds a fairly sophisticated interactive model from first principles using only RStudio and basic R functions. And, while the series is not even complete, I think that is is going to be unique because it is working well on multiple levels. In Churn 201, Pasha uses R almost incidentally, to produce the following plot that illustrates the concepts involved in understanding the costs and benefits contributed by a single employee.    At the lowest level, this is a nice example of what might be called a “programming literate essay”. R clearly isn’t necessary just to build create a graphic. (Note the use of ggplot's annotate() capability.) But, if you look at the R code behind the scenes, you will see that Pasha has gone a bit further. In a few lines of annotated code he has sketched out a self-documenting model that someone else could use to get “back of the envelope” results for their business. The exercise is roughly at the level of what a business analyst might attempt in an Excel spreadsheet. In Employee Churn 202, Pasha goes still further moving the series from essays alone to a modeling effort. He uses basic survival analysis ideas and simple R functions to create a sophisticated decision model that computes several performance measures including something he calls Expected Cumulative Net Benefit. This measures the net benefit to the corporation employees who leave for both “good” and “bad” reasons. The following figure shows the simulation running in RStudio complete with interactive tools built with the manipulate() function to perform "what if" analyses and display the results. Running the simulation is easy. All of the code is available on Github where the file, churn202.md, provides details on how things work. Once you have run the code in the churn202.R or issued the command source("churn202.R") from the console, running the function manipSim202() will produce the simulation. (Note that might be necessary to click on “gear” icon in the upper left hand corner of the plots panel to have the slide bar controls appear.) The function runSensitivityTests()varies each of the parameters in the simulation through a reasonable range of values, while holding the other parameters fixed, to show the sensitivity of Expected Net Chumulative Benefit to each parameter. The function runHistograms() produces histograms of the synthetic data that drive the simulation and hints at the data collection effort that would be required to run the simulation for real. By placing the code on GitHub and inviting feedback, comments and pull requests Pasha has raised his literary efforts to the status of an open source, employee churn project without comprimising the clarity of his exposition. I, for one, am looking forward to the rest of thes series.

Read the original blog entry...

More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire) overseeing the product management of S-PLUS and other statistical and data mining products.<

David smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on twitter at @RevoDavid

IoT & Smart Cities Stories
Bill Schmarzo, Tech Chair of "Big Data | Analytics" of upcoming CloudEXPO | DXWorldEXPO New York (November 12-13, 2018, New York City) today announced the outline and schedule of the track. "The track has been designed in experience/degree order," said Schmarzo. "So, that folks who attend the entire track can leave the conference with some of the skills necessary to get their work done when they get back to their offices. It actually ties back to some work that I'm doing at the University of San...
In his general session at 19th Cloud Expo, Manish Dixit, VP of Product and Engineering at Dice, discussed how Dice leverages data insights and tools to help both tech professionals and recruiters better understand how skills relate to each other and which skills are in high demand using interactive visualizations and salary indicator tools to maximize earning potential. Manish Dixit is VP of Product and Engineering at Dice. As the leader of the Product, Engineering and Data Sciences team at D...
When talking IoT we often focus on the devices, the sensors, the hardware itself. The new smart appliances, the new smart or self-driving cars (which are amalgamations of many ‘things'). When we are looking at the world of IoT, we should take a step back, look at the big picture. What value are these devices providing. IoT is not about the devices, its about the data consumed and generated. The devices are tools, mechanisms, conduits. This paper discusses the considerations when dealing with the...
Bill Schmarzo, author of "Big Data: Understanding How Data Powers Big Business" and "Big Data MBA: Driving Business Strategies with Data Science," is responsible for setting the strategy and defining the Big Data service offerings and capabilities for EMC Global Services Big Data Practice. As the CTO for the Big Data Practice, he is responsible for working with organizations to help them identify where and how to start their big data journeys. He's written several white papers, is an avid blogge...
Dynatrace is an application performance management software company with products for the information technology departments and digital business owners of medium and large businesses. Building the Future of Monitoring with Artificial Intelligence. Today we can collect lots and lots of performance data. We build beautiful dashboards and even have fancy query languages to access and transform the data. Still performance data is a secret language only a couple of people understand. The more busine...
If a machine can invent, does this mean the end of the patent system as we know it? The patent system, both in the US and Europe, allows companies to protect their inventions and helps foster innovation. However, Artificial Intelligence (AI) could be set to disrupt the patent system as we know it. This talk will examine how AI may change the patent landscape in the years to come. Furthermore, ways in which companies can best protect their AI related inventions will be examined from both a US and...
Enterprises have taken advantage of IoT to achieve important revenue and cost advantages. What is less apparent is how incumbent enterprises operating at scale have, following success with IoT, built analytic, operations management and software development capabilities - ranging from autonomous vehicles to manageable robotics installations. They have embraced these capabilities as if they were Silicon Valley startups.
Chris Matthieu is the President & CEO of Computes, inc. He brings 30 years of experience in development and launches of disruptive technologies to create new market opportunities as well as enhance enterprise product portfolios with emerging technologies. His most recent venture was Octoblu, a cross-protocol Internet of Things (IoT) mesh network platform, acquired by Citrix. Prior to co-founding Octoblu, Chris was founder of Nodester, an open-source Node.JS PaaS which was acquired by AppFog and ...
The deluge of IoT sensor data collected from connected devices and the powerful AI required to make that data actionable are giving rise to a hybrid ecosystem in which cloud, on-prem and edge processes become interweaved. Attendees will learn how emerging composable infrastructure solutions deliver the adaptive architecture needed to manage this new data reality. Machine learning algorithms can better anticipate data storms and automate resources to support surges, including fully scalable GPU-c...
Cloud-enabled transformation has evolved from cost saving measure to business innovation strategy -- one that combines the cloud with cognitive capabilities to drive market disruption. Learn how you can achieve the insight and agility you need to gain a competitive advantage. Industry-acclaimed CTO and cloud expert, Shankar Kalyana presents. Only the most exceptional IBMers are appointed with the rare distinction of IBM Fellow, the highest technical honor in the company. Shankar has also receive...