14 Traits Of The Best Data Scientists - Valutrics
Actual data scientists are in high demand, and there aren’t enough of them to go around. If you want to identify the right talent, consider these tips.
If your company is trying to hire a data scientist, proceed with caution. Given the shortage of data science talent, more candidates are assuming the title hoping to command a higher salary. Actual data scientists are much harder to find, and they’re harder to keep because they’re in high demand.
“The way I define a data scientist is somebody who knows programming better than a statistician and more statistics than a programmer. Both of those traits are table stakes,” said Anthony Goldbloom, cofounder and CEO of data science competition platform Kaggle, in an interview.
Business domain knowledge is also important, since data scientists need to understand the problem they’re solving and its context. Increasingly, organizations recruiting data scientists are also looking for machine learning experience, since the capability is necessary to keep pace with data growth, particularly with the addition of IoT devices. Data scientists should also be, but aren’t always, effective communicators.
“You have to understand how to talk to people in a way that’s simple and comprehensible to them while maintaining accuracy,” said Alexander Isakov, CEO of business data solutions and strategy firm Pallantius. “CEOs and senior management don’t care if we use a random forest or Oracle Delphi. As long as we clearly explain what’s going on and how to make it actionable.”
Everyone wants to hire unicorns — those rare beings who are equally good at math, statistics, computer science, domain knowledge, communication, and perhaps machine learning. Since hiring a unicorn is difficult at best, organizations need to make compromises. They need to be mindful of the compromises they’re making and why.
“You need to start by answering this simple question: ‘What problem are you trying to solve?’ Once you know what your business goal is, you can start both looking for the talent you need and the right tools,” said Judith Hurwitz, president and CEO of consulting firm Hurwitz Associates.
If you’re considering hiring a data scientist, why not consider building a data science capability? It may well be a wiser long-term strategy.
“If you’re adding talent, then you want to be very conscious of building a rounded team,” said Wilds Ross, principal of data and analytics at audit, tax, and advisory service firm KPMG.
“You want to have some statistics, data engineering, optimization, [and] computer science. You’re going to want to define what your objectives are in deploying a data science team in your organization and decide what we’re really going to work on in the business to improve.”
The best data scientists have a few traits and qualifications worth noting. We’ve identified some of them below. What would you add to our list?
They’re Curious
Curiosity can be a blessing or a curse. A data scientist can be so distracted by the details of the problem that she or he loses sight of the big picture. Conversely, curiosity is a virtue when it results in creative problem-solving, innovation, and discovery.
“Curiosity is one of the hallmarks I look for because [data science] is hard. You’re going to try a bunch of different things. The data isn’t going to be perfect, and the systems you’re going to be operating on and the client are likely not going to be perfectly ready to perform high-end data science,” said Mark Jacobsohn, SVP and leader of the analytics and data science business at strategy and technology consulting firm Booz Allen Hamilton, in an interview. “You really need the curiosity to keep charging through and the tenacity to keep at it when you run into several brick walls. Curiosity and tenacity are necessary to become a great data scientist.”
They’re Tenacious
The best data scientists are very determined. They don’t like sacrificing the quality of their work, and when they’re faced with trade-offs that could potentially impact the business, they tend to ensure that business leaders understand what those trade-offs are.
“Grit and determination are important. It’s hard to do this stuff. It’s hard to learn these techniques. It’s hard to learn programming languages. Even though regression [modeling] may appear to be easy, it’s kind of hard to figure out the ins and outs of it and apply it to real-world situations. It’s hard to wrangle data, which is a necessary requisite. It takes a willingness to stick with it and get it right,” said James Guszcza, chief data scientist at audit, consulting, tax, and advisory services firm Deloitte US.
They’re Creative
Many data scientists participate in Kaggle competitions off-hours because they want to learn new skills, meet new people who can help expand their thinking, or apply their skills and knowledge to problems they haven’t solved before. Winning a Kaggle competition is a badge of honor that carries weight on a curriculum vitae (CV) and within the formidable Kaggle community.
“Often, what wins a Kaggle competition isn’t the fanciest math. It’s the creative ways to look at the data,” said Anthony Goldbloom of Kaggle in an interview. “We had a competition for a group of car dealers who wanted to understand which cars at a second-hand auction would be good buys and which would be lemons. It turned out that the color of the car had a huge impact. That’s not fancy mathematics. It’s a fancy way to slice and dice the data.”
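The kind of slicing Goldbloom describes can be sketched in a few lines of Python. This is only an illustration, not the method used in the competition: the records, the colors, and the helper name `lemon_rate_by` are all hypothetical, and the numbers are invented to show the mechanics of grouping by one attribute and comparing outcome rates.

```python
from collections import defaultdict

# Hypothetical auction records as (color, is_lemon) pairs.
# These values are made up for illustration; they are not competition data.
records = [
    ("orange", False), ("orange", False), ("orange", False), ("orange", True),
    ("silver", True), ("silver", False), ("silver", True), ("silver", False),
    ("white", False), ("white", True), ("white", False),
]


def lemon_rate_by(rows):
    """Return the share of lemons for each distinct key in (key, is_lemon) rows."""
    counts = defaultdict(lambda: [0, 0])  # key -> [lemon count, total count]
    for key, is_lemon in rows:
        counts[key][0] += int(is_lemon)
        counts[key][1] += 1
    return {key: lemons / total for key, (lemons, total) in counts.items()}


rates = lemon_rate_by(records)

# List the groups from lowest to highest lemon rate.
for color, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{color}: {rate:.0%} lemons")
```

With real auction data, the same grouping could be repeated for any other attribute to see which slices separate good buys from lemons.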
They’re Clever
A good data scientist is able to see things others can’t see because he or she has the knowledge, tools, and methods necessary to do so. Sometimes the result changes the status quo.
For example, Roger Craig, a data scientist at cognitive computing company Digital Reasoning, won six times on the TV quiz show Jeopardy!, as well as the Tournament of Champions. He did so by taking a collection of more than 20 years’ worth of transcribed Jeopardy! questions and answers, analyzing them, and developing a method others could use to improve their chances of winning the game. In addition to his wins, Craig set a one-day earnings record of $77,000.
“There was this website with all the data there. The IBM Watson team discovered it about the same time, and that’s how they started to build Watson,” said Craig. “I was in grad school doing machine learning and natural language processing, so I started playing with the data. It made it all quantifiable and I had historical data of Ken Jennings’ 70+ games [so] I could compare myself to him. There was a lot of luck involved too.”
They’re Passionate
A responsible data scientist has a passion for interrogating the data so she or he can draw reliable conclusions from it. Often, this person will work long hours on difficult problems because she or he is driven to solve them.
“The best data scientists I’ve worked with are those [who] are always asking why. They’re not just asking why about the data, they’re asking why about everything,” said Ryan Sullivan, CEO of research and consulting firm Intensity, in an interview. “It’s somebody with the consistent persistence, the intensity, the engagement, the passion for data science. They’re going to figure it out, and a lot of data science is just that.”
They Love Learning
Good data scientists are lifelong learners. They have to be, because change is a constant in their world. Tools and technologies evolve, corporate strategies shift, markets change, and there are always newer and better methods being developed to further optimize outcomes.
“This is a fast-moving field,” said James Guszcza of Deloitte US, in an interview. “It takes a willingness to learn new things. They have the drive and motivation to sit down, maybe pick up a book, and learn continuously. Some of the most effective colleagues I’ve ever had love learning.”
They’re Human
Most actual data scientists aren’t unicorns. They have some mix of math, statistics, computer science, machine learning, and business domain knowledge, but they tend not to be equally good at everything. That fact can and should impact recruiting strategies.
“People looking for a unicorn may end up compromising the wrong area. If we have to compromise, we’ll compromise in the areas that are easy to complement and not compromise on — the core statistical skills,” said Goutham Belliappa, principal big data and analytics leader at consulting, technology, and outsourcing services company Capgemini North America. “For us, it’s either a PhD, hopefully, or a master’s of statistics, hopefully in some specific field.”
They’re Not All Alike
There is considerable diversity among data scientists, as one might expect. Some are extroverts, some are introverts. Some specialize in biotech, while others specialize in retail or financial services. Some are good at leading teams, others would prefer to work in solitude. What they have in common is a core set of skills and most likely a graduate degree in math or statistics.
“We’ve seen data analysts trying to position themselves as data scientists,” said Blake Angove, director of technology recruiting services at temporary staffing and recruiting services firm LaSalle Network, in an interview. “An actual data scientist is going to have a broad background in several areas [including] software engineering, statistics, [and] machine learning. Calculus and linear algebra are a big part of it. Being able to [use] data visualization software. They should be strong in statistical programming with R or Python, and education is big. Typically, data scientists will have at least master’s degrees. A lot of the best ones are at the PhD level.”
They’re Humble
The personalities of data scientists differ, just like anyone else’s. One important virtue is humility when problem-solving, because unexpected results happen and not every hypothesis is confirmed.
“[One of] the most important qualities for a data scientist at Etsy [is] humility,” said Mike Morgan, engineering director at handmade marketplace Etsy. “Humility enables a data scientist to understand the limits of their knowledge and seek to learn more. To be comfortable not knowing the answer, or not understanding something, is the first step in scientific exploration. A good scientist meets this moment with joy, not frustration or embarrassment.”
Specific Experience May Matter More Than Years
The misalignment of data scientists and the organizations that hire them often stems from poorly written job requisitions or the failure to consider what kind of talent is actually needed and why. Sometimes, both employers and candidates overstate requirements or capabilities in unrealistic terms. For example, consider someone who claims to have 10 years of Hadoop experience: Even if the years of experience are correct, the specific type of experience may matter more to a prospective employer.
“Experience is more about the functions of the problems you’re trying to solve. So, have you worked with big data before? Have you worked with petabytes of data? Have you worked with streaming data? Have you worked with particular data science models such as customer lifetime value?” said Ian Swanson, CEO and founder of data science consulting and solutions firm DataScience, in an interview. “When we look for experience, it’s not just the years, it’s really about whether you’ve solved these types of problems before and, if so, at what scale.”
They’re Not Magicians
Data science can affect an organization in profound ways, but data scientists are not magicians. Even unicorns need to work as part of a team throughout the life cycle of a project to understand the problem, develop a hypothesis, gather the right data, analyze it, and present the findings to business leaders.
“You can’t just assume that data science is this magic bullet that you buy [to] solve your problem. All organizations are immersed in data. It’s just we have more of it now. It’s strewn across the organization, so you have to understand how you can integrate the insights you get from analytics with the insights you’re already using. It’s a process of assembling various threads of evidence,” said Michael Twidale, professor at the School of Information Sciences, University of Illinois at Urbana-Champaign and director of the Master of Science in Information Management degree program.
They May Have Diverse Backgrounds
Data scientists working for consulting organizations tend to have experience working with companies in different industries. Even in an enterprise context, a data scientist may have worked with many different departments to solve important problems.
“One thing I noticed [about data scientists] is people jump from domain to domain. I’ve done biomed, energy, and Groupon,” said Muhammad Aurangzeb Ahmad, senior data scientist at e-commerce marketplace Groupon. “The one thing that’s remained constant is the set of tools I use. On the surface, they may sound different, but if a person thinks about data science as a tool kit, or a way of thinking about the world, that’s the best way to approach it.”
Applied Knowledge Is Critical
The best way to distinguish actual data scientists from those who need more education or training is to have them describe how they’ve applied their skills in the past: on which types of projects, the challenges they encountered, how those were resolved, and what the business outcome was. As with other technical positions, employers may also decide to test those skills themselves to determine how well developed they really are, assuming the data scientist isn’t the same high-profile rock star everyone wants to employ.
“Look at the portfolios and talk about projects they’ve worked on. If you’re an interviewer who cannot evaluate data science projects, get somebody who can,” said Raj Bandyopadhyay, director of data science education at online educational website Springboard. “Typically, you should see a portfolio with strong programming skills and an acquaintance with the entire data science process, including data collecting, wrangling, exploration, and deep predictive analysis. Most importantly, they should be really good at communicating their work in a business context.”
They Understand Context
Many factors influence how a problem will be approached and solved, one of which is context. Without context, problem-solving efforts will be misdirected, a situation that can translate to wasted time and effort, or worse.
Failure to understand context may not be the data scientist’s fault entirely, however. Despite asking, he or she may not get the assistance needed to understand the finer points. Data science is a team sport, after all.
“Context is so important. Data scientists without context are like throwing spaghetti on the wall and hoping that something sticks,” said Prasanna Dhore, chief data and analytics officer at consumer credit reporting agency Equifax, in an interview. “You can’t just sit and do data-dredging and hope you can do something.”