As we increasingly recognise the importance of data-based decisions, Machine Learning and AI, we decided to survey a population of Data Science professionals to help aspiring Data Scientists make sure they pick up the right skills and experience to excel in their chosen profession. The survey was distributed in May and we now intend to produce a report to share the findings.
The responses have led to a number of interesting findings, which I thought I would share in advance of the full report being developed.
A newer / younger profession:
In comparison with Wade Macdonald’s other specialisms, Data Science is very much in its infancy, and this is reflected in the age of those who responded to us. Of the 100+ respondents, 48% were under the age of 34, with only 22% above the age of 45. 55% of those responding had under 5 years’ commercial experience and only 25% had more than 10 years. However, age was not a defining factor in the seniority of Data Science professionals, with 35% of the total pool working in senior positions. As this is a newer profession, we can conclude that people are able to progress quickly if they demonstrate the skills required.
Barrier to entry:
There appear to be some barriers to entry to the profession. I think it is fair to conclude that it would be hard to get a job in Data Science without a degree. Only 4% of those who responded to us did not have any form of degree and, remarkably, just over a third had achieved at least a master’s. 20% had a PhD.
The most common degrees were in either Computer Science or Statistics / Mathematics. Fewer than 10% of people responding had a degree that was not Science / Engineering based.
Languages / libraries most commonly used:
The most common programming languages used were SQL (59%) and Python (54%). R (28%), MATLAB (17%) and C/C++ (16%) were also commonly used. When Data Scientists were asked to recommend just one language, Python came out on top at a huge 66%. This fits in well with what we see in the market, where around 80% of the roles we have registered over the past 6 months have required competency in Python.
In terms of the Machine Learning libraries most commonly used, there were again two clear favourites. Scikit-Learn was the most commonly used at 35%, with TensorFlow just behind at 32%. Less commonly used libraries were Keras, RandomForest and PyTorch, which were each used by around 15% of the respondents.
Unsurprisingly, Tableau was the most commonly used software for visualisation at nearly 40%, with Power BI and Matplotlib both also used by around a third of people.
Industries investing in Data Science:
When asked which industries the respondents thought had invested the most in Data Science so far, there were no real surprises. 50% of people felt that Financial Services were ahead of the curve, with 45% identifying Retail and Ecommerce, nearly 40% Pharma / Healthcare and 33% Telecommunications. Data Scientists believe that Aerospace, Media and Utilities will start to invest more heavily in the near future, so we expect to see more interesting roles appear in these areas.
Hard / Soft skills:
A common misconception is that Data Scientists do not need many soft skills. This could not be further from the truth, as business partnering is an important part of most roles. 71% of Data Scientists put communication as one of the top 3 softer skills needed to get on, with listening (52%) and storytelling (44%) also scoring highly. Unsurprisingly, problem solving (76%) and critical thinking (63%) also scored highly. Perhaps equally unsurprisingly, less emphasis was put on the importance of management skills. This does not mean that they will not become more important in the future: at present these roles are often in small teams reporting to CTOs, whereas in the future we would expect more management roles to develop.
In terms of the harder skills that were rated as highly important, Programming (71%), Statistics (71%) and Data Visualisation (61%) were the top rated, with Software Engineering and ETL rated lower.
When asked what skills would be the most important for future generations, the most commonly offered were Machine Learning, AI, Data Analysis, Problem Solving, Communication and Commercial Acumen.
4th Industrial Revolution:
If the Data Scientists who took part in our survey are to be believed, then the evolution of Data Science will be essential for the 4th industrial revolution. 80% of respondents felt that it definitely would be, with only 4% stating it wouldn’t. We are already seeing roles in our other specialisms changing, with the need for Data Scientists to work alongside Finance, HR and Marketing, for instance. This will only become more prevalent over the coming years.
Conclusion:
In conclusion, it would seem that at the moment there are definitely some things aspiring Data Scientists can do to get ahead. Make sure that you get a degree and seriously consider further studies beyond that, start introducing yourself to programming languages such as Python, and work on your communication, listening and presenting skills. Without a doubt, Data Scientists will become more and more important over the next 5-10 years, and the profession will open up new and exciting opportunities for future working generations.
Written by Chris Goulding - Managing Director