Using standard manipulation and visualization techniques
to learn more about the available data and the insights that
may be gained. Preparing data for further analysis and
modeling, including the creation of new features.
Developing predictive models using appropriate machine
learning techniques for the data and task at hand.
Performing all elements of the model development
workflow from initial fit to model validation and
Being able to write reusable code to solve data
problems. Identifying when problems have occurred and
resolving them effectively, ultimately resulting in a
process suitable for production environments for solving
Presenting data in reports or dashboards to make it
available to stakeholders, and clearly presenting
actionable analytic results for business problems.
Typically, these skills can take candidates 100+ hours to acquire.
We tested the candidate's skills rigorously through:
Through a series of questions on a range of topics, we are
able to establish that this individual has the basic knowledge
required for a data scientist role. We make use of adaptive
testing approaches to understand, to a high degree of
confidence, the skill level of individuals who take the
Our coding challenges are free-form: candidates are
presented with certain data, but it is up to them to come up
with an appropriate solution. The goal of this task is to
demonstrate that the individual has the ability to perform the
tasks required of them as a data scientist without being
guided towards the appropriate solution.
Case study submission
The final stage of the certification required the individual
to complete a case study. This stage of the certification is
graded manually and stringently by our expert data scientists.
The case study is split into two parts:
1. Technical report
For the technical report, the audience is a data science
manager. The work is presented to show how the task was
approached, why certain actions were taken, and how the
work helps to solve the defined problem. There is no one
right answer.
2. Non-technical presentation
The final stage was to adapt the information for a
non-technical audience. Data scientists are commonly
required to present their work to others who have no
background in data science. These audiences are interested
in why the work was done and what the outcome was, and
typically not in how it was done.