The Rise of the Citizen Data Scientist
This article was originally written as a summary of the UA Summit 2021 Community Conversation.
The citizen data scientist is part of a broadening of data science job titles
To understand the rise of the citizen data scientist, we must understand the evolving role of centralised data science functions and the rationale behind industry digitisation. Even small utilities will employ many analysts across the organisation, from business analysts in finance departments to specialists in power engineering or energy trading. However, data scientists are much rarer. Outside tier one utilities, their numbers are low, if any are employed at all.
But as the industry becomes more data-reliant, it must improve its analytics skills to maximise the value this data brings. Data science teams are central to this skills development. As utility data science teams mature and grow, new roles are being created to fill gaps beyond core data science skills. These include management and governance, data preparation and transformation, modelling, and software engineering (including machine learning development). While data scientists could perform all of these roles, that is not desirable. You want your data scientists to focus on where they can add the most value: the most business-critical projects that require specific data science skills.
This leaves a whole raft of other data projects, which may be less high profile, but are still important. A company may choose to add these to a growing backlog of projects for data science teams to tackle. However, many will instead turn to a new breed of citizen data scientists, recruited from departments across the organisation. Many staff are keen to get involved in analytics projects and use new technologies to improve their business processes. These people have the potential to rapidly move a utility forward.
Citizen data scientists fill a skills gap
However, it is a grave error to throw analytics tools at a workforce and leave them to their own devices. Data scientists and citizen data scientists have hugely different skill sets, which means their approaches to data and analytics will be polar opposites. Data scientists have a theoretical understanding of bad data and bad data collection. Citizen data scientists will likely have no experience of data collection. Data scientists will dive deep; citizen data scientists will get excited by data and start processing it. Among the first questions a data scientist will ask are ‘what biases sit in this data?’ and ‘how will these biases affect my output?’ A citizen data scientist will ask ‘how can I improve my job with this data?’
Governance is required if citizen data scientists are to add value, not destroy it
The role of the citizen data scientist evolved to fill a skills gap in most organisations. A central data science function will always be limited in size and scope, which constrains a company’s ability to solve business problems with analytics. However, simply opening the floodgates and allowing everyone to use analytics tools will likely lead to more problems than it solves. Citizen data scientists are volunteers: they come forward and proactively ask to gain access to data, analytics, and training. They are innovative, creative people. But they need direction and well-defined frameworks to develop their skills.
The right approach is to pragmatically optimise resources—people, analytics tools, and data—to solve business problems. Governance processes must be put in place to ensure citizen data scientists add value, not destroy it.
Access to analytics tools must have proper controls. A libertarian approach with little governance will lead to actions being taken on poorly understood data. For example, how would management know whether a report is based on certified data, or whether it has been through a formal review process? It could easily have been pulled together on the fly by an unsupervised analyst with a few hours of Python training.
How can an organisation manage models if hundreds or thousands of employees build models on their own devices? Who checks to see if the model is still valid, if the data inputs can be changed, or if it should be replaced entirely?
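One way to make these questions answerable is to record a few governance facts about every model in a shared register. The sketch below is purely illustrative and assumes Python: the `ModelRecord` fields, the `needs_attention` function, and the 180-day revalidation window are hypothetical choices, not something prescribed in the conversation.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ModelRecord:
    """Hypothetical register entry for one model built in the business."""
    name: str
    owner: str             # team or person accountable for the model
    data_certified: bool   # was it built on certified data?
    reviewed: bool         # has it been through a formal review?
    last_validated: date   # when was it last checked against reality?


def needs_attention(record, today, max_age_days=180):
    """Return the governance reasons a model should be looked at, if any."""
    reasons = []
    if not record.data_certified:
        reasons.append("uncertified data")
    if not record.reviewed:
        reasons.append("no formal review")
    # The 180-day default is an assumed policy, chosen only for illustration.
    if today - record.last_validated > timedelta(days=max_age_days):
        reasons.append("validation overdue")
    return reasons


record = ModelRecord("load_forecast", "finance", True, False, date(2021, 1, 1))
print(needs_attention(record, today=date(2021, 12, 1)))
```

Even a register this small gives management something to check a report or model against. In practice the thresholds and fields would come from the organisation's own governance process.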
Encourage, educate, communicate
I can’t imagine a time when any company hires a citizen data scientist. By definition, they are experts in another field. They volunteer to use these tools because they have an interest in data and analytics. Therefore, an innovation culture must be engendered, which fosters enthusiasm for data science. Staff should be encouraged to develop analytics skills. They should be given access to analytics tools, within the boundaries of a pragmatic governance process, and training in analytics techniques. Best practices developed on live projects should be shared across teams. Nor should staff be left on their own: data scientists should be made available to teams across the enterprise to collaborate on analytics projects.
And remember that data science is learnt by doing. Citizen data scientists must learn from someone who knows what they are doing. Data scientists will be vital in this education programme, both by creating formal training sessions and by working alongside people doing analytics on the job.
There are strong arguments for utilities to accelerate the development of citizen data scientists across their organisations. But there are significant risks to doing so. Strong governance and training will be essential, as will stewardship from existing data science teams.