“Information is the oil of the 21st century, and analytics is the combustion engine”
Peter Sondergaard, Gartner Research

Data science

Data science is the field of extracting knowledge and insight from data, especially through the application of scientific methods and emerging technology.

The Venn diagram below illustrates how data science bridges the gap between mathematics and computer science to further our understanding of the data we hold.

Data Science Venn Diagram

Some techniques from data science and related fields include:

  • data mining and web scraping;
  • predictive modelling and time series analysis;
  • clustering and categorisation algorithms;
  • Natural Language Processing (NLP);
  • machine learning, deep learning and artificial intelligence (AI); and
  • Robotic Process Automation (RPA).

We’re also invested in learning from the standards of data science practitioners in areas such as:

  • reproducible analytical pipelines and reproducible research;
  • streamlined deployment;
  • agile and responsible team coding; and
  • software engineering.

Our strategic focus

At Data Cymru, we recognise the value of applying data science methods in the public sector. Data science is central to our long-term strategy, enriching the tools and services we offer as well as supporting our internal efficiency. We are realising this strategy through a dedicated Data Science Team and through up-skilling of staff in associated methods and technologies.

Our Data Science Team are focused on the following key areas:

  • exploring new data visualisation options in our reports and data products;
  • deepening our insight into Welsh datasets with advanced analysis;
  • up-skilling our team in exciting new technologies and methods;
  • supporting our Open Data Strategy;
  • improving the efficiency and quality of core delivery throughout the business; and
  • developing business tools to modernise our internal management information.

Internal success - R Shiny

We piloted an internal R Shiny server in late 2019 and published our first tools on it in early 2020. Shiny is an R package for building interactive web applications, so the statistical power of the R language can drive rich and intuitive dashboards in your web browser. We are particularly interested in combining Shiny with R mapping packages such as leaflet and mapview, which communicate geographical data effectively.

An example of an R Shiny application is shown below.

R Shiny Dashboard Example

Internal success - Robotic Process Automation (RPA)

At Data Cymru, we maintain several regular data collections from a wide variety of sources. The related data management and database administration are very process-driven. As part of our continuous improvement, we are automating the most repetitive, time-consuming and error-prone areas of our work. In 2020, we aim to increase our use of reproducible analytical pipelines and to explore the application of Robotic Process Automation (RPA).
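To illustrate what a reproducible analytical pipeline can look like in practice, here is a minimal sketch in Python. It is not Data Cymru's actual pipeline: the column names (area, period, value), the function names and the sample figures are all made up for illustration. The idea is simply that loading, checking and summarising a data collection are written as small functions that give the same result every time they are run, with rejected rows reported rather than silently dropped.

```python
import csv
import io
from statistics import mean

# Hypothetical column names; a real collection would define its own schema.
REQUIRED_COLUMNS = ("area", "period", "value")


def load_rows(source):
    """Read CSV rows from a file-like object into a list of dictionaries."""
    reader = csv.DictReader(source)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def clean_rows(rows):
    """Split rows into valid records and rejected (line number, reason) pairs."""
    valid, rejected = [], []
    for line_number, row in enumerate(rows, start=2):  # line 1 is the header
        area = (row["area"] or "").strip()
        period = (row["period"] or "").strip()
        try:
            value = float(row["value"])
        except (TypeError, ValueError):
            rejected.append((line_number, "value is not a number"))
            continue
        if not area:
            rejected.append((line_number, "area is blank"))
            continue
        valid.append({"area": area, "period": period, "value": value})
    return valid, rejected


def summarise(valid):
    """Return the mean value per area, ordered by area name."""
    by_area = {}
    for record in valid:
        by_area.setdefault(record["area"], []).append(record["value"])
    return {area: mean(values) for area, values in sorted(by_area.items())}


def run_pipeline(source):
    """Run every step in order and return the summary plus rejected rows."""
    rows = load_rows(source)
    valid, rejected = clean_rows(rows)
    return summarise(valid), rejected


# Made-up sample data, used only to exercise the sketch.
SAMPLE = """area,period,value
Cardiff,2019,10
Cardiff,2020,14
Swansea,2019,7
Swansea,2020,not available
,2020,3
"""

if __name__ == "__main__":
    summary, rejected = run_pipeline(io.StringIO(SAMPLE))
    print(summary)
    print(rejected)
```

Because each step is a plain function, the same checks can be re-run whenever a collection is refreshed, and the list of rejected rows gives a simple audit trail for follow-up with the data supplier.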

If you’d like to talk to us about our data science work or how we might support you in this area, please get in touch.

Contact us

Dr Rob Pascoe (PhD)

Rob is a Data Scientist at Data Cymru. His role is to deliver data science solutions and to foster a modern approach to data throughout the organisation. He is currently developing internal up-skilling courses in the R and Python languages and is also working with our partners on multiple projects.

029 2090 9569

Rob.Pascoe@data.cymru

Useful resources

Guidance documents

Data Science (UC Berkeley)

Data Science History (Forbes)

Training

Data Science for Public Good (ONS Data Science Campus)
Open only to Senior Managers in ONS or GSS, but the slides are publicly available.