UZIMA-DS Project
UZIMA-DS: Utilizing Health Information for Meaningful Impact in East Africa through Data Science
Project Period: 2021 – 2026
Funder: National Institutes of Health
Collaborators: Aga Khan University; University of Michigan; Dalhousie University; Ottawa University; Kenya Medical Research Institute-Wellcome Trust Research Programme; Clinton Health Access Initiative
Overall Project Summary: The UZIMA-DS (UtiliZe health Information for Meaningful impact in East Africa through Data Science) Hub aims to create a scalable and sustainable platform that applies novel approaches to data assimilation and advanced artificial intelligence/machine learning (AI/ML) methods to serve as early warning systems for critical health issues affecting Africans in two domains: maternal, newborn and child health; and mental health.
UZIMA-DS brings together methods experts in statistics, computer science, and informatics; health domain experts and practitioners; and key stakeholder partners to improve not only the quality, efficiency, and relevance of multidisciplinary data science in health research, but also its transparency, reproducibility, and dissemination for sustainable impact in Africa. This work helps ensure that current and future generations of Africans can achieve uzima (health/well-being in Swahili).
Ultimately, the UZIMA-DS Hub will develop a scalable and sustainable platform characterised by:
- Harmonization of multimodal data sources for meaningful use and analysis,
- Use of temporal patterns in data to identify trajectories through prediction modelling with AI/ML-based methods, and
- Engagement with key stakeholders to identify pathways for disseminating and sustaining these models in target communities.
Data Management and Access Core (DMAC)
The Data Innovation team supports the data infrastructure needs of the UZIMA-DS project. The UZIMA-DS architecture runs in a cloud-first, open-source environment and is iterated continuously to incorporate emerging data algorithms, best practices, and standards. The team's primary responsibility is to ensure that all data required by researchers is processed and cleansed through high-quality data pipelines and then organised within a data model optimised for analysis. The team also trains and supports researchers in accessing cloud resources and using tools effectively. In addition, the team ensures compliance with Kenyan data protection laws and serves as the primary contact with the Office of the Data Protection Commissioner.
We are part of the Data Management and Access Core for the UZIMA-DS Hub. Our work involves facilitating and supporting effective data management and analysis using FAIR (Findable, Accessible, Interoperable, Reusable) principles for the UZIMA-DS Research Hub, collaborators across the DS-I Africa Consortium, and the DS-I Africa Coordinating Centre. The DMAC addresses the following objectives:
- Support the Research Hub’s data ecosystem through the development and maintenance of data quality assurance measures and standards for statistical code sharing, data reproducibility, sharing, and interoperability,
- Facilitate data analytics utilizing AI/ML methods and provide analytical support for the Hub’s research projects, and
- Foster data sharing, interoperability, and metadata approaches across the greater DS-I Africa Consortium.
The long-term goal of the DMAC is to develop a pipeline of data support, data use, and data sharing capacity that facilitates high-quality research in East Africa and can serve as a scalable, reproducible, and shareable model platform.
Project Title: Sustainable Cloud Operations for Research (SCORE)
Overview
SCORE introduces a practical framework that helps research teams in low- and middle-income countries (LMICs) design and manage cost-efficient, high-performance, and environmentally sustainable cloud data pipelines. Our guidelines address the real-world challenges of limited budgets, technical capacity, and connectivity common in LMIC research settings.
Using a three-phase model (Assessment, Selection, and Optimization), SCORE guides health research teams in designing the right data pipelines, controlling costs, optimizing performance, and minimizing carbon emissions. The framework provides a clear pathway to adopting cloud technologies responsibly, advancing both scientific innovation and environmental sustainability across global health research.
Read the full report here: SCORE Report V_Oct 30.pdf