Data Architect II

1054 GlaxoSmithKline Services Unlimited
Stevenage
3 days ago
Overview

The Onyx Research Data Tech organization is GSK’s Research data ecosystem, which has the capability to bring together, analyze, and power the exploration of data at scale. We partner with scientists across GSK to define and understand their challenges and develop tailored solutions that meet their needs. The goal is to ensure scientists have the right data and insights when they need them, giving them a better starting point for medical discovery and helping to accelerate it. Ultimately, this helps us get ahead of disease in more predictive and powerful ways.

Onyx is a full-stack shop consisting of product and portfolio leadership, data engineering, infrastructure and DevOps, data / metadata / knowledge platforms, and AI/ML and analysis platforms, all geared toward:

  • Building a next-generation, metadata- and automation-driven data experience for GSK’s scientists, engineers, and decision-makers, increasing productivity and reducing time spent on “data mechanics”

  • Providing best-in-class AI/ML and data analysis environments to accelerate our predictive capabilities and attract top-tier talent

  • Aggressively engineering our data at scale, as one unified asset, to unlock the value of our unique collection of data and predictions in real time

Team

The Onyx Data Architecture team sits within the Data Engineering team, which is responsible for the design, delivery, support, and maintenance of industrialized, automated end-to-end data services and pipelines. They apply standardized data models and mapping to ensure data is accessible for end users in end-to-end user tools through use of APIs. They define and embed best practices and ensure compliance with Quality Management practices and alignment to automated data governance. They also acquire and process internal and external, structured and unstructured data in line with Product requirements.

As a Data Architect II, you'll apply your expertise in big data and AI/GenAI workflows to support GSK's complex, regulated R&D environment.

You'll contribute to designing Data Mesh/Data Fabric architectures while enabling modern AI and machine learning capabilities across our platform.

In this role, you will:
  • Partner with the Scientific Knowledge Engineering team to develop physical data models to build fit-for-purpose data products

  • Design data architecture aligned with enterprise-wide standards to promote interoperability

  • Collaborate with the platform teams and data engineers to maintain architecture principles, standards, and guidelines

  • Design data foundations that support GenAI workflows including RAG (Retrieval-Augmented Generation), vector databases, and embedding pipelines

  • Work across business areas and stakeholders to ensure consistent implementation of architecture standards

  • Lead reviews and maintain architecture documentation and best practices for Onyx and our stakeholders

  • Adopt security-first design with robust authentication and resilient connectivity

  • Provide best practices, leadership, and subject-matter and GSK expertise to architecture and engineering teams composed of GSK FTEs, strategic partners, and software vendors

Why you?

Basic Qualifications:

  • Bachelor’s degree in computer science, engineering, Data Science or similar discipline

  • 5+ years of experience in data architecture, data engineering, or related fields in pharma, healthcare, or life sciences R&D.

  • 3+ years’ experience defining architecture standards and patterns on Big Data platforms

  • 3+ years’ experience with data warehouse, data lake, and enterprise big data platforms

  • 3+ years’ experience with enterprise cloud data architecture (preferably Azure or GCP) and delivering solutions at scale

  • 3+ years of hands-on relational, dimensional, and/or analytic experience (using RDBMS, dimensional, NoSQL data platform technologies, and ETL and data ingestion protocols)

Preferred Qualifications:
  • Master’s or PhD in computer science, engineering, Data Science or similar discipline
  • Deep knowledge and use of at least one common programming language: e.g., Python, Scala, Java
  • Experience with AI/ML data workflows: feature stores, vector databases, embedding pipelines, model serving architectures
  • Familiarity with GenAI/LLM data patterns: RAG architectures, prompt engineering data requirements, fine-tuning data preparation
  • Experience with GCP data/analytics stack: Spark, Dataflow, Dataproc, GCS, BigQuery
  • Experience with enterprise data tools: Ataccama, Collibra, Acryl
  • Experience with Agile frameworks and tooling: SAFe, Jira, Confluence, Azure DevOps
  • Experience applying CI/CD principles to data solutions
  • Experience with Spark and RAG-based architectures for data science and ML use cases
  • Strong communication skills—ability to explain technical concepts to non-technical stakeholders
  • Pharmaceutical, healthcare, or life sciences background

#GSKOnyx

#LI-GSK

Why GSK?

Uniting science, technology and talent to get ahead of disease together.

GSK is a global biopharma company with a purpose to unite science, technology and talent to get ahead of disease together. We aim to positively impact the health of 2.5 billion people by the end of the decade, as a successful, growing company where people can thrive. We get ahead of disease by preventing and treating it with innovation in specialty medicines and vaccines. We focus on four therapeutic areas: respiratory, immunology and inflammation; oncology; HIV; and infectious diseases – to impact health at scale.

People and patients around the world count on the medicines and vaccines we make, so we’re committed to creating an environment where our people can thrive and focus on what matters most. Our culture of being ambitious for patients, accountable for impact and doing the right thing is the foundation for how, together, we deliver for patients, shareholders and our people.

GSK is an Equal Opportunity Employer. This ensures that all qualified applicants will receive equal consideration for employment without regard to race, color, religion, sex (including pregnancy, gender identity, and sexual orientation), parental status, national origin, age, disability, genetic information (including family medical history), military service or any basis prohibited under federal, state or local law.

We believe in an agile working culture for all our roles. If flexibility is important to you, we encourage you to explore with our hiring team what the opportunities are.

