Data Architect II

GSK
Stevenage
1 month ago
Applications closed


Overview

The Onyx Research Data Tech organization is GSK’s Research data ecosystem, which can bring together, analyze, and power the exploration of data at scale. We partner with scientists across GSK to define and understand their challenges and develop tailored solutions that meet their needs. The goal is to ensure scientists have the right data and insights when they need them, giving them a better starting point for medical discovery and accelerating it. Ultimately, this helps us get ahead of disease in more predictive and powerful ways.


Onyx is a full-stack shop consisting of product and portfolio leadership, data engineering, infrastructure and DevOps, data / metadata / knowledge platforms, and AI/ML and analysis platforms, all geared toward:



  • Building a next-generation, metadata- and automation-driven data experience for GSK’s scientists, engineers, and decision-makers, increasing productivity and reducing time spent on “data mechanics”


  • Providing best-in-class AI/ML and data analysis environments to accelerate our predictive capabilities and attract top-tier talent


  • Aggressively engineering our data at scale, as one unified asset, to unlock the value of our unique collection of data and predictions in real time



The Onyx Data Architecture team sits within the Data Engineering team, which is responsible for the design, delivery, support, and maintenance of industrialized, automated end-to-end data services and pipelines. They apply standardized data models and mapping to ensure data is accessible to end users in end-to-end user tools through the use of APIs. They define and embed best practices and ensure compliance with Quality Management practices and alignment to automated data governance. They also acquire and process internal and external, structured and unstructured data in line with Product requirements.


As a Data Architect II, you'll apply your expertise in big data and AI/GenAI workflows to support GSK's complex, regulated R&D environment.


You'll contribute to designing Data Mesh/Data Fabric architectures while enabling modern AI and machine learning capabilities across our platform.


Responsibilities

  • Partner with the Scientific Knowledge Engineering team to develop physical data models to build fit-for-purpose data products


  • Design data architecture aligned with enterprise-wide standards to promote interoperability


  • Collaborate with the platform teams and data engineers to maintain architecture principles, standards, and guidelines


  • Design data foundations that support GenAI workflows including RAG (Retrieval-Augmented Generation), vector databases, and embedding pipelines


  • Work across business areas and stakeholders to ensure consistent implementation of architecture standards


  • Lead reviews and maintain architecture documentation and best practices for Onyx and our stakeholders


  • Adopt security-first design with robust authentication and resilient connectivity


  • Provide best practices, leadership, subject-matter expertise, and GSK knowledge to architecture and engineering teams composed of GSK FTEs, strategic partners, and software vendors



Basic Qualifications

  • Bachelor’s degree in computer science, engineering, data science, or a similar discipline


  • 5+ years of experience in data architecture, data engineering, or related fields in pharma, healthcare, or life sciences R&D


  • 3+ years’ experience defining architecture standards and patterns on big data platforms


  • 3+ years’ experience with data warehouse, data lake, and enterprise big data platforms


  • 3+ years’ experience with enterprise cloud data architecture (preferably Azure or GCP) and delivering solutions at scale


  • 3+ years of hands-on relational, dimensional, and/or analytic experience (using RDBMS, dimensional, NoSQL data platform technologies, and ETL and data ingestion protocols)



Preferred Qualifications

  • Master's or PhD in computer science, engineering, data science, or a similar discipline


  • Deep knowledge and use of at least one common programming language: e.g., Python, Scala, Java


  • Experience with AI/ML data workflows: feature stores, vector databases, embedding pipelines, model serving architectures


  • Familiarity with GenAI/LLM data patterns: RAG architectures, prompt engineering data requirements, fine-tuning data preparation


  • Experience with GCP data/analytics stack: Spark, Dataflow, Dataproc, GCS, BigQuery


  • Experience with enterprise data tools: Ataccama, Collibra, Acryl


  • Experience with Agile frameworks and tools: SAFe, Jira, Confluence, Azure DevOps


  • Experience applying CI/CD principles to data solutions


  • Experience with Spark and RAG-based architectures for data science and ML use cases


  • Strong communication skills, including the ability to explain technical concepts to non-technical stakeholders


  • Pharmaceutical, healthcare, or life sciences background



Why GSK?


GSK is an Equal Opportunity Employer. This ensures that all qualified applicants will receive equal consideration for employment without regard to race, color, religion, sex, national origin, age, disability, or any basis prohibited under federal, state or local law.


We believe in an agile working culture for all our roles. If flexibility is important to you, we encourage you to explore with our hiring team what the opportunities are.


Should you require any adjustments to our process to assist you in demonstrating your strengths and capabilities contact us at where you can also request a call. Please note should your enquiry not relate to adjustments, we will not be able to support you through these channels. However, we have created a Recruitment FAQ guide. Click the link where you will find answers to multiple questions we receive.


Important notice to employment businesses/agencies: GSK does not accept referrals from employment businesses and/or employment agencies in respect of the vacancies posted on this site. All employment businesses/agencies are required to contact GSK's commercial and general procurement/human resources department to obtain prior written authorization before referring any candidates to GSK. The obtaining of prior written authorization is a condition precedent to any agreement between the employment business/agency and GSK.


GSK may be required to capture and report expenses if you are a US Licensed Healthcare Professional or Healthcare Professional as defined by the laws of the state issuing your license. For more information, please visit the CMS website at https://openpaymentsdata.cms.gov/.


