Principal Data Engineer

Oritain
London
Company Overview

Oritain is the global leader in product verification, with locations in Auckland, Dunedin, London, Singapore and Washington D.C.


Our Mission: Harnessing our data to protect people & planet

Our mission is to protect people and the planet by harnessing science, technology and services to build a community of origin verified suppliers and buyers.


Role: Principal Data Engineer

We are looking for a Principal Data Engineer to lead the transformation of our entire data platform. This is a critical leadership role, responsible for defining, building and running the scalable, robust and trustworthy data infrastructure that will underpin all future product development, scientific analysis and business operations.


The Opportunity: Founding the Data Platform

Reporting to the Head of Engineering, you will be the most senior technical voice for data platforms within the organisation. You will own the strategy, design, and initial implementation of the pipelines and architecture required to integrate complex scientific data with our commercial software applications.


You will act as a technical leader and mentor to the wider engineering team, ensuring that all data‑related systems meet the highest standards of reliability, performance, and security.


Key Responsibilities
Data Architecture & Strategy

  • Platform Leadership: Define and own the technical strategy and architecture for our entire data platform, covering ingestion, storage, processing, governance and consumption, including use cases in support of Operations, Data Science, Customer‑Facing Portals and Business Intelligence.
  • Pipeline Design: Design and implement highly scalable, performant, and reliable ETL/ELT data pipelines to handle diverse data sources, including complex scientific datasets and supply chain inputs alongside business information.
  • Technology Selection: Evaluate, recommend, and drive the adoption of new data services and modern data tools to ensure we have a future‑proof data ecosystem.
  • Data Modelling: Lead the design of canonical data models for our data warehouse and operational data stores, ensuring data quality, consistency and integrity across the platform.
  • Single Source of Truth: Define and maintain identifiers for clients, suppliers and transactions to ensure consistency across systems such as Salesforce, NetSuite, internal databases and portals (see the sketch after this list).
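
To make the Single Source of Truth responsibility concrete, the sketch below shows one way canonical client identifiers could be minted across source systems in PySpark. Every table and column name here (raw.crm_accounts, raw.erp_customers, master.canonical_clients) is a hypothetical placeholder rather than Oritain's actual schema, and the name‑based match is a deliberately simple stand‑in for real entity resolution.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("canonical-client-ids").getOrCreate()

    # Hypothetical extracts from a CRM (e.g. Salesforce) and an ERP (e.g. NetSuite).
    crm = spark.table("raw.crm_accounts").select(
        F.col("Id").alias("crm_id"),
        F.lower(F.trim("Name")).alias("name_key"),
    )
    erp = spark.table("raw.erp_customers").select(
        F.col("internalId").alias("erp_id"),
        F.lower(F.trim("companyName")).alias("name_key"),
    )

    # Match records on a normalised business key, then mint one stable surrogate
    # identifier per client so every downstream system refers to the same entity.
    canonical = (
        crm.join(erp, on="name_key", how="full_outer")
        .withColumn("client_id", F.sha2(F.col("name_key"), 256))
    )
    canonical.write.mode("overwrite").saveAsTable("master.canonical_clients")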

Implementation & Technical Excellence

  • Hands‑on Development: Serve as the most senior, hands‑on developer, writing high‑quality, production‑grade code (primarily Python and/or Scala/Spark) to build initial pipelines and core data services.
  • Data Governance & Security: Architect data security and governance policies, ensuring compliance and best practices around data access, masking and retention, especially for sensitive origin data.
  • Data Quality: Implement automated deduplication, conflict resolution and anomaly detection to maintain data integrity across ingestion sources (a sketch follows this list).
  • Operational Health: Implement robust monitoring, logging, and alerting for all data pipelines and infrastructure, ensuring high data reliability and performance.
  • Infrastructure as Code (IaC): Work closely with the Infrastructure team to define and automate the provisioning of all Azure data resources using Terraform or similar IaC tools.
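
As an illustration of the Data Quality bullet above, here is a minimal deduplication and anomaly‑screening sketch in PySpark. The table names, columns and the three‑standard‑deviation threshold are assumptions made for this example, not a prescribed implementation.

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.appName("data-quality").getOrCreate()
    samples = spark.table("raw.sample_measurements")  # hypothetical ingestion table

    # Deduplicate: keep only the most recently ingested record per business key.
    latest_first = Window.partitionBy("sample_id").orderBy(F.col("ingested_at").desc())
    deduped = (
        samples.withColumn("rn", F.row_number().over(latest_first))
        .filter("rn = 1")
        .drop("rn")
    )

    # Flag crude anomalies: values more than three standard deviations from the mean.
    stats = deduped.agg(
        F.mean("measurement").alias("mu"), F.stddev("measurement").alias("sigma")
    ).first()
    flagged = deduped.withColumn(
        "is_anomaly",
        F.abs(F.col("measurement") - F.lit(stats["mu"])) > 3 * F.lit(stats["sigma"]),
    )
    flagged.write.mode("overwrite").saveAsTable("curated.sample_measurements")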

Cross‑Functional Leadership

  • Scientific Collaboration: Partner closely with the Science teams to understand the structure, complexity, and requirements of raw scientific data, ensuring accurate data translation and ingestion.
  • Mentorship: Provide technical guidance and mentorship to software engineers on best practices for interacting with and consuming data services.
  • Product Partnership: Collaborate with the Product Director to understand commercial and user‑facing data requirements, translating these needs into actionable data infrastructure features.

The Engineering Environment

  • Technology: We currently make extensive use of Microsoft Azure and related data services and are moving to Databricks. You will be expected to be an authority across both.
  • Collaboration: You will be the technical data expert, integrating with the Software Engineering, Data Science and Product teams.
  • Work Style: London office, with a minimum requirement of three days per week on‑site to facilitate strategic planning and hands‑on collaboration.

Skills & Experience

  • Principal/Lead Expertise: Extensive experience (typically 7+ years) focused on data engineering, including significant time spent in a Principal, Lead, or Architect role defining data strategy from the ground up.
  • Databricks: Deep, practical, and architectural experience of the Databricks platform.
  • Azure Data Stack: Operational experience building and running workloads within the Microsoft Azure data ecosystem (e.g., Azure Data Factory, Azure Data Lake, Azure Synapse Analytics, Azure SQL/Cosmos DB).
  • Coding Proficiency: Expert‑level proficiency in Python (or Scala) and SQL, with a strong focus on writing clean, tested, and highly performant data processing code.
  • Data Warehouse Design: Proven track record designing and implementing scalable data warehouses/data marts for analytical and operational use cases.
  • Pipeline Automation: Strong experience with workflow orchestration tools and implementing CI/CD for data pipelines (see the sketch after this list).
  • Cloud Infrastructure: Familiarity with Infrastructure as Code (Terraform) and containerisation.
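
On the CI/CD point, one common pattern is to factor pipeline logic into pure functions and unit‑test them on every commit before deployment. The normalise_country function, its alias mapping and the file names below are invented for this sketch:

    # transformations.py: a pure function that is easy to test in CI
    def normalise_country(value: str) -> str:
        """Map free-text country values to ISO-style codes (illustrative mapping only)."""
        aliases = {"uk": "GB", "united kingdom": "GB", "usa": "US"}
        key = value.strip().lower()
        return aliases.get(key, key.upper())

    # test_transformations.py: run by pytest as a CI step before deployment
    from transformations import normalise_country

    def test_normalise_country_handles_aliases():
        assert normalise_country(" UK ") == "GB"
        assert normalise_country("usa") == "US"
        assert normalise_country("fr") == "FR"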

Desirable Attributes

  • Experience processing scientific, geospatial or time‑series data.
  • Experience in the governance or compliance sector where data integrity is paramount.
  • Familiarity with streaming data technologies.

Company Benefits

  • Paid Leave – 35 days (inclusive of public holidays)
  • Birthday Off
  • Volunteering Leave Allowance
  • Enhanced Parental Leave
  • Life Insurance
  • Healthcare Cash Plan
  • Employee Assistance Programme (EAP)
  • Pension
  • Monthly Wellbeing Allowance
  • Breakfast, snacks, Friday lunch & barista coffee machine in the office
  • Learning Portal with over 100,000 assets available to support professional development
  • Hybrid working set‑up (Farringdon, London)
  • Plenty of friendly 4‑legged pets in the office

About Oritain

Oritain is a global leader in forensic origin verification. Using cutting‑edge science, advanced technology, and specialised services, we independently verify where products and raw materials come from – protecting brand integrity, supporting compliance, and strengthening supply chain trust and transparency. Our method is highly resistant to tampering, court‑admissible, and trusted by suppliers and manufacturers, brands and retailers, consumers, and regulators.


Driven by purpose, we are committed to advancing the scientific techniques and systems needed to identify the origin of the world’s most critical commodities – enabling more ethical, resilient, and accountable supply chains.

