Data Engineer - AI

Ellison Institute of Technology Oxford
Oxford
22 hours ago


The Ellison Institute of Technology (EIT) Oxford's purpose is to have a global impact by fundamentally reimagining the way science and technology translate into end-to-end solutions and delivering these solutions in programmes and platforms that respond to humanity's most challenging problems.


EIT Oxford will ensure scientific discoveries and pioneering science are turned into products for the benefit of society that can have high impact worldwide and, over time, be commercialised to ensure long-term sustainability.


Led by a world-class faculty of scientists, technologists, policy makers, economists and entrepreneurs, the Ellison Institute of Technology aims to develop and deploy commercially sustainable solutions to solve some of humanity's most enduring challenges. Our work is guided by four Humane Endeavours: Health, Medical Science & Generative Biology; Food Security & Sustainable Agriculture; Climate Change & Managing Atmospheric CO2; and Artificial Intelligence & Robotics.


Set for completion in 2027, the EIT Campus in Littlemore will include more than 300,000 sq ft of research laboratories, educational and gathering spaces. Fuelled by growing ambition and the strength of Oxford's science ecosystem, EIT is now expanding its footprint to a 2 million sq ft Campus across the western part of The Oxford Science Park. Designed by Foster + Partners, led by Lord Norman Foster, this will become a transformative workplace for up to 7,000 people, with autonomous laboratories, purpose-built laboratories including a plant sciences building, and dynamic spaces to spark interdisciplinary collaboration.


Requirements
The Role

Our Data Engineering team builds the core data systems that power frontier research across EIT. As an early member of the team, you'll design and build the platforms used by scientists and engineers in fields such as healthcare, robotics, agriculture, and AI. You'll work alongside our MLOps and Infrastructure teams to create reliable, scalable systems capable of handling large-scale (from TB to PB+), multimodal datasets.


EIT is unique in combining foundational data from diverse disciplines into a single research ecosystem. You'll help develop the technical foundation that makes this possible: platforms, services, APIs and distributed systems that are robust, observable and easy to work with. This is a role for engineers who think long-term and want to build a platform that will underpin the next generation of scientific and technological discovery.


Day-to-Day, You Might:

  • Design and build distributed data systems that support research across EIT's scientific domains.
  • Architect APIs and services for high-throughput, low-latency access to multimodal datasets.
  • Work with MLOps, Infrastructure and data engineers embedded within research teams to integrate systems into active research workflows.
  • Develop pipelines for large-scale text, audio, video, imaging, sensor, and structured data on OCI.
  • Add observability, monitoring, and automated quality checks to ensure the trustworthiness of every dataset.
  • Contribute to an engineering culture that values maintainability, testing, clear system design, and deep collaboration with our researchers and scientists.

What Makes You a Great Fit

  • You have strong programming experience in Python and SQL, and value code quality, reliability (including testing, CI/CD) and observability as much as performance.
  • You have experience designing, deploying, and optimising distributed data systems or data-intensive backend services.
  • You think in terms of systems and longevity, not just one-off ETL scripts, and embrace end-to-end ownership from low-level performance to user interfaces.
  • You're a collaborative partner to Infrastructure/Ops teams and researchers, and a clear, respectful communicator.
  • You have a low-ego, team-first mindset and help grow our engineering culture by mentoring, sharing, and elevating the work of those around you.

Great to Also Have

  • You're used to working with modern tech stacks and developing for distributed systems, for example Spark/Flink/Kafka, Polars/Arrow, Airflow/Prefect.
  • You've contributed to shared Python libraries used across multiple teams and maintained dependency and packaging standards (e.g. Poetry, pip-tools).
  • You have experience integrating multimodal datasets (text, video, imaging, sensor data) into unified platforms.
  • You've designed and optimised robust, high-performance APIs for data ingestion/consumption using tools such as FastAPI, gRPC, and GraphQL, and you rely on observability tools such as Prometheus and OpenTelemetry to maintain SLAs.
  • You're curious about database internals, storage engines, and low-latency query processing.
  • You've built web apps and dashboards using tools such as Dash or frameworks like React.
  • You've managed schema evolution, data versioning, and governance in production with tools such as Open Policy Agent and Apache Hive Metastore.

Benefits

  • Enhanced holiday pay
  • Pension
  • Life Assurance
  • Income Protection
  • Private Medical Insurance
  • Hospital Cash Plan
  • Therapy Services
  • Perkbox
  • Electric Car Scheme

Why work for EIT

At the Ellison Institute, we believe a collaborative, inclusive team is key to our success. We are building a supportive environment where creative risks are encouraged, and everyone feels heard. Valuing emotional intelligence, empathy, respect, and resilience, we encourage people to be curious and to have a shared commitment to excellence. Join us and make an impact!


