
Data Engineer

Expleo
London
3 weeks ago
Applications closed

Overview

We are looking for a data engineer experienced in DevOps-based pipeline delivery who can not only develop pipelines but also establish the foundational framework for reusable data ingestion processes. The ideal candidate is proactive, a self-starter, and demonstrates a strong can-do attitude.

While not essential, experience with Health Data systems would be highly advantageous.

Responsibilities

  • Ingestion Framework Delivery: Responsible for building reusable, metadata-driven data pipelines within a framework to handle batch and near-real-time data feeds.
  • Data Pipeline Development: Develop end-to-end data pipelines, including data load patterns, error handling, automation, and hardware optimisation.
  • Requirements Formulation: Collaborate with Business Analysts, Architects, SMEs, and business teams to define requirements and implement solutions using modern EDW cloud tools and best practices.
  • Detailed Solution Design: Work with architects and analysts to create detailed solution designs for data pipelines, ensuring adherence to policies, rules, and standards.
  • Promote DevOps best practices for iterative solution delivery including CI/CD, version control, monitoring and alerting, automated testing and IaC.
  • Data Modelling and Warehousing: Build and optimise pipelines to populate data stores such as DWH, Lakehouse, and other repositories, following industry and clinical standards like openEHR, FHIR and OMOP.
  • Data Quality & Governance: Apply data quality checks, validation rules, and governance policies to maintain the accuracy, completeness, and reliability of clinical data, and address any data discrepancies.
  • Data Integration: Integrate various clinical datasets, ensuring proper mapping, standardisation, and harmonisation across systems and terminologies (e.g., SNOMED CT, LOINC, ICD-10).
  • Performance Optimisation: Monitor and enhance the performance of data pipelines, warehouses, and queries for efficient data processing.
  • Operational Controls: Apply operational procedures, security practices, and production policies to ensure high-quality service delivery.
  • Collaboration: Work with clinical stakeholders, data scientists, analysts, and other professionals to define data requirements and deliver technical solutions. Lead showcase sessions after each delivery.
  • Documentation: Maintain comprehensive technical documentation for data architectures, pipelines, models, metadata, and processes.
  • Troubleshooting & Support: Provide technical support and resolve issues related to data pipelines and data quality.
  • Innovation & Best Practices: Stay updated on new data engineering technologies and best practices, especially in healthcare, and recommend adoption as needed.
  • Lead proofs of concept and pilots, and develop data pipelines using agile and iterative methods.
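To make the first two responsibilities concrete, here is a minimal, hedged sketch of the metadata-driven pattern: the feed definition (name, target table, required columns) lives in configuration rather than code, so a new feed is onboarded by adding metadata instead of a new pipeline. All names (the `observations` feed, `stg_observations` table, column names) are illustrative assumptions, not part of the role's actual framework, and SQLite stands in for whatever staging store the team uses.

```python
# Minimal sketch of a metadata-driven ingestion step. The feed definition
# lives in config, not code, so new feeds are added as metadata entries.
import csv
import io
import sqlite3

FEED_CONFIG = {
    "observations": {                        # hypothetical feed name
        "target_table": "stg_observations",  # hypothetical staging table
        "required_columns": ["patient_id", "code", "value"],
    },
}

def ingest(feed_name: str, raw_csv: str, conn: sqlite3.Connection) -> int:
    """Validate a CSV feed against its metadata and load it into staging."""
    meta = FEED_CONFIG[feed_name]
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    # Data quality gate: reject the whole batch if mandatory columns are missing.
    missing = [c for c in meta["required_columns"] if rows and c not in rows[0]]
    if missing:
        raise ValueError(f"{feed_name}: missing columns {missing}")
    cols = meta["required_columns"]
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {meta['target_table']} ({', '.join(cols)})"
    )
    conn.executemany(
        f"INSERT INTO {meta['target_table']} VALUES ({', '.join('?' * len(cols))})",
        [tuple(r[c] for c in cols) for r in rows],
    )
    return len(rows)

conn = sqlite3.connect(":memory:")
loaded = ingest("observations", "patient_id,code,value\np1,8480-6,120\n", conn)
```

In a production framework the same idea scales up: the config becomes a metadata store, the validation step becomes a full data quality rule set, and the load step targets the DWH or Lakehouse, but the pipeline code itself stays generic.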

Qualifications

  • Certifications such as DP-203 and AZ-900, or similar certification/experience.

Essential skills

  • Experience working with healthcare data to build a healthcare data store, including standards and interoperability protocols (e.g., openEHR, FHIR, HL7, DICOM, CDA), would be a significant plus.
  • Experience converting healthcare data to develop an ODS or DWH.
  • Experience integrating data analytical outcomes and key information into clinical workflows.

Desired skills

  • Knowledge of streaming data architectures and technologies (e.g., Kafka, Azure Event Hubs, Kinesis).
  • Knowledge of handling genomic datasets (FASTQ, VCF, etc.) and document formats (IHE).
  • General experience working with Gen AI, including LLM-generated clinical data/summaries.
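Kafka, Azure Event Hubs and Kinesis are external services, so as a stand-in the sketch below shows only the micro-batching pattern a consumer of those systems typically applies: drain incoming events into small batches and hand each batch to the same load routine used for batch feeds. A stdlib queue is an assumption standing in for the real transport; the batch size and event names are illustrative.

```python
# Micro-batching sketch: a stdlib Queue stands in for a streaming transport
# such as Kafka or Event Hubs. Events are drained into bounded batches so
# near-real-time feeds can reuse the batch load path.
from queue import Empty, Queue

def micro_batches(q: Queue, batch_size: int = 3):
    """Yield lists of up to batch_size events until the queue is drained."""
    batch = []
    while True:
        try:
            batch.append(q.get_nowait())
        except Empty:
            break
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch

q = Queue()
for event in ["e1", "e2", "e3", "e4"]:
    q.put(event)
batches = list(micro_batches(q))  # [["e1", "e2", "e3"], ["e4"]]
```

A real consumer would also track offsets/checkpoints and handle retries, which the managed services above provide; the batching logic itself is the reusable part.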

Experience

  • Extensive background as a data engineer, specialising in data warehouse environments and building various types of data pipelines.
  • Demonstrated ability to design and implement data integration and conversion pipelines using ETL/ELT tools, accelerators, and frameworks such as Azure Data Factory, Azure Synapse, Snowflake (Cloud), SSIS, or custom scripts.
  • Skilled in developing reusable ETL frameworks for data processing.
  • Proficient in at least one programming language commonly used for data manipulation and scripting, including Python, PySpark, Java, or Scala.
  • Strong understanding and hands-on experience with DevOps practices and tools, especially Azure DevOps for CI/CD, Git for version control, and Infrastructure as Code.
  • Advanced SQL skills and experience working with relational databases like PostgreSQL, SQL Server, Oracle, and MySQL.
  • Experience implementing solutions on cloud-based data platforms such as Azure, Snowflake, Google Cloud, and related accelerators.
  • Experience with developing and deploying containerised microservices architectures.
  • Understanding of data modelling techniques, including star schema, snowflake schema, and third normal form (3NF).
  • Proven track record with DevOps methodologies in data engineering, including CI/CD and Infrastructure as Code.
  • Knowledge of data governance, data quality, and data security in regulated environments.
  • Experience mapping data from unstructured, semi-structured, and proprietary structured formats within clinical data stores.
  • Strong interpersonal, communication, and documentation abilities, enabling effective collaboration between clinical and technical teams.
  • Experience working in Agile development settings.
  • Outstanding analytical, problem-solving, and communication skills.
  • Ability to work independently as well as collaboratively within a team.
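Since the experience list cites star and snowflake schemas, here is a hedged sketch of the star schema shape: one fact table keyed to conformed dimensions. The table and column names (a patient dimension, a date dimension, an observation fact carrying a LOINC code) are illustrative assumptions rather than a prescribed clinical model, and SQLite stands in for the warehouse.

```python
# Star schema sketch: a central fact table joined to dimension tables by
# surrogate keys. Names are illustrative, not a prescribed clinical model.
import sqlite3

DDL = """
CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY, nhs_number TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, iso_date TEXT);
CREATE TABLE fact_observation (
    patient_key INTEGER REFERENCES dim_patient(patient_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    loinc_code  TEXT,   -- standard terminology code (e.g. LOINC)
    value       REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
conn.execute("INSERT INTO dim_patient VALUES (1, '9990000001')")
conn.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01')")
conn.execute("INSERT INTO fact_observation VALUES (1, 20240101, '8480-6', 120.0)")

# A typical analytical query joins the fact table to its dimensions:
row = conn.execute("""
    SELECT d.iso_date, f.loinc_code, f.value
    FROM fact_observation f
    JOIN dim_date d ON d.date_key = f.date_key
""").fetchone()
```

A snowflake schema would further normalise the dimensions into sub-tables, and 3NF would normalise everything; the star layout trades some redundancy for simpler, faster analytical joins.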

Benefits

  • Collaborative working environment – we stand shoulder to shoulder with our clients and our peers through good times and challenges
  • We empower all passionate technology loving professionals by allowing them to expand their skills and take part in inspiring projects
  • ExpleoAcademy - enables you to acquire and develop the right skills by delivering a suite of accredited training courses
  • Competitive company benefits
  • Always working as one team, our people are not afraid to think big and challenge the status quo
  • As a Disability Confident Committed Employer we have committed to:
    • Ensuring our recruitment process is inclusive and accessible
    • Communicating and promoting vacancies
    • Offering an interview to disabled people who meet the minimum criteria for the job
    • Anticipating and providing reasonable adjustments as required
    • Supporting any existing employee who acquires a disability or long-term health condition, enabling them to stay in work
    • Carrying out at least one activity that will make a difference for disabled people

“We are an equal opportunities employer and welcome applications from all suitably qualified persons regardless of their race, sex, disability, religion/belief, sexual orientation or age”.

We treat everyone fairly and equitably across the organisation, including providing any additional support and adjustments needed for everyone to thrive.


