Principal Data Scientist

Wellcome Sanger Institute
Hinxton
9 months ago
Applications closed


Do you want to help us improve human health and understand life on Earth? Make your mark by shaping the future to enable or deliver life-changing science to solve some of humanity’s greatest challenges.

Principal Research Data Scientist

We seek a Principal Machine Learning Research Data Scientist to join a collaborative project between the Wellcome Sanger Institute and Open Targets. This project aims to leverage datasets generated internally at the Sanger Institute, together with publicly available data from human cells, to create foundational models for biology, enhancing our understanding of life's rules and improving health for all. You will work within an interdisciplinary team of life scientists and computer/ML scientists, with a shared objective of advancing biological research through these foundational models. This role will sit within the AI/ML Faculty group led by Dr. Mohammad Lotfollahi, and the successful candidates, across different seniority levels (senior and principal), will be responsible for delivering their portfolio of scientific research projects as part of the broader team strategy.

About the Role

Your role will involve designing foundational models that leverage multi-modal readouts. This includes integrating and processing data from various sources to develop robust and versatile AI models. To achieve this, you will work with open-source software, proposing, developing, and maintaining new solutions to analyze and interpret large-scale single-cell datasets. We have access to unique data and are also in a position to generate new data to train unique models. Additionally, we have substantial computational power and GPU resources to train large models efficiently.

Our teams are well-positioned to tackle this problem, with experience in both generating and analyzing datasets that include millions of cells across multiple tissues and conditions (e.g., disease, healthy). The role calls for a detailed understanding of how large-scale ML models are trained and a track record of undertaking large data-science projects.

You will be responsible for:

  • Independently managing and leading machine learning research projects, and writing up outcomes as scientific publications for submission to journals or machine learning conferences (ICLR, ICML, CVPR, etc.).
  • Collaborating with team members to propose, develop, and evaluate new machine learning models that enable understanding of single-cell data and its application in drug discovery.
  • Working with Ph.D. students and postdocs in collaborating teams to develop solutions for interdisciplinary scientific problems in biology, and providing supervision and training to junior members of the team.
  • Contributing to scientific papers on biotechnology and biology.
  • Distilling your solutions into open-source, easy-to-install packages with documentation that makes them easy to use for downstream users, including biologists and bioinformaticians.
  • Presenting your research and analysis pipelines to internal and external audiences.

About You:

You will be supported in your personal and professional development and have the opportunity to lead peer-reviewed publications around using genetics and genomics approaches to guide drug discovery and present them at national and international conferences.

Essential Skills:

● Ph.D. or M.Sc. with equivalent research experience in a relevant quantitative discipline (e.g., Computer Science, Computational Biology, Genetics, Bioinformatics, Physics, Engineering, or Applied Statistics/Mathematics)

● Previous ML work experience in a scientific or academic environment (research assistantships and internships count as work experience)

● Strong knowledge of Python, including core data science libraries such as Scikit-Learn, SciPy, TensorFlow, and PyTorch.

● Expertise in machine learning algorithms and frameworks, with experience in designing, training, and deploying ML models.

● Proficiency in handling and processing large datasets, including techniques for data cleaning, feature engineering, and data augmentation.

● Experience with high-performance computing environments, including the use of GPUs for training large-scale machine learning models.

● Experience in natural language processing (NLP) and training models based on transformer architectures, such as BERT and GPT.

● Familiarity with generative models such as diffusion models and flow matching.

● Knowledge of good software development practices and collaboration tools, including git-based version control, Python package management, and code reviews.

● Strong problem-solving skills with the ability to analyze complex data and derive actionable insights.

● Excellent communication skills, with the ability to explain complex machine learning algorithms and statistical methods to non-technical stakeholders.

● Evidence of related work experience as a researcher in the area of machine learning.

● Strong publication record, ideally including first-author papers.

In addition to the above technical skills, you will also have the following:

  • Ability to quickly understand scientific, technical, and process challenges and break down complex problems into actionable steps
  • Ability to work in a frequently changing environment with the capability to interpret management information to amend plans
  • Ability to prioritize, manage workload, and deliver agreed activities consistently on time
  • Good networking, influencing, and relationship-building skills
  • Strategic thinking: the ability to see the ‘bigger picture’
  • Ability to build collaborative working relationships with internal and external stakeholders at all levels
  • Commitment to inclusivity and respect for all

Relevant publications from the group:

  • Lotfollahi, M., Naghipourfar, M., Luecken, M. D., Khajavi, M., Büttner, M., Wagenstetter, M., Avsec, Ž., Gayoso, A., Yosef, N., Interlandi, M. & others. Mapping single-cell data to reference atlases by transfer learning. Nature Biotechnology, 1–10.
  • Lotfollahi, M., Wolf, F. A. & Theis, F. J. scGen predicts single-cell perturbation responses. Nature Methods 16, 715–721.
  • Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A. V. & Theis, F. J. Biologically informed deep learning to query gene programs in single cell atlases. Nature Cell Biology.
