Staff Production Operations Engineer

Index Exchange
London
3 weeks ago

We shaped the earliest forms of ad tech, and we’re looking for the technical expertise to help shape its future. Our customers have unique problems that can only be solved at internet scale, and that’s where the technical skills of our team make a real difference.

Our exchange handles over 350 billion requests every day (for comparison, Google serves an estimated 9 billion searches a day), all running in our own global data centers. Every member of our technology team has an enormous amount of autonomy in building and managing our systems to support and enable our growing level of scale. Through the transparency of our technology, dedication to innovation and integrity, and long-standing customer relationships, we lead through change.

What’s it like to work at Index?

We have more than 550 Indexers around the globe dedicated to building a safe and transparent marketplace that provides a trusted experience for consumers.

Index is an exciting and fast-paced place to work. We’re built on our values of change, support, learning and teaching, trust, and intention. Our diverse and inclusive culture celebrates how we can leverage our unique differences to help drive Index forward.

Our culture of success is truly supportive and collaborative. In working together across our teams, we’re continually investing in the people and technology to solve the industry’s most complex problems. As we extend the promise of ad tech to every channel, we’re looking for talented engineers to help move Index, and the industry, forward.

Are you ready to join the programmatic evolution?

Index Exchange funds the open web. Content and journalism across the internet are funded through advertising, and we are the engine that helps to make that happen transparently, safely and efficiently. Handling hundreds of billions of auctions per day within milliseconds requires an intense understanding of the exchange and the ecosystem that we live in.

Our business is growing significantly every year and is poised to grow even faster. Our people and our platforms are the foundation and enabler of that growth. We are significantly expanding our technology teams, and are looking for technologists with a passion for high performance software development, and a drive to deliver software products and platforms that enable and empower industries at a global scale.

About the Team:

The global Production Operations group is integral to ensuring the operational stability and reliability of our worldwide 24x7 on-premises and cloud environments. As the first line of defense, this team owns operations engineering. Collaborating closely with IT, SRE, Network, and Data engineering teams, and key stakeholders across business, product, and software engineering teams, we play a crucial role in maintaining systems health, responding to incidents, and optimizing the performance, efficiency, and stability of complex global systems.

Here's what you'll be doing:

  • Maintain oversight of internal metrics, including the health, security, and performance of on-premises & hybrid-cloud network and systems infrastructure environments.
  • Execute timely and effective incident response, identifying and mitigating issues to minimize downtime.
  • Respond to alerts within our established SLOs and assist in incident triage, ensuring that the right teams are engaged to address issues promptly.
  • Help ensure that system backups, disaster recovery plans, and security protocols are in place and maintained.

Support, Collaboration, and Reporting

  • Serve as a point of contact for operational issues, providing both internal and external teams with technical support and retaining ownership of each issue until resolution.
  • Collaborate with product and software engineering teams to relay operational insights and requirements.

Automation, Tooling & Research

  • Continuously identify opportunities for optimization and present findings to technical leads and management.
  • Research and implement improvements enhancing systems performance and scalability.
  • Continuously research and embrace technological advancements and industry best practices to deliver exceptional service.
  • Actively identify and mitigate risks and escalate them so the team can proactively address present or anticipated operational challenges.
  • Develop, implement, and maintain automation frameworks streamlining operational processes, reducing time spent on manual tasks.
  • Identify catalysts for future optimization, including provisioning techniques, deployment optimization, ancillary services, pipelines, Ansible playbooks, power usage, bandwidth, etc.

Documentation and Knowledge Sharing

  • Draft comprehensive documentation for system configurations, processes, and incident resolution procedures.
  • Participate in knowledge sharing within the team and, with support on content and delivery, provide cross-training to other relevant departments.
  • Create and maintain runbooks and technical documentation, in addition to being familiar with internal and external escalation pathways.

24x7x365

  • You will join a globally distributed team that maintains 24x7 coverage. As a member of this team and broader group, you may be required to occasionally work some weekends, holidays, and after hours to respond to high-urgency or emergency events outside of your local time zone.

Here's what you need:

Technical Expertise

  • In-depth understanding of the Linux operating environment: kernel tuning, network stack tuning, system observability & instrumentation, and security & access management.
  • Solid understanding of layer 2-7 networking fundamentals and the relationship between servers & services, and the transit of their packets through network hardware.
  • In-depth experience engineering and maintaining a private-cloud infrastructure: Bare-metal, vSphere, KVM, Kubernetes.
  • Experience with tools like Ansible, Terraform, Docker, Kafka, Nexus.
  • Experience with observability platforms: InfluxDB, Prometheus, ELK, Jaeger, Grafana, Nagios, Zabbix.
  • Familiarity with Big Data tools: Hadoop, HDFS, Spark, HBase.
  • Ability to write code in Go, Python, Bash, or Perl for automation.

Work Experience

  • 5-7+ years of proven experience in one or more of the following roles:
    • DevOps
    • Linux System Administration
    • Site Reliability Engineering
  • Built or maintained a private-cloud infrastructure running CentOS/Rocky Linux on a mix of bare-metal, virtualization, and containerization.
  • Managed public cloud environments such as AWS, GCP, and Azure, and their federation into on-premises environments.
  • Life-cycle management of bare-metal servers such as Dell and Supermicro in globally distributed data centers (e.g. break-fix, baseband/firmware updates).
  • Built or maintained on-premises and cloud Kubernetes clusters: kubeadm, kind, EKS, GKE.
  • Built or operated automation & orchestration frameworks for deployment & maintenance pipelines (e.g. Kafka, StackStorm, Ansible, Argo CD, Terraform) to push out code or configuration updates and build new infrastructure systems.

Soft Skills

  • Communication: Clear and effective communication within and across teams.
  • Curiosity: Your curiosity will help drive you to identify and fix the things that go wrong.
  • Alertness: Be vigilant and prepared to respond when things go wrong.
  • Analytical Thinking: Monitor and analyze activity, and collaborate with other departments to maintain technical defense.
  • Reliability: Prioritize the reliability of our systems.
  • Continuous Improvement: Embrace a culture of continuous learning and innovation.
  • Customer-Centricity: Committed to providing the best possible experience for our customers.
  • Accountability: Take ownership of our responsibilities.

Why You’ll Love Working Here:

  • Comprehensive health, dental, and vision plans at no cost to you.
  • Time off and flexible work schedules.
  • Retirement plan with a 5% company match.
  • Stock options and equity packages.
  • Generous parental leave.
  • Monthly wellness stipend plus fitness discounts and quarterly wellness group activities.
  • Community engagement opportunities and donation-matching program.
  • Annual virtual company retreats and regular community-led team events.
  • One day off per year to volunteer.
  • A workplace that supports a diverse, equitable, and inclusive environment.

Notification

Index Exchange is aware that there have been recent scams directed toward candidates regarding job interviews and offers. Please be vigilant and do not accept interview requests, job offers, or other hiring-related documents from anyone other than our dedicated recruitment team, who will contact you from the @indexexchange.com domain.

Our interview process consists of several steps, including phone screens and video interviews. We do not conduct interviews via an email questionnaire or request money at any point in the process.

We remain dedicated to resolving this matter and we appreciate your support.

Equal employment opportunity

At Index Exchange, we believe that successful products are built by teams just as diverse as the audience who uses them. As such, we are committed to equal employment opportunities. We celebrate diversity of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or expression, and veteran status.

Accessibility for applicants with disabilities

Index Exchange is committed to working with and providing access and reasonable accommodations to applicants with disabilities. Please let us know if you’d like to request a reasonable accommodation.

Index Everywhere, Index Anywhere

Our corporate headquarters are in Toronto, with major offices in New York, Montreal, Kitchener, London, San Francisco, and many other global cities.
