18 Reliability Engineer jobs in Australia
Reliability Engineer - Mechanical

Posted 1 day ago
Job Description
We work together to transform essential resources into critical ingredients for mobility, energy, connectivity and health. Join our values-led organization committed to building a more resilient world with people and planet in mind. Our core values are the foundation that makes us successful for ourselves, our customers and the planet.
**Job Description**
Albemarle is hiring for a Mechanical Reliability Engineer. This position is located on site at the Kemerton Processing Plant.
**What You Will Do**
+ Primary lead in the development/review/improvement of maintenance and spares strategies that work to drive the safe reliable operation of production equipment to help deliver required plant performance while helping to reduce the overall costs per tonne.
+ Participate in maintenance planning and, using plant knowledge, help predict major maintenance requirements, develop KPIs, and report on them.
+ Assist in the management and implementation of the CMMS architecture in line with site standards and in light of best practice.
+ Provide mechanical engineering support of refinery static and rotating equipment in the form of equipment/material selection, design calculations, fitness for service reviews, repair methods, mechanical design approval, and equipment specification with the aid of and adherence to relevant codes and standards including OSHA 1984, AS 3788, API 579, ASME PCC-2, etc.
+ Maintain and foster collaboration with all departments to ensure success in the role.
+ Ensure relevant equipment is operated and maintained according to the applicable standards and statutory compliance is adhered to.
+ Lead activities/projects that focus on reliability and cost improvement of poor performing assets.
+ Assist with project implementation to ensure the reliability and maintainability of new and modified installations is in line with plant needs.
+ Participate in the development of design and installation specifications.
+ Support, utilise, and follow the site management of change process when required, to ensure changes are adequately risk assessed and required updates/changes are correctly implemented.
+ Provide input to Risk Management processes, including management of change, which helps to anticipate reliability-related and non-reliability-related risks that could adversely impact plant operation.
**What You Bring**
**Required:**
+ Tertiary level qualification in Mechanical Engineering or other relevant discipline coupled with a minimum of five years of post-graduate reliability and/or maintenance engineering experience.
+ Working understanding of asset management practices and experience in the development of effective maintenance strategies.
+ Excellent problem solving, investigative, and data analysis skills.
+ A sound understanding of available NDT and condition monitoring tools and equipment, and how they can be utilised to support reliability improvement and plant management.
+ Experience in the development of effective maintenance strategies.
**Preferred:**
+ Post graduate study in best practice RCM techniques and relevant equipment
**Benefits of Joining Albemarle**
+ Competitive compensation
+ Comprehensive benefits package
+ A diverse array of resources to support you professionally and personally.
We are partners to one another in pioneering new ways to be better for ourselves, our teams, and our communities. When you join Albemarle, you become our most essential element and you can anticipate competitive compensation, a comprehensive benefits package, and resources that foster your well-being and fuel your personal growth. Help us shape the future, build with purpose and grow together.
Associate Site Reliability Engineer

Posted 1 day ago
Job Description
As an Associate Site Reliability Engineer, you will support the reliability, scalability, and performance of our applications in SEAu. This entry-level role is ideal for candidates with foundational experience in software engineering and/or system operations who are eager to grow in a high-impact, collaborative DevSecOps environment.
**What you'll be doing**
Key Responsibilities:
+ Assist in monitoring and maintaining production systems and services.
+ Support incident response efforts and contribute to root cause analysis.
+ Participate in automating deployment, monitoring, and operational tasks.
+ Collaborate with development and QA teams to support new feature rollouts.
+ Contribute to documentation of operational procedures and runbooks.
+ Learn and apply best practices in system reliability, observability, and performance tuning.
**What you bring**
To succeed in the role, you will have:
+ Relevant years of experience in software engineering, DevOps, or IT operations (internships or academic projects acceptable).
+ Familiarity with basic shell scripting.
+ Exposure to cloud platforms (e.g., Azure, AWS)
+ Basic knowledge of programming languages such as Python, Go, or Java.
+ Understanding of CI/CD pipelines and version control systems (e.g., Git).
+ Strong problem-solving and communication skills.
+ Bachelor's degree in Computer Science, Engineering, or a related field.
Desirable:
+ Exposure to monitoring tools (e.g., Dynatrace, Elastic, Grafana).
+ Understanding of networking fundamentals and distributed systems.
+ Interest in automation, self-healing, infrastructure as code, and SRE principles.
**What we offer**
You bring your skills and experience to Shell and in return you work with talented, committed people on one of the most important challenges facing our planet. You'll have the opportunity to develop the skills you need to grow in an environment where we value honesty, integrity, and respect for one another. You'll be able to balance your priorities as you become the best version of yourself.
+ Progress as a person as we work on the energy transition together.
+ Continuously grow the transferable skills you need to get ahead.
+ Work at the forefront of technology, trends, and practices.
+ Collaborate with experienced colleagues with unique expertise.
+ Achieve your balance in a values-led culture that encourages you to be the best version of yourself.
+ Benefit from flexible working hours, and the possibility of remote/mobile working.
+ Perform at your best with a competitive starting salary and annual performance-related salary increase - our pay and benefits packages are considered to be among the best in the world.
+ Take advantage of paid parental leave, including for non-birthing parents.
+ Join an organisation working to become one of the most diverse and inclusive in the world. We strongly encourage applicants of all genders, ages, ethnicities, cultures, abilities, sexual orientation, and life experiences to apply.
+ Grow as you progress through diverse career opportunities in national and international teams.
+ Gain access to a wide range of training and development programmes.
Note: We are keen to support flexible working arrangements, subject to local regulations and legislative frameworks. If this is of interest to you, please describe in your application the type of flexible working arrangements for which you would like to be considered (e.g., part-time, job share).
We'd like you to know that Shell has a bold goal: to become one of the world's most diverse and inclusive companies.
We are committed to attracting a broader and more diverse pool of candidates. If this position doesn't feel like the perfect fit for your qualifications right now, we'd still love to hear from you. Consider creating a profile in our Talent Community so we can keep you in mind for future opportunities that may align with your skills.
**Shell in Australia**
Shell has operated in Australia since 1901. Our history runs from operating Australia's first oil refinery, which was central to meeting Australia's fuel needs, to fuelling the first Qantas commercial flight in the 1920s, to playing a foundation role in building some of Australia's largest and most innovative natural resource developments.
Throughout this 124-year relationship the needs of our customers and the nation have changed and we have continued transforming our portfolio to meet these needs. Today, we are a leading natural gas producer and are playing our part in the transition to a low-carbon future by investing in the power sector, renewable energy sources and carbon abatement activities.
Shell has a significant Liquefied Natural Gas (LNG) business in Australia that makes a valuable contribution to today's energy supply. This integrated gas portfolio includes our two Shell-operated gas production and liquefaction businesses, Shell QGC in Queensland and Prelude Floating LNG offshore in Western Australia, and our joint venture interests in Gorgon and North West Shelf in Western Australia and Arrow Energy in Queensland.
Today, Shell's portfolio in Australia also includes zero- and low-carbon energy businesses such as commercial and industrial retailer Shell Energy; carbon farming specialist Select Carbon; the 120MW Gangarri solar development; residential energy retailer Powershop Australia; a 49% stake in WestWind Australia; a 50% share of Kondinin Energy; and several grid-scale Battery Energy Storage Solutions projects. High quality Shell branded fuels and lubricants are available right across Australia, through an exclusive brand license arrangement with Viva Energy.
Site Reliability Engineer - SPP

Posted 1 day ago
Job Description
**Do you**
+ know Linux in various levels of diagnostics and troubleshooting?
+ write code to automate repetitive tasks every time you face repetitive work?
+ smile when you solve an issue in Frankfurt from your laptop in Sydney?
Answer 'yes' to these questions and we would like to hear from you. Go ahead, hit the Apply button and let's have a chat about your skills and experiences.
**Want to know more about us?**
Now that we have set the pace, keep reading if you want to understand more about the role and the SRE team. We hope it will be helpful.
**Let's start with the role**
**As a Site Reliability Engineer, you will**
+ Provide relief and sustainable resolution to issues within our infrastructure.
+ Use your experience in software development, systems engineering and networking to proactively prevent repeatable issues.
+ Drive initiatives with partner teams to improve the reliability and performance of the infrastructure through improved system design.
+ Drive a culture of intolerance to manual activity which results in a highly automated environment delivering scalable solutions.
**_Note:_** _This is a full-time position with a four-day workweek. Working hours are from 11:00 PM to 9:00 AM. Weekend shifts are fixed and will be discussed in detail during the interview process._
**This is what we require. Take note, because these are must-haves:**
+ Knowledge of Linux systems.
+ Coding experience; we normally prefer Python or JavaScript.
+ Networking skills, IP addressing, routing protocols.
+ Monitoring of systems, applications and networks.
+ Uncompromising attention to detail.
**We also have pluses!**
These are not a 'must', but please highlight them on your resume if you have:
+ Experience in cloud architecture or web applications engineering.
+ Experience in database performance, replication, and high availability.
+ A bachelor's or master's degree in a technical area.
**_Note: Australian citizenship and the ability to obtain a baseline security clearance are requirements for this role._**
**Now a bit about the SRE team**
The SRE team is a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability and performance of the ServiceNow infrastructure. The SRE is empowered to drive technical resolutions across the technology stack from hardware through to application and all stops in between. They are also tasked with driving forward the operability of the platform to drive down the number of incidents and to reduce MTTR.
To accomplish this the team combines software development, networking and systems engineering expertise with a strong desire to be challenged by problems of scale and complexity and to make services better for our customers.
**Work Personas**
We approach our distributed world of work with flexibility and trust. Work personas (flexible, remote, or required in office) are categories that are assigned to ServiceNow employees depending on the nature of their work and their assigned work location. To determine eligibility for a work persona, ServiceNow may confirm the distance between your primary residence and the closest ServiceNow office using a third-party service.
**Equal Opportunity Employer**
ServiceNow is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, national origin or nationality, ancestry, age, disability, gender identity or expression, marital status, veteran status, or any other category protected by law. In addition, all qualified applicants with arrest or conviction records will be considered for employment in accordance with legal requirements.
**Accommodations**
We strive to create an accessible and inclusive experience for all candidates. If you require a reasonable accommodation to complete any part of the application process, or are unable to use this online application and need an alternative method to apply, please contact for assistance.
**Export Control Regulations**
For positions requiring access to controlled technology subject to export control regulations, including the U.S. Export Administration Regulations (EAR), ServiceNow may be required to obtain export control approval from government authorities for certain individuals. All employment is contingent upon ServiceNow obtaining any export license or other approval that may be required by relevant export control authorities.
From Fortune. ©2025 Fortune Media IP Limited. All rights reserved. Used under license.
Site Reliability Engineer, Spanner

Posted 1 day ago
Job Description
Minimum qualifications:
+ Bachelor's degree in Computer Science, a related field, or equivalent practical experience.
+ 1 year of experience in coding in one or more of the following programming languages: C, C++, Java, Python, Go.
+ Experience in optimizing code for stability, functionality and scalability (e.g., crawling, search, troubleshooting).
Preferred qualifications:
+ 1 year of experience in coding in one or more of the following programming languages: C, C++, Java, Python, Go.
+ Experience in one or more of the following: C++, TypeScript, and Go.
+ Experience in analyzing and troubleshooting large-scale distributed systems.
+ Ability to manage periodic on-call duty as well as out-of-band requests.
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services (both our internally critical and our externally visible systems) have reliability, uptime appropriate to customers' needs, and a fast rate of improvement. Additionally, SREs will keep an ever-watchful eye on our systems' capacity and performance.
Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding, algorithms, complexity analysis and large-scale system design. SRE's culture of intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.
+ Manage Spanner SRE and deliver critical projects.
+ Ensure Spanner customers can help themselves with debugging and mitigation.
+ Expand Spanner to serve customers in new ways under new conditions and restrictions.
+ Improve the overall Spanner observability.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you have a need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Senior Site Reliability Engineer

Posted 1 day ago
Job Description
25WD88723
**Position Overview**
Do you want the opportunity to be part of a startup environment working on a new product seeking to become a world-leading integration platform? Are you looking to be at the forefront of innovative new technology that will ultimately help people imagine, design, and make a better world? If so, come join the Tandem Connect team at Autodesk! Working with the Tandem team, our mission is to create integration technology and solutions that will transform how buildings are designed, built, and operated.
We are seeking a creative Senior Site Reliability Engineer who has experience building and maintaining scalable, reliable and modern cloud services to join our team today.
**Responsibilities**
+ Maintain a secure, scalable and resilient platform that our customers can trust. This includes the implementation of Autodesk and industry best practices and standards
+ Manage and optimise the security, performance, reliability, and scalability of Kubernetes clusters on Amazon EKS
+ Administer and troubleshoot MongoDB Atlas, AWS MemoryDB (Redis), RabbitMQ on Amazon MQ, and Kafka on Amazon MSK.
+ Design, implement and maintain effective monitoring of the platform and associated components
+ Support other teams with the implementation of their infrastructure requirements
+ Contribute to the design and implementation of resilient and scalable architectures, including high availability and disaster recovery strategies
+ Provision and manage infrastructure using Terraform, ensuring meticulous configuration management and documentation
+ Set up and maintain monitoring and logging systems, such as Prometheus, Dynatrace, Amazon CloudWatch and other tools
+ Collaborate with cross-functional teams to resolve complex issues and mentor junior engineers
+ Share your knowledge and learnings with the infrastructure guild
+ Partner closely with the product development, architecture teams and other stakeholders to identify and implement improvements to the product infrastructure and operations
+ Contribute to improvements in processes, tools, and technical methodologies that increase the effectiveness and efficiency of the team in responding to customer and business needs, with an emphasis on having an efficient CI/CD process
+ Provide technical guidance and constructive feedback to team members and stakeholders, which includes writing, reading, and reviewing plans, designs and scripts, and participating in the various technical feedback loops happening within the organisation
+ Contribute to technical product roadmaps
+ On Call support as part of a rostered escalation process
**Minimum Qualifications**
+ BS or MS in computer science, related technology field, or equivalent experience
+ You have at least 7 years of hands-on experience with operating and managing virtual software (with the majority managing containerised workloads) and high traffic customer-facing enterprise solutions in production environments
+ Expertise in defining and managing Kubernetes-based workloads that scale
+ Ability to configure and customize Linux-based operating environments based on application needs
+ Strong understanding of TCP/IP and virtual networking technologies, including Kubernetes Network Policies and Amazon CloudFront
+ Ability to perform automated testing using Cypress
+ Experience with performing live database upgrades
+ Adept at writing and managing Helm and Terraform scripts using GitOps principles
+ Knowledge in integrating password management systems with Infrastructure as Code
+ Proficient in using bash and Python to integrate with network services
+ Extensive experience with creating customized Docker images
+ Extensive experience with DevOps and DevSecOps-based SDLC practices
+ Good understanding of security principles at the network, server, and container levels
+ In-depth understanding of the software development lifecycle (SDLC)
+ Working experience with MongoDB, Redis, Kafka, RabbitMQ, Vault, Consul and equivalent AWS services, including live data migration with minimal downtime
+ Experience with CI/CD and building deployment pipelines using Jenkins and Rundeck.
+ Experience with running load tests and benchmarking tools
+ Strong written and oral communication skills in English
+ Ability to operate effectively and independently in a dynamic, fluid environment
+ Detail-oriented approach to building secure, stable software
+ Experience with Agile development practices such as Scrum or Kanban
**Preferred Qualifications**
+ Amazon Web Services (AWS) experience.
+ Experience with integration-Platform-as-a-Service (iPaaS) offerings.
+ Ability to read and write Node.js code
+ Experienced with supporting Kubernetes-based MQTT Brokers using the Aedes MQTT software
**Learn More**
**About Autodesk**
Welcome to Autodesk! Amazing things are created every day with our software - from the greenest buildings and cleanest cars to the smartest factories and biggest hit movies. We help innovators turn their ideas into reality, transforming not only how things are made, but what can be made.
We take great pride in our culture here at Autodesk - our Culture Code is at the core of everything we do. Our values and ways of working help our people thrive and realize their potential, which leads to even better outcomes for our customers.
When you're an Autodesker, you can be your whole, authentic self and do meaningful work that helps build a better future for all. Ready to shape the world and your future? Join us!
**Salary transparency**
Salary is one part of Autodesk's competitive compensation package. Offers are based on the candidate's experience and geographic location. In addition to base salaries, we also have a significant emphasis on discretionary annual cash bonuses, commissions for sales roles, stock or long-term incentive cash grants, and a comprehensive benefits package.
**Diversity & Belonging**
We take pride in cultivating a culture of belonging and an equitable workplace where everyone can thrive.
**Are you an existing contractor or consultant with Autodesk?**
Please search for open jobs and apply internally (not on this external site).
Site Reliability Engineer - SPP
Posted today
Job Description
== ServiceNow ==
Role Seniority - mid level
More about the Site Reliability Engineer - SPP role at ServiceNow
Company Description
It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today — ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Our intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work. But this is just the beginning of our journey. Join us as we pursue our purpose to make the world work better for everyone.
Job Description
Do you
know Linux in various levels of diagnostics and troubleshooting?
write code to automate repetitive tasks every time you face repetitive work?
smile when you solve an issue in Frankfurt from your laptop in Sydney?
Answer 'yes' to these questions and we would like to hear from you. Go ahead, hit the Apply button and let's have a chat about your skills and experiences.
Want to know more about us?
Now that we have set the pace, keep reading if you want to understand more about the role and the SRE team. We hope it will be helpful.
Let’s start with the role
As a Site Reliability Engineer, you will
Provide relief and sustainable resolution to issues within our infrastructure.
Use your experience in software development, systems engineering and networking to proactively prevent repeatable issues.
Drive initiatives with partner teams to improve the reliability and performance of the infrastructure through improved system design.
Drive a culture of intolerance to manual activity which results in a highly automated environment delivering scalable solutions.
Note: This is a full-time position with a four-day workweek. Working hours are from 11:00 PM to 9:00 AM. Weekend shifts are fixed and will be discussed in detail during the interview process.
Qualifications
This is what we require. Take note, because these are must-haves:
Knowledge of Linux systems.
Coding experience; we normally prefer Python or JavaScript.
Networking skills, IP addressing, routing protocols.
Monitoring of systems, applications and networks.
Uncompromising attention to detail.
We also have pluses!
These are not a 'must', but please highlight them on your resume if you have:
Experience in cloud architecture or web applications engineering.
Experience in database performance, replication, and high availability.
A bachelor's or master's degree in a technical area.
Note: Australian citizenship and the ability to obtain a baseline security clearance are requirements for this role.
Now a bit about the SRE team
The SRE team is a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability and performance of the ServiceNow infrastructure. The SRE is empowered to drive technical resolutions across the technology stack from hardware through to application and all stops in between. They are also tasked with driving forward the operability of the platform to drive down the number of incidents and to reduce MTTR.
To accomplish this the team combines software development, networking and systems engineering expertise with a strong desire to be challenged by problems of scale and complexity and to make services better for our customers.
Additional Information
Work Personas
We approach our distributed world of work with flexibility and trust. Work personas (flexible, remote, or required in office) are categories that are assigned to ServiceNow employees depending on the nature of their work and their assigned work location. To determine eligibility for a work persona, ServiceNow may confirm the distance between your primary residence and the closest ServiceNow office using a third-party service.
Equal Opportunity Employer
ServiceNow is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, national origin or nationality, ancestry, age, disability, gender identity or expression, marital status, veteran status, or any other category protected by law. In addition, all qualified applicants with arrest or conviction records will be considered for employment in accordance with legal requirements.
Accommodations
We strive to create an accessible and inclusive experience for all candidates. If you require a reasonable accommodation to complete any part of the application process, or are unable to use this online application and need an alternative method to apply, please contact for assistance.
Export Control Regulations
For positions requiring access to controlled technology subject to export control regulations, including the U.S. Export Administration Regulations (EAR), ServiceNow may be required to obtain export control approval from government authorities for certain individuals. All employment is contingent upon ServiceNow obtaining any export license or other approval that may be required by relevant export control authorities.
Before we jump into the responsibilities of the role: no matter what you come in knowing, you’ll be learning new things all the time and the ServiceNow team will be there to support your growth.
Please consider applying even if you don't meet 100% of what’s outlined.
Key Responsibilities
- Providing sustainable resolutions
- Proactively preventing issues
- Driving initiatives
Key Strengths
- Linux systems
- Coding experience
- Networking skills
- Cloud architecture
- Database performance
- Technical degree
Why ServiceNow is partnering with Hatch on this role
Hatch exists to level the playing field for people as they discover a career that’s right for them. So when you apply you have the chance to show more than just your resume.
A Final Note: This is a role with ServiceNow not with Hatch.
Senior Site Reliability Engineer
Posted today
Job Description
== Commonwealth Bank ==
Role Seniority - senior
More about the Senior Site Reliability Engineer role at Commonwealth Bank
You are passionate about cutting code and software engineering practices.
We are undergoing one of Australia’s largest digital transformations
Together we can reimagine banking for millions of customers
Do work that matters
We're building tomorrow’s bank today, which means we need creative and diverse engineers to help us redefine what customers expect from a bank. Envisioning new technologies that are still waiting to be invented and reimagining products that support our customers and help build Australia’s future economy.
See yourself in our team
We’re accelerating our digital strategy with an ambition to provide customers with one of the best digital experiences of any company globally. Site Reliability Engineering (SRE) is key to us achieving this goal. Our teams ensure that our systems maintain the highest standards of service outcomes for our customers, which enables seamless execution of our award-winning banking apps.
We're proud of our people and technology culture. Our SRE team marries both by applying software engineering principles to our operational services. We implement the latest industry-wide methodologies around observability practices.
We support our people with the flexibility to balance where work is done with at least half your time each month connecting in office. We also have many other flexible working options available including changing start and finish times, part-time arrangements and job share to name a few. Talk to us about how these arrangements might work for you.
We’re interested in hearing from people who
Are passionate about applying software engineering principles to our operational services
Have experience in software engineering and being hands-on with the tools (coding) as part of your day-to-day activities
Have experience with CI/CD toolsets and demonstrated, applied cloud experience
Are passionate about driving automation and uplifting application monitoring capability
Are well versed in reviewing code developed by other engineers and providing feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency)
Thrive working in incident response environments, performing post-mortem analysis and designing and implementing secure solutions
Enjoy measuring and optimising system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement
Take ownership of initiatives and assets and follow up to provide highest quality of customer service
Skills
We use a broad range of tools, languages, and frameworks. We don’t expect you to know them all but experience or exposure with some of these (or equivalents) will set you up for success in this team.
Proficient in one Programming language (e.g., Golang, Python)
Any Git tool for source control. Build tools like TeamCity or Jenkins and experience with deployment tooling like Octopus or similar
Proficient in AWS or another cloud offering, with an understanding of SRE practices.
Knowledge and experience of an IaC toolset (e.g., CloudFormation, Terraform)
Worked with an observability toolset such as Prometheus, Grafana, AWS CloudWatch, Splunk or AppDynamics
Good knowledge of OS - Linux and PowerShell scripting, system design thinking
Good Communication and Problem-Solving skills
Working with us:
Whether you’re passionate about customer service, driven by data, or called by creativity, a career with CommBank is for you.
Our people bring their diverse backgrounds and unique perspectives to build a respectful, inclusive, and flexible workplace with flexible work locations. One where we’re driven by our values, and supported to share ideas, initiatives, and energy. One where making a positive impact for customers, communities and each other is part of our every day.
Here, you’ll thrive. You’ll be supported when faced with challenges and empowered to tackle new opportunities. We’re hiring engineers from across Australia and have opened technology hubs in Melbourne and Perth. We really love working here, and we think you will too.
If this sounds like the role for you then we would love to hear from you. Apply today!
If you're already part of the Commonwealth Bank Group (including Bankwest, x15ventures), you'll need to apply through Sidekick to submit a valid application. We’re keen to support you with the next step in your career.
We're aware of some accessibility issues on this site, particularly for screen reader users. We want to make finding your dream job as easy as possible, so if you require additional support please contact HR Direct on 1800 989 696.
Advertising End Date: 29/07/2025
Before we jump into the responsibilities of the role: no matter what you come in knowing, you’ll be learning new things all the time and the Commonwealth Bank team will be there to support your growth.
Please consider applying even if you don't meet 100% of what’s outlined.
Key Responsibilities
- Driving automation
- Reviewing code
- Optimising system performance
Key Strengths
- Software engineering principles
- Experience in software engineering
- CI/CD toolset and cloud experience
- Programming languages
- Observability tools
- Problem-solving skills
Why Commonwealth Bank is partnering with Hatch on this role. Hatch exists to level the playing field for people as they discover a career that’s right for them. So when you apply you have the chance to show more than just your resume.
A Final Note: This is a role with Commonwealth Bank not with Hatch.
Senior Site Reliability Engineer
Posted today
Job Description
== Q-CTRL ==
Role Seniority - senior
More about the Senior Site Reliability Engineer role at Q-CTRL
About us
Founded in 2017, Q-CTRL has grown to become the global leader in quantum. We’re using control to solve the hardest problems facing quantum technology, improving hardware performance and accelerating pathways to useful quantum computers and other technologies. As a product-led company, we bring together diverse teams such as product, design, engineering and research to help achieve our mission of making quantum technology useful. Join us to help shape the quantum future.
As one of the fastest growing companies in the quantum sector, we’ve had a number of key milestones:
- In November 2023, we announced an industry-first partnership with IBM Quantum Services, natively integrating our performance management software with all IBM quantum computers. Building off of this relationship, in September 2024 we started offering two services via IBM’s new Qiskit Functions Catalog as an inaugural partner.
- Designed and moved our Global HQ offices and lab space into the first purpose-built (and award-winning) commercial and research facility for a quantum technology company in Australia.
- Continued to deliver real world outcomes across the quantum sectors, with our work with Australian Defence on software-ruggedized quantum sensing for navigation without GPS, as featured in the New York Times.
- In October 2024, we announced the record-breaking expansion of our Series B funding round to USD $113M, with USD $59M of new capital.
- Grew our global presence to include Los Angeles, Berlin, and Oxford - as well as the recently announced office in San Francisco.
From educating the workforce on how quantum computing works, to building the next generation of quantum sensors, to delivering massive performance gains for end-users, it all starts with hiring the right talent. If you want to help us build the Quantum future, read on.
About the role
The Q-CTRL Platform has grown from a single product with one application that could run on a single container to a growing list of products in a rapidly expanding and highly distributed Kubernetes environment. The Q-CTRL Infrastructure team is expanding to bring on an experienced software developer with a focus on quality, performance testing and an interest in cloud infrastructure. They will focus on ensuring our applications remain available, performant, reliable and scalable, drawing strong inspiration from the tenets of Site Reliability Engineering (SRE).
What you'll be doing:
Reporting to the Engineering Manager of the Infrastructure team, you will play a leading role in our quality processes, such as our testing guild, and planning for the Q-CTRL SaaS platform. As a result, engineering managers and their teams will have state-of-the-art performance testing tooling and resources.
Gather insights using monitoring tools and your past experience with microservice-based applications to review, assess and improve Q-CTRL’s platform reliability and performance. You will have the opportunity to push our software to its limits.
Bring our Kubernetes-based testing environment to the next level and become a leader supporting continuous improvement of our quality and testing practices within the whole of Engineering.
As a Site Reliability Engineer, you will play a major role in the ongoing development, maintenance and transformation of the observability platform and SRE operations such as traffic forecasting, on-call, incident management and production readiness.
Other duties within the Employee's skills and experience, or with reasonable training.
Honed software engineering skills and familiarity with site reliability engineering, testing practices and production operations.
Experience developing and testing software for distributed microservice applications as well as experience supporting, investigating and resolving issues in production environments using logs, metrics and traces.
Past experience with performance testing tools, such as Grafana k6 and Locust, which has led to demonstrable improvements in performance and reliability.
Experience in identifying potential problems and performance “pinch points” in software architecture in order to influence design, monitor, test, as well as mentor others.
Worked with continuous improvement and deployment tools such as GitHub Actions or GitLab.
Excellent written and verbal communication skills, with the ability to present complex technical concepts to both technical and non-technical audiences.
A keen eye for improvement and initiative in implementing new technologies and solutions while building things the right way.
Experience with OpenTelemetry monitoring stacks such as Grafana, Mimir, Tempo and Loki. Familiarity with Google's Site Reliability books and relevant insights.
Experience operating Kubernetes in production, managing Helm charts and operators.
Knowledge of how to configure public clouds (AWS or other) using infrastructure-as-code.
Familiarity with the CNCF and its various projects, in particular Linkerd, OpenTelemetry, and Prometheus.
Why Q-CTRL?
Flexibility: We embrace workplace flexibility so you can focus on your impact rather than a rigid work schedule.
Attractive salary: You’ll get to have the start-up impact without the start-up wages.
Equity: We want people to have a sense of ownership in what they do and offer the potential for equity share and annual bonuses.
Cash bonus: We recognize exceptional performance and impact by offering annual discretionary cash bonuses.
Resources: We are well funded by the world’s best technology investors, letting us chase our ambitions with minimal constraints.
Parental support: We offer paid parental leave to support you and your loved ones.
Diversity: We’re an equal opportunity employer and actively support initiatives like the ‘Global Women in Quantum’ program to help expand the quantum workforce.
Unique culture: You’ll be surrounded by some of the world’s leading physicists, engineers, product, marketing and design people (to name a few!) with a strong desire to learn and transfer knowledge.
Meaningful values: You’ll work with an incredibly supportive team who work consistently to deliver our core values to be real, be trusted, be just and to be revered.
Personal development: We provide you with a personal development and wellness budget.
Make a dent: Last but not least you’ll have the unique opportunity to help set the direction for this revolutionary technology and truly make an impact that matters!
Q-CTRL aims to bring together cross-functional teams from many different backgrounds to help achieve our goals - we strongly encourage you to apply even if you do not meet all of the requirements mentioned in the job posting.
Please be advised that our communications will only come from the @q-ctrl.com domain. All our active job postings are available on our company website.
To recruitment agencies, we do not accept unsolicited branded profiles and are not responsible for any fees related to unsolicited resumes.
Before we jump into the responsibilities of the role: no matter what you come in knowing, you’ll be learning new things all the time and the Q-CTRL team will be there to support your growth.
Please consider applying even if you don't meet 100% of what’s outlined.
Key Responsibilities
- Leading quality processes
- Improving platform reliability
- Enhancing testing environment
Key Strengths
- Software engineering skills
- Microservice application experience
- Performance testing tools
- OpenTelemetry experience
- Kubernetes management
- Infrastructure-as-code knowledge
Why Q-CTRL is partnering with Hatch on this role. Hatch exists to level the playing field for people as they discover a career that’s right for them. So when you apply you have the chance to show more than just your resume.
A Final Note: This is a role with Q-CTRL not with Hatch.
Site Reliability Engineer, Google Play

Posted 1 day ago
Job Description
Minimum qualifications:
+ Bachelor's degree in Computer Science, a related field, or equivalent practical experience.
+ 5 years of experience in Unix/Linux systems, Internet Protocol networking, performance and application issues.
+ 5 years of experience programming in one or more of the following languages: C, C++, Java, Python, Go, Perl, or Ruby.
+ 5 years of experience in distributed systems or infrastructure design.
+ 5 years of experience in troubleshooting and debugging distributed systems.
Preferred qualifications:
+ Excellent communication skills.
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google's services, both our internally critical and our externally visible systems, have reliability, uptime appropriate to users' needs and a fast rate of improvement. Additionally, SREs will keep an ever-watchful eye on our systems' capacity and performance.
Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google, while using your expertise in coding, algorithms, complexity analysis and large-scale system design.
SRE's culture of intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.
To learn more: check out our books on Site Reliability Engineering or read a career profile about why a Software Engineer chose to join SRE.
Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running. From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible. We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them. We keep our networks up and running, ensuring our users have the best and fastest experience possible.
+ Own availability and performance for some of Google Play's key products, and be responsible for ensuring an excellent user experience for global users while supporting change.
+ Oversee production support for Google Play games-related services.
+ Design solutions to make the Google Play games-related services more resistant to failure.
+ Grow our support to handle the new and evolving product features.
+ Provide tools/training/consultation to development teams taking on new production responsibilities.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you have a need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Senior Site Reliability Engineer (Product SRE)
Posted today
Job Description
== Xero ==
Role Seniority - senior
More about the Senior Site Reliability Engineer (Product SRE) role at Xero
Our Purpose
At Xero, we’re here to help you supercharge your business. We do this by automating routine tasks, surfacing actionable insights and connecting businesses with the right data, advisors and apps. When that happens, we’re not only making life better for small business, we’ll be building a stronger economy that can change the world.
About the team
Xero's Product SRE teams will consist of dedicated world-class SRE engineers, embedded into product teams to drive enduring reliability, world-class observability, and high-performing services.
About the role
This position requires a highly technical Senior Engineer with a strong engineering background, deep experience in SRE and a passion for enabling high performing products.
As a seasoned and relentless engineer, they will contribute to the company's Product SRE strategy and contribute to the ongoing transformation of the Xero SRE culture. As a strong communicator, they'll manage change and ensure the value of robust systems is communicated clearly across the business.
Any experience with reliability concepts such as capacity management, autoscaling, safe deployment and releases, software strategies for reliability, fault tolerance, and graceful failure would be highly beneficial. An understanding of human factors, safety science, and resilience engineering is also valuable.
What you'll do:
Contribute to the completion of the day to day deliverables of a dedicated product SRE team. These will be highly experienced Site Reliability Engineers with a strong culture of ownership, automation first, and constant quality of delivery.
Build long term relationships with product engineering teams, ensuring everyone can deliver on system reliability with a theme of continuous improvement.
Build a culture of continuous improvement to ensure product reliability keeps improving and the impact of issues is reduced; create and actively monitor quality standards for SRE teams and report regularly on adherence to them.
Assist with ongoing training across the business to ensure reliability requirements are well understood and incorporated into product designs.
Participate in a 24/7 global on-call roster, focusing on incident response and remediation.
What you'll bring:
Strong software engineering and hands-on SRE background, with experience of leading initiatives in a highly technical team.
Proven experience mentoring engineers in a fast growing company.
Obsessed with delivering a high quality and highly stable customer experience. Passion for customer-first thinking, with a strong product mindset helping to understand and anticipate customer needs.
Broad and deep technical understanding of modern cloud technologies (AWS, Azure, GCP) and their incident and problem management practices, particularly high-growth, high-availability SaaS-based transactional systems.
Proficiency in one or more object-oriented programming languages (C#, JavaScript, Java, Python, etc.) or experience with infrastructure-as-code (e.g. Terraform, CloudFormation).
Experience using observability tooling to monitor the health of a highly distributed system.
Why Xero?
Offering very generous paid leave to use however you’d like (plus statutory holidays!), dedicated paid leave to care for your physical and mental wellbeing as well as an Employee Assistance Program to access mental health care for you and your family, health insurance, life insurance, and income protection, wellbeing and sports programmes, employee resource groups, 26 weeks of paid parental leave for primary caregivers, an Employee Share Plan, beautiful offices, flexible working, career development, and many other benefits that reflect our human value, you’ll do the best work of your life at Xero.
Before we jump into the responsibilities of the role: no matter what you come in knowing, you’ll be learning new things all the time and the Xero team will be there to support your growth.
Please consider applying even if you don't meet 100% of what’s outlined.
Key Responsibilities
- Contributing to daily deliverables
- Building relationships
- Promoting continuous improvement
Key Strengths
- Site Reliability Engineering (SRE)
- Software Engineering
- Cloud Technologies
- Mentoring
- Observability Tooling
- Infrastructure-as-Code
Why Xero is partnering with Hatch on this role. Hatch exists to level the playing field for people as they discover a career that’s right for them. So when you apply you have the chance to show more than just your resume.
A Final Note: This is a role with Xero not with Hatch.