8 Reliability Engineer jobs in Australia
Reliability Engineer - Asset Management
Job Description
Holistic Asset Management: Seeking Experienced Reliability Engineers
Holistic Asset Management, established in 2016, has rapidly become a benchmark in providing advanced reliability engineering technology and services to asset-intensive industries. Our pursuit of excellence in applying reliability engineering solutions and in developing and optimising asset maintenance strategies has driven client satisfaction and industry recognition. As we grow, we aim to extend our team and are seeking experienced Reliability Engineers at our new Maroochydore head office on the beautiful Queensland Sunshine Coast.
The Role:
As a Reliability Engineer, you'll play a key role in our team, developing and implementing strategies that enhance our clients' asset reliability and lifespan. You'll directly contribute to enhancing operational efficiency and reducing costs, helping to uphold our high service standards.
Key Responsibilities:
- Design and implement effective reliability strategies and procedures.
- Conduct Failure Modes, Effects & Criticality Analysis (FMECA).
- Lead root cause analysis (RCA) investigations.
- Develop and optimise asset maintenance strategies.
- Provide technical guidance to operations and maintenance teams.
To Be Successful, You Will:
- Hold 5 years of industry experience in a maintenance/reliability role, with the application of reliability methods and techniques gained in mining, processing, oil & gas, utilities, or manufacturing.
- Have experience in data analysis including extraction from various data sources and the ability to summarise data into conclusions.
- Have proven experience managing projects, with a demonstrated ability to handle project timelines, deliverables, and budgets.
Why Join Holistic Asset Management?
- Excellent Reputation: Be part of a leading provider known for setting the bar in reliability engineering services and technology.
- Beautiful Location: Our Maroochydore office on the Sunshine Coast offers the perfect work-life balance, with breathtaking beaches, subtropical forests, and a vibrant city life.
- Career Growth: We offer exceptional opportunities for professional development and career advancement in Reliability Engineering with an industry mainstay.
- Collaborative Team: Join an innovative, diverse, and supportive team that embraces collaboration and mutual respect.
- Competitive Compensation: Receive a competitive salary package and a wide array of benefits that reflect our commitment to employee wellbeing.
If you have a passion for improving operational efficiency, enjoy problem-solving, and thrive in a dynamic, rewarding environment, we would love to hear from you.
Apply Today: Shape the future of reliability engineering with Holistic Asset Management.
Please submit your resume along with a cover letter explaining why you would be a great fit for this role.
Holistic Asset Management is an Equal Opportunity Employer.
*Only shortlisted candidates will be contacted*
Job No Longer Available
This position is no longer listed on WhatJobs. The employer may be reviewing applications, may have filled the role, or may have removed the listing.
However, we have similar jobs available for you below.
Reliability Engineer - Mechanical

Posted today
Job Description
We work together to transform essential resources into critical ingredients for mobility, energy, connectivity and health. Join our values-led organization committed to building a more resilient world with people and planet in mind. Our core values are the foundation that makes us successful for ourselves, our customers and the planet.
**Job Description**
Albemarle is hiring for a Mechanical Reliability Engineer. This position is located on site at the Kemerton Processing Plant.
**What You Will Do**
+ Primary lead in the development/review/improvement of maintenance and spares strategies that work to drive the safe reliable operation of production equipment to help deliver required plant performance while helping to reduce the overall costs per tonne.
+ Participate in maintenance planning and, using plant knowledge, help predict major maintenance requirements, develop KPIs and report on them.
+ Assist in the management and implementation of the CMMS architecture in line with site standards and in light of best practice.
+ Provide mechanical engineering support of refinery static and rotating equipment in the form of equipment/material selection, design calculations, fitness for service reviews, repair methods, mechanical design approval, and equipment specification with the aid of and adherence to relevant codes and standards including OSHA 1984, AS 3788, API 579, ASME PCC-2, etc.
+ Maintaining and fostering collaborative effort with all departments that will work to ensure success in the role.
+ Ensure relevant equipment is operated and maintained according to the applicable standards and statutory compliance is adhered to.
+ Lead activities/projects that focus on reliability and cost improvement of poor performing assets.
+ Assist with project implementation to ensure the reliability and maintainability of new and modified installations is in line with plant needs.
+ Participate in the development of design and installation specifications.
+ Support, utilise, and follow the site management of change process when required, to ensure changes are adequately risk assessed and required updates/changes are correctly implemented.
+ Provide input to Risk Management processes, including management of change, which helps to anticipate reliability-related and non-reliability-related risks that could adversely impact plant operation.
**What You Bring**
**Required:**
+ Tertiary level qualification in Mechanical Engineering or other relevant discipline coupled with a minimum of five years of post-graduate reliability and/or maintenance engineering experience.
+ Working understanding of asset management practices and experience in the development of effective maintenance strategies.
+ Excellent problem solving, investigative, and data analysis skills.
+ A sound understanding of available NDT and condition monitoring tools and equipment, and how they can be utilised to support reliability improvement and plant management.
+ Experience in the development of effective maintenance strategies.
**Preferred:**
+ Post graduate study in best practice RCM techniques and relevant equipment
**Benefits of Joining Albemarle**
+ Competitive compensation
+ Comprehensive benefits package
+ A diverse array of resources to support you professionally and personally.
We are partners to one another in pioneering new ways to be better for ourselves, our teams, and our communities. When you join Albemarle, you become our most essential element and you can anticipate competitive compensation, a comprehensive benefits package, and resources that foster your well-being and fuel your personal growth. Help us shape the future, build with purpose and grow together.
Associate Site Reliability Engineer

Posted today
Job Description
As an Associate Site Reliability Engineer, you will support the reliability, scalability, and performance of our applications in SEAu. This entry-level role is ideal for candidates with foundational experience in software engineering and/or system operations who are eager to grow in a high-impact, collaborative DevSecOps environment.
**What you'll be doing**
Key Responsibilities:
+ Assist in monitoring and maintaining production systems and services.
+ Support incident response efforts and contribute to root cause analysis.
+ Participate in automating deployment, monitoring, and operational tasks.
+ Collaborate with development and QA teams to support new feature rollouts.
+ Contribute to documentation of operational procedures and runbooks.
+ Learn and apply best practices in system reliability, observability, and performance tuning.
**What you bring**
To succeed in the role, you will have:
+ Relevant years of experience in software engineering, DevOps, or IT operations (internships or academic projects acceptable).
+ Familiarity with basic shell scripting.
+ Exposure to cloud platforms (e.g., Azure, AWS)
+ Basic knowledge of programming languages such as Python, Go, or Java.
+ Understanding of CI/CD pipelines and version control systems (e.g., Git).
+ Strong problem-solving and communication skills.
+ Bachelor's degree in Computer Science, Engineering, or a related field.
Desirable:
+ Exposure to monitoring tools (e.g., Dynatrace, Elastic, Grafana).
+ Understanding of networking fundamentals and distributed systems.
+ Interest in automation, self-healing, infrastructure as code, and SRE principles.
**What we offer**
You bring your skills and experience to Shell and in return you work with talented, committed people on one of the most important challenges facing our planet. You'll have the opportunity to develop the skills you need to grow in an environment where we value honesty, integrity, and respect for one another. You'll be able to balance your priorities as you become the best version of yourself.
+ Progress as a person as we work on the energy transition together.
+ Continuously grow the transferable skills you need to get ahead.
+ Work at the forefront of technology, trends, and practices.
+ Collaborate with experienced colleagues with unique expertise.
+ Achieve your balance in a values-led culture that encourages you to be the best version of yourself.
+ Benefit from flexible working hours, and the possibility of remote/mobile working.
+ Perform at your best with a competitive starting salary and annual performance-related salary increase - our pay and benefits packages are considered to be among the best in the world.
+ Take advantage of paid parental leave, including for non-birthing parents.
+ Join an organisation working to become one of the most diverse and inclusive in the world. We strongly encourage applicants of all genders, ages, ethnicities, cultures, abilities, sexual orientation, and life experiences to apply.
+ Grow as you progress through diverse career opportunities in national and international teams.
+ Gain access to a wide range of training and development programmes.
Note: We are keen to support flexible working arrangements, subject to local regulations and legislative frameworks. If this is of interest to you, please describe in your application the type of flexible working arrangements for which you would like to be considered (e.g., part-time, job share).
We'd like you to know that Shell has a bold goal: to become one of the world's most diverse and inclusive companies.
We are committed to attracting a broader and more diverse pool of candidates. If this position doesn't feel like the perfect fit for your qualifications right now, we'd still love to hear from you. Consider creating a profile in our Talent Community so we can keep you in mind for future opportunities that may align with your skills.
**Shell in Australia**
Shell has operated in Australia since 1901. Our history runs from operating Australia's first oil refinery, which was central to meeting Australia's fuel needs, to fuelling the first Qantas commercial flight in the 1920s, to playing a foundation role in building some of Australia's largest and most innovative natural resource developments.
Throughout this 124-year relationship the needs of our customers and the nation have changed and we have continued transforming our portfolio to meet these needs. Today, we are a leading natural gas producer and are playing our part in the transition to a low-carbon future by investing in the power sector, renewable energy sources and carbon abatement activities.
Shell has a significant Liquefied Natural Gas (LNG) business in Australia that makes a valuable contribution to today's energy supply. This integrated gas portfolio includes our two Shell-operated gas production and liquefaction businesses, Shell QGC in Queensland and Prelude Floating LNG offshore in Western Australia, and our joint venture interests in Gorgon and North West Shelf in Western Australia and Arrow Energy in Queensland.
Today, Shell's portfolio in Australia also includes zero- and low-carbon energy businesses such as commercial and industrial retailer, Shell Energy; carbon farming specialist, Select Carbon; the 120MW Gangarri solar development; residential energy retailer, Powershop Australia; a 49% stake in WestWind Australia; a 50% share of Kondinin Energy; and several grid-scale Battery Energy Storage Solutions projects. High quality Shell branded fuels and lubricants are available right across Australia, through an exclusive brand license arrangement with Viva Energy.
Site Reliability Engineer - SPP

Posted today
Job Description
**Do you**
+ know Linux well enough to diagnose and troubleshoot it at various levels?
+ write code to automate repetitive tasks every time you face repetitive work?
+ smile when you solve an issue in Frankfurt from your laptop in Sydney?
Answer 'yes' to these questions and we would like to hear from you. Go ahead, hit the Apply button and let's have a chat about your skills and experiences.
**Want to know more about us?**
Now that we have set the pace, keep reading if you want to understand more about the role and the SRE team. We hope it will be helpful.
**Let's start with the role**
**As a Site Reliability Engineer, you will**
+ Provide relief and sustainable resolution to issues within our infrastructure.
+ Use your experience in software development, systems engineering and networking to proactively prevent repeatable issues.
+ Drive initiatives with partner teams to improve the reliability and performance of the infrastructure through improved system design.
+ Drive a culture of intolerance to manual activity which results in a highly automated environment delivering scalable solutions.
**_Note:_** _This is a full-time position with a four-day workweek. Working hours are from 11:00 PM to 9:00 AM. Weekend shifts are fixed and will be discussed in detail during the interview process._
**This is what we require. Take note, because these are must-haves:**
+ Knowledge of Linux systems.
+ Coding experience; we normally prefer Python or JavaScript.
+ Networking skills, IP addressing, routing protocols.
+ Monitoring of systems, applications and networks.
+ Uncompromising attention to detail.
**We also have pluses!**
These are not a 'must', but please highlight them on your resume if you have:
+ Experience in cloud architecture or web applications engineering.
+ Experience in database performance, replication and high availability.
+ A bachelor's or master's degree in a technical area.
**_Note: Australian citizenship and the ability to obtain a baseline security clearance are requirements for this role._**
**Now a bit about the SRE team**
The SRE team is a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability and performance of the ServiceNow infrastructure. The SRE is empowered to drive technical resolutions across the technology stack from hardware through to application and all stops in between. They are also tasked with driving forward the operability of the platform to drive down the number of incidents and to reduce MTTR.
To accomplish this the team combines software development, networking and systems engineering expertise with a strong desire to be challenged by problems of scale and complexity and to make services better for our customers.
**Work Personas**
We approach our distributed world of work with flexibility and trust. Work personas (flexible, remote, or required in office) are categories that are assigned to ServiceNow employees depending on the nature of their work and their assigned work location. To determine eligibility for a work persona, ServiceNow may confirm the distance between your primary residence and the closest ServiceNow office using a third-party service.
**Equal Opportunity Employer**
ServiceNow is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, national origin or nationality, ancestry, age, disability, gender identity or expression, marital status, veteran status, or any other category protected by law. In addition, all qualified applicants with arrest or conviction records will be considered for employment in accordance with legal requirements.
**Accommodations**
We strive to create an accessible and inclusive experience for all candidates. If you require a reasonable accommodation to complete any part of the application process, or are unable to use this online application and need an alternative method to apply, please contact us for assistance.
**Export Control Regulations**
For positions requiring access to controlled technology subject to export control regulations, including the U.S. Export Administration Regulations (EAR), ServiceNow may be required to obtain export control approval from government authorities for certain individuals. All employment is contingent upon ServiceNow obtaining any export license or other approval that may be required by relevant export control authorities.
Site Reliability Engineer, Spanner

Posted today
Job Description
Minimum qualifications:
+ Bachelor's degree in Computer Science, a related field, or equivalent practical experience.
+ 1 year of experience in coding in one or more of the following programming languages: C, C++, Java, Python, Go.
+ Experience in optimizing code for stability, functionality and scalability (e.g., crawling, search, troubleshooting).
Preferred qualifications:
+ 1 year of experience in coding in one or more of the following programming languages: C, C++, Java, Python, Go.
+ Experience in one or more of the following: C++, TypeScript, and Go.
+ Experience in analyzing and troubleshooting large-scale distributed systems.
+ Ability to manage periodic on-call duty as well as out-of-band requests.
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services (both our internally critical and our externally visible systems) have reliability, uptime appropriate to customers' needs, and a fast rate of improvement. Additionally, SREs will keep an ever-watchful eye on our systems' capacity and performance.
Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding, algorithms, complexity analysis and large-scale system design. SRE's culture of intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.
+ Manage Spanner SRE and deliver critical projects.
+ Enable Spanner customers to help themselves with debugging and mitigation.
+ Expand Spanner to serve customers in new ways under new conditions and restrictions.
+ Improve the overall Spanner observability.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you have a need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Senior Site Reliability Engineer

Posted today
Job Description
25WD88723
**Position Overview**
Do you want the opportunity to be part of a startup environment working on a new product seeking to become a world-leading integration platform? Are you looking to be at the forefront of innovative new technology that will ultimately help people imagine, design, and make a better world? If so, come join the Tandem Connect team at Autodesk! Working with the Tandem team, our mission is to create integration technology and solutions that will transform how buildings are designed, built, and operated.
We are seeking a creative Senior Site Reliability Engineer who has experience building and maintaining scalable, reliable and modern cloud services to join our team today.
**Responsibilities**
+ Maintain a secure, scalable and resilient platform that our customers can trust. This includes the implementation of Autodesk and industry best practices and standards
+ Manage and optimise the security, performance, reliability, and scalability of Kubernetes clusters on Amazon EKS
+ Administer and troubleshoot MongoDB Atlas, AWS MemoryDB (Redis), RabbitMQ on Amazon MQ, and Kafka on Amazon MSK.
+ Design, implement and maintain effective monitoring of the platform and associated components
+ Support other teams with the implementation of their infrastructure requirements
+ Contribute to the design and implementation of resilient and scalable architectures, including high availability and disaster recovery strategies
+ Provision and manage infrastructure using Terraform, ensuring meticulous configuration management and documentation
+ Set up and maintain monitoring and logging systems, such as Prometheus, Dynatrace, Amazon CloudWatch and other tools
+ Collaborate with cross-functional teams to resolve complex issues and mentor junior engineers
+ Share your knowledge and learnings with the infrastructure guild
+ Partner closely with the product development, architecture teams and other stakeholders to identify and implement improvements to the product infrastructure and operations
+ Contribute to improvements in processes, tools, and technical methodologies that increase the effectiveness and efficiency of the team in responding to customer and business needs, with an emphasis on having an efficient CI/CD process
+ Provide technical guidance and constructive feedback to team members and stakeholders, which includes writing, reading, and reviewing plans, designs and scripts, and participating in the various technical feedback loops happening within the organisation
+ Contribute to technical product roadmaps
+ On Call support as part of a rostered escalation process
**Minimum Qualifications**
+ BS or MS in computer science, related technology field, or equivalent experience
+ You have at least 7 years of hands-on experience with operating and managing virtual software (with the majority managing containerised workloads) and high traffic customer-facing enterprise solutions in production environments
+ Expertise in defining and managing Kubernetes-based workloads that scale
+ Ability to configure and customize Linux-based operating environments based on application needs
+ Strong understanding of TCP/IP and virtual networking technologies, including Kubernetes Network Policies and AWS CloudFront
+ Ability to perform automated testing using Cypress
+ Experience with performing live database upgrades
+ Adept at writing and managing Helm and Terraform scripts using GitOps principles
+ Knowledge in integrating password management systems with Infrastructure as Code
+ Proficient in using bash and Python to integrate with network services
+ Extensive experience with creating customized Docker images
+ Extensive experience with DevOps and DevSecOps-based SDLC practices
+ Good understanding of security principles at the network, server, and container levels
+ In-depth understanding of the software development lifecycle (SDLC)
+ Working experience with MongoDB, Redis, Kafka, RabbitMQ, Vault, Consul and equivalent AWS services, including live data migration with minimal downtime
+ Experience with CI/CD and building deployment pipelines using Jenkins and Rundeck.
+ Experience with running load tests and benchmarking tools
+ Strong written and oral communication skills in English
+ Ability to operate effectively and independently in a dynamic, fluid environment
+ Detail-oriented approach to building secure, stable software
+ Experience with Agile development practices such as Scrum or Kanban
**Preferred Qualifications**
+ Amazon Web Services (AWS) experience.
+ Experience with integration-Platform-as-a-Service (iPaaS) offerings.
+ Ability to read and write in Node.js
+ Experienced with supporting Kubernetes-based MQTT Brokers using the Aedes MQTT software
**Learn More**
**About Autodesk**
Welcome to Autodesk! Amazing things are created every day with our software - from the greenest buildings and cleanest cars to the smartest factories and biggest hit movies. We help innovators turn their ideas into reality, transforming not only how things are made, but what can be made.
We take great pride in our culture here at Autodesk - our Culture Code is at the core of everything we do. Our values and ways of working help our people thrive and realize their potential, which leads to even better outcomes for our customers.
When you're an Autodesker, you can be your whole, authentic self and do meaningful work that helps build a better future for all. Ready to shape the world and your future? Join us!
**Salary transparency**
Salary is one part of Autodesk's competitive compensation package. Offers are based on the candidate's experience and geographic location. In addition to base salaries, we also have a significant emphasis on discretionary annual cash bonuses, commissions for sales roles, stock or long-term incentive cash grants, and a comprehensive benefits package.
**Diversity & Belonging**
We take pride in cultivating a culture of belonging and an equitable workplace where everyone can thrive.
**Are you an existing contractor or consultant with Autodesk?**
Please search for open jobs and apply internally (not on this external site).
Site Reliability Engineer, Google Play

Posted today
Job Description
Minimum qualifications:
+ Bachelor's degree in Computer Science, a related field, or equivalent practical experience.
+ 5 years of experience in Unix/Linux systems, Internet Protocol networking, performance and application issues.
+ 5 years of experience programming in one or more of the following languages: C, C++, Java, Python, Go, Perl, or Ruby.
+ 5 years of experience in distributed systems or infrastructure designing.
+ 5 years of experience in troubleshooting and debugging distributed systems.
Preferred qualifications:
+ Excellent communication skills.
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google's services (both our internally critical and our externally visible systems) have reliability, uptime appropriate to users' needs, and a fast rate of improvement. Additionally, SREs will keep an ever-watchful eye on our systems' capacity and performance.
Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google, while using your expertise in coding, algorithms, complexity analysis and large-scale system design.
SRE's culture of intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.
To learn more, check out our books on Site Reliability Engineering or read a career profile about why a Software Engineer chose to join SRE.
Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running. From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible. We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them. We keep our networks up and running, ensuring our users have the best and fastest experience possible.
+ Own availability and performance for some of Google Play's key products, and be responsible for ensuring an excellent user experience for global users while supporting change.
+ Oversee production support for Google Play games related services.
+ Design solutions to make the Google Play games related services more resistant to failure.
+ Grow our support to handle the new and evolving product features.
+ Provide tools/training/consultation to development teams taking on new production responsibilities.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you have a need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Site Reliability Engineer, Enterprise Cloud Platforms, Global Technology, Australia

Posted today
Job Description
Sydney, Australia
**To proceed with your application, you must be at least 18 years of age.**
**Job Description:**
At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.
Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to being a diverse and inclusive workplace, attracting and developing exceptional talent, supporting our teammates' physical, emotional, and financial wellness, recognizing and rewarding performance, and how we make an impact in the communities we serve.
At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!
**Enterprise Cloud Platforms Team:**
Our team designs, builds, and maintains Public Cloud platforms for Bank of America. We provide our customers an innovative platform with built-in integrations that allow for a faster time-to-market with reduced complexity. We believe in a high-quality engineering culture, a customer focused mindset, and building for scale and resiliency. As part of this team, you will have a large impact on the evolution of next generation Cloud services for Bank of America and explore an extensive list of new technologies that will drive innovation across our company.
We are seeking Site Reliability Engineers (SREs) to design, build, and maintain our next-gen platforms. The role provides the opportunity to work with a wide range of technologies and build a unique perspective that comes with integrating disparate services (both on-prem/off-prem) which must interact seamlessly with each other. You will work with colleagues that are fun, smart, hardworking, and driven. You will be part of a global team that is growing, giving you room to innovate and be creative.
**Position Summary**
+ Collaborates with a diverse set of engineers, architects, and teams to design, develop, test, and implement secure, robust, highly available and scalable solutions for BofA's External Cloud Platform
+ Collaborates with other software engineers and teams to design and implement deployment approaches using highly scalable, automated, continuous integration and continuous delivery pipelines.
+ Responsible for all aspects of reliability; collaborates with technical experts, key stakeholders, and team members to resolve complex problems, owning the issue until you are sure it will not reoccur.
+ Deep understanding of SRE practices, service level indicators, and service level objectives; proactively utilize them to resolve issues before they impact customers.
+ Gather, analyze, synthesize, and develop visualizations and reporting from large, diverse data sets in service of continuous improvement of the platform.
+ Implement infrastructure, configuration, and network as code for the applications and platforms in your remit.
+ Identify opportunities to eliminate toil and automate the triage of issues to improve overall operational stability.
+ Collaborate with a global team to identify, analyze, and resolve platform vulnerabilities.
+ Proactively promotes the adoption of site reliability engineering best practices within the team and organization.
+ Participates in 24x7 on-call coverage under a follow-the-sun model and performs blameless postmortems (RCAs) as needed.
**Required Skills:**
+ 7 years of combined experience in either SRE, software development, or infrastructure engineering (4 years with an advanced degree in Computer Science or related technical field).
+ 3+ years of hands-on experience building and maintaining cloud platforms on a major cloud service provider.
+ Strong experience in implementing, monitoring, and maintaining a highly scalable and resilient Data Services platform on major CSPs such as AWS, Azure, or GCP.
+ Strong experience with monitoring tools such as Grafana, Prometheus, Splunk, or Dynatrace, as well as cloud native tools like CloudWatch & CloudTrail, Azure Monitor and Log Analytics
+ Proficiency in implementing, monitoring, and maintaining a Databricks, RDS, or OpenAI platform.
+ Proficient in at least one programming language such as Python, Java/Spring Boot, or .NET; 5+ years of applied experience in Python/Java.
+ Proficiency in implementing CI/CD pipelines with tools such as Git and Jenkins; familiarity with using a GitOps model.
+ Advanced knowledge of networking (firewalls, DNS, Load Balancing, Proxies, etc.)
+ Advanced understanding of Linux & Windows operating systems including shell scripting
+ Excellent interpersonal, organizational and communication (written, verbal, and presentation) skills are a must.
+ Proven ability to work independently with minimal supervision and as part of a global team with direct responsibilities and an ability to juggle competing priorities and adapt to changes in project scope.
**Desired Skills**
+ Strong experience working with a complex IAM infrastructure, including Active Directory, Azure AD Connect, Azure AD, and SSO solutions such as PingIdentity or Okta.
+ Proficiency in creating automation using Python, Terraform, or Ansible
+ Proficiency in implementing, monitoring, and maintaining a Databricks, CosmosDB, or OpenAI platform.
+ Experience in implementing, monitoring, and maintaining a highly scalable and resilient enterprise platform on Microsoft Azure using native services related to compute, storage, networking, security, and observability.
+ Experience with containerization technologies such as EC2, EKS, Fargate, OpenShift, or Kubernetes.
+ Understanding of cost management, inventory management, and the FinOps model.
Bank of America and its affiliates consider for employment and hire qualified candidates without regard to race, religious creed, religion, color, sex, sexual orientation, genetic information, gender, gender identity, gender expression, age, national origin, ancestry, citizenship, protected veteran or disability status or any factor prohibited by law, and as such affirms in policy and practice to support and promote the concept of equal employment opportunity, in accordance with all applicable federal, state, provincial and municipal laws. The company also prohibits discrimination on other bases such as medical condition, marital status or any other factor that is irrelevant to the performance of our teammates.
To view the "Know your Rights" poster, CLICK HERE.
View the LA County Fair Chance Ordinance.
Bank of America aims to create a workplace free from the dangers and resulting consequences of illegal and illicit drug use and alcohol abuse. Our Drug-Free Workplace and Alcohol Policy ("Policy") establishes requirements to prevent the presence or use of illegal or illicit drugs or unauthorized alcohol on Bank of America premises and to provide a safe work environment.
To view Bank of America's Drug-free Workplace and Alcohol Policy, CLICK HERE.
Bank of America is committed to an in-office culture with specific requirements for office-based attendance and which allows for an appropriate level of flexibility for our teammates and businesses based on role-specific considerations. Should you be offered a role with Bank of America, your hiring manager will provide you with information on the in-office expectations associated with your role. These expectations are subject to change at any time and at the sole discretion of the Company. To the extent you have a disability or sincerely held religious belief for which you believe you need a reasonable accommodation from this requirement, you must seek an accommodation through the Bank's required accommodation request process before your first day of work.
This communication provides information about certain Bank of America benefits. Receipt of this document does not automatically entitle you to benefits offered by Bank of America. Every effort has been made to ensure the accuracy of this communication. However, if there are discrepancies between this communication and the official plan documents, the plan documents will always govern. Bank of America retains the discretion to interpret the terms or language used in any of its communications according to the provisions contained in the plan documents. Bank of America also reserves the right to amend or terminate any benefit plan in its sole discretion at any time for any reason.
Senior Site Reliability Engineer, Enterprise Cloud Platforms, Global Technology, Australia

Posted today
Job Description
Sydney, Australia
**To proceed with your application, you must be at least 18 years of age.**
**Job Description:**
At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.
Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to being a diverse and inclusive workplace, attracting and developing exceptional talent, supporting our teammates' physical, emotional, and financial wellness, recognizing and rewarding performance, and how we make an impact in the communities we serve.
At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!
**Enterprise Cloud Platforms Team:**
Our team designs, builds, and maintains Public Cloud platforms for Bank of America. We provide our customers an innovative platform with built-in integrations that allow for a faster time-to-market with reduced complexity. We believe in a high-quality engineering culture, a customer-focused mindset, and building for scale and resiliency. As part of this team, you will have a large impact on the evolution of next-generation Cloud services for Bank of America and explore an extensive list of new technologies that will drive innovation across our company.
We are seeking Senior Site Reliability Engineers (SREs) to design, build, and maintain our next-gen platforms. The role provides the opportunity to work with a wide range of technologies and build a unique perspective that comes with integrating disparate services (both on-prem and off-prem) that must interact seamlessly with each other. You will work with colleagues who are fun, smart, hardworking, and driven. You will be part of a global team that is growing, giving you room to innovate and be creative.
**Position Summary**
+ Collaborates with a diverse set of engineers, architects, and teams to design, develop, test, and implement secure, robust, highly available and scalable solutions for BofA's External Cloud Platform
+ Collaborates with other software engineers and teams to design and implement deployment approaches using highly scalable, automated, continuous integration and continuous delivery pipelines.
+ Responsible for all aspects of reliability; collaborates with technical experts, key stakeholders, and team members to resolve complex problems, owning the issue until you are sure it will not reoccur.
+ Deep understanding of SRE practices, service level indicators, and service level objectives; proactively utilize them to resolve issues before they impact customers.
+ Gather, analyze, synthesize, and develop visualizations and reporting from large, diverse data sets in service of continuous improvement of the platform.
+ Implement infrastructure, configuration, and network as code for the applications and platforms in your remit.
+ Identify opportunities to eliminate toil and automate the triage of issues to improve overall operational stability.
+ Collaborate with a global team to identify, analyze, and resolve platform vulnerabilities.
+ Proactively promotes the adoption of site reliability engineering best practices within the team and organization.
+ Participates in 24x7 on-call coverage under a follow-the-sun model and performs blameless postmortems (RCAs) as needed.
**Required Skills:**
+ 15 years of combined experience in either SRE, software development, or infrastructure engineering (10 years with an advanced degree in Computer Science or related technical field).
+ 7+ years of hands-on experience building and maintaining cloud platforms on a major cloud service provider.
+ Strong experience in implementing, monitoring, and maintaining a highly scalable and resilient Data Services platform on major CSPs such as AWS, Azure, or GCP.
+ Strong experience with monitoring tools such as Grafana, Prometheus, Splunk, or Dynatrace, as well as cloud native tools like CloudWatch & CloudTrail, Azure Monitor and Log Analytics
+ Proficiency in implementing, monitoring, and maintaining a Databricks, RDS, or OpenAI platform.
+ Proficient in at least one programming language such as Python, Java/Spring Boot, or .NET; 5+ years of applied experience in Python/Java.
+ Proficiency in implementing CI/CD pipelines with tools such as Git and Jenkins; familiarity with using a GitOps model.
+ Advanced knowledge of networking (firewalls, DNS, Load Balancing, Proxies, etc.)
+ Advanced understanding of Linux & Windows operating systems including shell scripting
+ Excellent interpersonal, organizational and communication (written, verbal, and presentation) skills are a must.
+ Proven ability to work independently with minimal supervision and as part of a global team with direct responsibilities and an ability to juggle competing priorities and adapt to changes in project scope.
**Desired Skills**
+ Strong experience working with a complex IAM infrastructure, including Active Directory, Azure AD Connect, Azure AD, and SSO solutions such as PingIdentity or Okta.
+ Proficiency in creating automation using Python, Terraform, or Ansible
+ Proficiency in implementing, monitoring, and maintaining a Databricks, CosmosDB, or OpenAI platform.
+ Experience in implementing, monitoring, and maintaining a highly scalable and resilient enterprise platform on Microsoft Azure using native services related to compute, storage, networking, security, and observability.
+ Experience with containerization technologies such as EC2, EKS, Fargate, OpenShift, or Kubernetes.
+ Understanding of cost management, inventory management, and the FinOps model.
Bank of America and its affiliates consider for employment and hire qualified candidates without regard to race, religious creed, religion, color, sex, sexual orientation, genetic information, gender, gender identity, gender expression, age, national origin, ancestry, citizenship, protected veteran or disability status or any factor prohibited by law, and as such affirms in policy and practice to support and promote the concept of equal employment opportunity, in accordance with all applicable federal, state, provincial and municipal laws. The company also prohibits discrimination on other bases such as medical condition, marital status or any other factor that is irrelevant to the performance of our teammates.
To view the "Know your Rights" poster, CLICK HERE.
View the LA County Fair Chance Ordinance.
Bank of America aims to create a workplace free from the dangers and resulting consequences of illegal and illicit drug use and alcohol abuse. Our Drug-Free Workplace and Alcohol Policy ("Policy") establishes requirements to prevent the presence or use of illegal or illicit drugs or unauthorized alcohol on Bank of America premises and to provide a safe work environment.
To view Bank of America's Drug-free Workplace and Alcohol Policy, CLICK HERE.
Bank of America is committed to an in-office culture with specific requirements for office-based attendance and which allows for an appropriate level of flexibility for our teammates and businesses based on role-specific considerations. Should you be offered a role with Bank of America, your hiring manager will provide you with information on the in-office expectations associated with your role. These expectations are subject to change at any time and at the sole discretion of the Company. To the extent you have a disability or sincerely held religious belief for which you believe you need a reasonable accommodation from this requirement, you must seek an accommodation through the Bank's required accommodation request process before your first day of work.
This communication provides information about certain Bank of America benefits. Receipt of this document does not automatically entitle you to benefits offered by Bank of America. Every effort has been made to ensure the accuracy of this communication. However, if there are discrepancies between this communication and the official plan documents, the plan documents will always govern. Bank of America retains the discretion to interpret the terms or language used in any of its communications according to the provisions contained in the plan documents. Bank of America also reserves the right to amend or terminate any benefit plan in its sole discretion at any time for any reason.