Site Reliability Engineer
Full-time
About us.
We are PremFina, a leader in the premium finance industry, where innovation, advanced technology, and strategic expertise merge to redefine the standards of our sector. As a Site Reliability Engineer at PremFina, you will play a crucial role in our journey to transform the insurance industry. With substantial support from our investors, we are at the forefront of providing innovative financing solutions and a premium white-label, cloud-based Software-as-a-Service (SaaS) platform to insurance brokers and companies across the globe.
Our main objective is to revolutionize insurance payment systems, making them more accessible and affordable. Our dedicated team of around 150 professionals is united by a common vision: to support our clients, strengthen our partnerships, and drive innovation in the insurance world. During our rapid growth, we remain committed to technological innovation, diversity, and transformative solutions. At PremFina, you are joining more than just a company; you are becoming part of a community that challenges the norm and envisions a more efficient future. Dive into a collaborative and forward-thinking environment and play a key role in shaping the future of insurance with us.
About the role.
As a Site Reliability Engineer at PremFina, your expertise will be vital in ensuring the reliability, scalability, and security of our platform. You'll work closely with cross-functional teams to design, deploy, and maintain our cloud-based infrastructure, addressing system bottlenecks, automating processes, and resolving complex technical challenges. Your role is key in driving the continuous improvement of our platform, ensuring it meets the highest standards of performance and resilience.
PremFina offers a dynamic environment where you’ll tackle challenging projects that will push your technical skills and foster your professional growth. Your contributions will directly impact the stability and effectiveness of our platform services, delivering a superior experience to our clients. We are committed to nurturing a culture of continuous learning, providing you with opportunities to expand your expertise and stay ahead in a rapidly evolving industry.
Join PremFina, where you'll find a supportive work culture, competitive compensation, and clear career progression paths. As a Site Reliability Engineer, you will be an integral part of our mission to redefine the insurance industry and deliver exceptional platform services.
Desirable experience.
- Bachelor’s degree in computer science, engineering, or a related field, or equivalent practical experience.
- Experience at a managed services or software-as-a-service provider.
- Experience in platform engineering, cloud services, and system optimization.
- Deep understanding of cloud computing, containerization, and orchestration tools.
- Proficiency in one or more programming or scripting languages.
- Experience with infrastructure as code (IaC) and automated deployment tools.
- Strong analytical and problem-solving skills, with keen attention to detail.
- Excellent communication skills, enabling effective collaboration with cross-functional teams.
- Proactive attitude and a willingness to learn and adapt in a fast-paced environment.
- Exposure to Azure PaaS and IaaS environments.
Role accountabilities and behaviours.
Autonomy.
Works under general direction. Receives specific direction, accepts guidance and has work reviewed at agreed milestones. Uses discretion in identifying and responding to complex issues related to own assignments. Determines when issues should be escalated to a higher level. Plans and monitors own work (and that of others where applicable) competently within limited deadlines.
Influence.
Interacts with and influences colleagues. May oversee others or make decisions which impact routine work assigned to individuals or stages of projects. Has working level contact with customers, suppliers and partners. Understands and collaborates on the analysis of user/customer needs and represents this in their work. Contributes fully to the work of teams by appreciating how own role relates to other roles.
Complexity.
Performs a range of work, sometimes complex and non-routine, in a variety of environments. Applies a methodical approach to routine and moderately complex issue definition and resolution. Applies and contributes to creative thinking or finds new ways to complete tasks.
Business skills.
- Demonstrates effective oral and written communication skills when engaging on issues with colleagues, users/customers, suppliers and partners.
- Understands and effectively applies appropriate methods, tools, applications and processes.
- Demonstrates judgement and a systematic approach to work.
- Effectively applies digital skills and explores these capabilities for their role.
- Learning and professional development — takes the initiative to develop own knowledge and skills by identifying and negotiating appropriate development opportunities.
- Security, privacy and ethics — demonstrates appropriate working practices and knowledge in non-routine work. Appreciates how own role and others support appropriate working practices.
Knowledge.
Has sound generic, domain and specialist knowledge necessary to perform effectively in the organisation typically gained from recognised bodies of knowledge and organisational information. Has an appreciation of the wider business context. Demonstrates effective application and the ability to impart knowledge found in industry bodies of knowledge. Absorbs new information and applies it effectively.
Role-specific competencies.
Programming/software development.
- Designs, codes, verifies, tests, documents, amends and refactors moderately complex programs/scripts.
- Applies agreed standards and tools to achieve a well-engineered result.
- Monitors and reports on progress. Identifies issues related to software development activities. Proposes practical solutions to resolve issues.
- Collaborates in reviews of work with others as appropriate.
Incident management.
- Ensures that incidents are handled according to agreed procedures.
- Prioritises and diagnoses incidents. Investigates causes of incidents and seeks resolution. Escalates unresolved incidents.
- Facilitates recovery, following resolution of incidents. Documents and closes resolved incidents.
- Contributes to testing and improving incident management procedures.
Problem management.
- Initiates and monitors actions to investigate and resolve problems in systems, processes and services.
- Determines problem fixes and remedies.
- Collaborates with others to implement agreed remedies and preventative measures.
- Supports analysis of patterns and trends to improve problem management processes.
Availability management.
- Analyses service and component availability, reliability, maintainability and serviceability.
- Contributes to the availability management process and its operation. Performs defined availability management tasks.
- Ensures that services and components meet and continue to meet all of their agreed performance targets and service levels.
- Implements arrangements for disaster recovery and documents recovery procedures. Conducts testing of recovery procedures.
Service level management.
- Performs defined tasks to monitor service delivery against service level agreements and maintains records of relevant information.
- Analyses service delivery performance to identify actions required to maintain or improve levels of service.
- Initiates and reports on actions to maintain or improve levels of service.