Booming Games Malta Ltd.

Site Reliability Engineer

Worldwide

kubernetes

containers

cni

ceph

provisioning

1 month

Site Reliability Engineer

Location: Worldwide
Type: Full-time

We are growing, and our Operations Department is looking for a Site Reliability Engineer to join our international team!


Responsibilities



  • Work daily on the health and maintenance of systems in different geographical locations, ensuring that hardware, software, applications and network operate at peak performance

  • Perform deep dives into both systemic and latent reliability issues; partner with software and systems engineers across the organization to produce and roll out fixes

  • Troubleshoot issues across the entire stack: hardware, software, application and network

  • Drive standardization efforts across multiple disciplines and services in conjunction with SREs throughout the organization

  • Identify and drive opportunities to improve automation for the company; scope and create automation for deployment, management and visibility of our services

  • Represent the SRE organization in design reviews and operational readiness exercises for new and existing services

  • Work with software engineers to improve upon deployment processes

  • Participate in the on-call rotation for production systems


Requirements



  • Sound fundamentals in operating systems, networking, and distributed systems

  • Strong familiarity with Linux systems administration and management best practices

  • Familiarity with container technologies: Kubernetes, CRI, Docker, namespaces, cgroups

  • Strong understanding of: Ethernet, VLANs, IPv4/IPv6, ARP, DHCP, DNS, and TCP

  • Familiarity with distributed system problems: leader election, Raft consensus, etc.

  • Solid understanding of systems and application design, including the operational trade-offs of various designs

  • Expert-level understanding of at least one public or private cloud technology such as Amazon AWS, Google GKE, or OpenStack

  • Practical knowledge of various aspects of service design, including messaging protocols and behavior, caching strategies, and software design practices

  • Practical intermediate knowledge of shell scripting; some Ruby is a plus

  • Demonstrable knowledge of TCP/IP, HTTP, web application security, and experience supporting multi-tier web application architectures

  • Excellent knowledge of Linux/UNIX systems administration and performance tuning

  • Comfortable configuring DNS, DHCP, and LAN/WAN technologies

  • Minimum 5 years of experience managing services in an internet-scale *nix environment

  • Must be able to communicate well with technical as well as non-technical colleagues to achieve business goals

  • Must be adaptable and able to focus on the simplest, most efficient and reliable solutions

  • Track record of successful practical problem solving, plus excellent documentation skills and written and interpersonal communication in English

  • Curiosity and an interest in networking, systems software, and distributed systems

  • Experience as a systems administrator or operations engineer

  • Experience with a 24/7 production environment

  • Experience with managed deployments providing software, platforms, or infrastructure as a service

  • Experience with Mellanox- and Vyatta-based networking gear is a plus

  • Experience with SuperMicro server and storage gear is a plus
