DevOps Engineer

We’re always looking for new talent. Find the career you deserve. Apply now as a DevOps Engineer!

Engineering FlexiPlace

JOB SUMMARY

We are looking for a full-time, mid-to-senior DevOps Engineer / Site Reliability Engineer (SRE) who
is comfortable applying modern “dev” principles to sysadmin practices across the whole lifecycle:
design, release, production support, and ongoing development. We’re a dynamic engineering technology
company with plenty of opportunities to take the initiative in a highly technical environment.

The role is multifaceted: around one-third to one-half of your time will involve working with our
software, hardware, and test engineering teams to help them deliver our awesome products; the
balance will be systems engineering and SRE work, delivering the infrastructure that powers the
company’s services and improving the flexibility and performance of our production environments.

RESPONSIBILITIES

  • Write a Python program that polls a network device for health metrics and feeds them into the
    monitoring and alerting system.
  • Create a file sharing service so internal teams can securely supply sensitive data to partners.
  • Troubleshoot Layer 2 networking issues on trunk and access ports for the Software Engineering
    team’s testing rack.
  • Re-engineer our CI/CD system’s implementation, including configuration management, to deliver a
    more reliable, scalable, and secure solution.
  • Design and build a company-wide nearline storage and backup system based on ZFS.
  • Advise the engineering teams on improvements to build processes, including Dockerisation and
    artefact management.
  • Deliver an authenticated application edge router (reverse proxy) solution, enabling authentication
    for legacy services and future integration into an SSO identity service.

QUALIFICATIONS

  • Expertise in Linux or Windows systems and higher-layer services (from the OS through to web and database services)
  • Containerisation, container management and orchestration: Docker, Kubernetes, and LXC 
  • Provisioning and configuration management: Kickstart, Ansible, Ansible AWX/Tower, Puppet, Chef 
  • Solid networking fundamentals, with design and troubleshooting knowledge (to the level of CCNA / CompTIA Network+; certification is nice to have but by no means required)
  • Proven troubleshooting and fault analysis skills 
  • Graceful under fire – able to stay calm and focused under pressure 
  • Interpersonal skills (working with developers, engineers, business, operations, and partners)
  • Ability to work independently and as part of a team 
  • Driven, with grit, and a let’s-get-it-done-right attitude 
  • Agile: flexible, open to change and able to preach the “continuous improvement” message 

DESIRABLE SKILLS

  • Version control and repository management: Git, Artifactory 
  • Monitoring and alerting systems: Prometheus, Grafana 
  • Traffic management and content distribution systems: Traefik, Envoy, HAProxy, Istio, Cloudflare, etc.
  • CI/CD systems: Jenkins, Bitbucket Pipelines, Travis CI, CircleCI, etc.
  • Scripting in one or more of: Python, JavaScript, PHP, Go, shell/PowerShell (if you have a personal code repo or open source pull requests, make sure to let us know in your application)
  • Web stack: Nginx or Apache, MongoDB, Node.js, Meteor