Job information

HKD 20,000 - 40,000 per month

Job Description

AfterShip Limited might be able to provide a visa.


We're looking for a DevOps Site Reliability Engineer (SRE) who loves coding and is interested in working closely with engineers to help design, build and maintain high-performing, scalable, testable and reliable services.

We believe SREs play a crucial part in providing engineers with tools, best practices and expertise to help them be responsible for the software they build.


We offer:

- Fun startup culture with diverse and inclusive international teams
- Happy Friday sharing sessions and happy-hour gatherings
- Flexible working hours and unlimited work-from-home
- 5-day work week and 15 days annual leave
- Benefits include medical / dental coverage and performance bonus
- Employment visa sponsorship (if required)


Our tech stack:

- Cloud: Amazon Web Services, Google Cloud Platform
- Database: MongoDB, Google Spanner, Redis, DynamoDB, Amazon Aurora
- Monitoring: New Relic, Pingdom
- Automation: Ansible, Packer
- Continuous Integration: Jenkins, Travis CI
- Tools: Atlassian Suite
- Other SaaS: Algolia, Mixpanel, SendGrid, Twilio, Contentful, Google Analytics
- Deployment: Kubernetes, Docker
- DNS: Cloudflare


Responsibilities:

- Ensuring continuous service delivery with high reliability and operational performance to meet SLAs
- Optimising SaaS and infrastructure usage and cost
- Collaborating with engineering teams to troubleshoot, investigate and respond to infrastructure outages
- Designing and implementing automated architecture solutions to minimise the infrastructure workload of product engineering teams


Requirements:

- 2+ years of hands-on experience with Amazon Web Services or Google Cloud Platform
- Experience with DNS, load balancing, failover strategies, Blue-Green and Canary deployments
- Ability to set up automated monitoring and alerting systems
- Dedication to uptime and service-level objectives
- Excellent password hygiene and strong security awareness
- Some experience with log aggregation, analysis and troubleshooting
- Ability to communicate with multiple teams and IaaS / SaaS vendors effectively