Job description & requirements
We’re looking for experienced Site Reliability Engineers to join our team. We’re growing fast globally with increasing traffic and load, while also iterating and executing on many product and platform improvements. As part of our SRE team, you would own and continue to build, automate and scale our platform infrastructure, ensuring it runs smoothly, efficiently and reliably. You would also develop solutions for continuous monitoring and incorporate stress testing to improve our system performance and uptime. You will work alongside our software engineers to support them in CI/CD.
You will have ample opportunities to experiment and go in-depth with new technologies, gain deeper experience in managing production workloads and more.
- Ideally 3-5+ years of development experience and extensive operations experience
- Experience managing high traffic/load services or platforms
- Familiarity with AWS (including VPC, EC2, ELB, and RDS), virtual networking topologies, Chef, PowerShell/DSC, Bash and more
- Experience with configuration management and provisioning tools, such as Chef, Puppet, Salt, Vagrant, etc.
- Experience using container technologies (Docker, Kubernetes)
- Knowledge of most of the following: monitoring, alerts, logging, build/deploy, service discovery, load detection, scaling, self-healing, auto-testing, cloud security
- Experience with intrusion prevention/detection (IPS/IDS) processes
- Experience with logging infrastructure and tools such as Elasticsearch, Kibana, Graphite and Splunk
- Knowledge of network theory, such as TCP/IP, UDP, ICMP, MAC addresses, IP packets, DNS, OSI layers, and load balancing
What’s it like working at PigeonLab Pte Ltd?
Flat hierarchy, open plan office, stocked pantry, flexible working hours. Smart, experienced and capable team. Everyone has equal opportunities and is valued. Fast-paced, output-driven, with an emphasis on user experience.