As a Data Engineer at Jungle, you will operate and further develop our computational backbone. To help our data science teams solve industry-changing problems, we have been developing a distributed computing infrastructure tailored to time-series analysis.
It makes crunching multiple TBs of sensor data, extracting features from it and training models on it feel like running a scikit-learn algorithm on a 100 MB dataset. Although we don't expect you to have hands-on experience in machine learning, you will inevitably learn how it works and what it takes to apply such technology to, for example, predict the future state of a factory with 1500+ sensors.
Day to day, you will work with distributed computing, distributed file storage, data version control, AWS & Azure, containers & orchestration (e.g. Kubernetes), stateless systems, etc. You will spend time understanding the needs of the engineers within our team and lead the effort to further develop the infrastructure they need to do their work better and faster.
As a last note: what we're building here does not exist yet. That means you will have to research, try, test, and solve loads of new technical challenges!
- We enjoy working on big problems with big impact. In our focus areas they go hand in hand with large amounts of historical and streaming data. We need you to help us further develop our in-house infrastructure to handle these challenges.
- We are building technology to remove as much manual work from our activities as possible. By contributing to our tools you will enable Jungle engineers to work more efficiently.
- We rapidly roll out existing products across different clients within an industry. You will speed up the process by automating the most time-consuming parts of these rollouts.
- Our clients depend on our predictions being highly available. You will build systems that can guarantee this uptime.
- As machine learning is becoming a common term, we need you to help the world dream about the possibilities of this technology whilst remaining humble and down to earth.
- You dream in shell and you're fluent in performant Python
- You understand state management and stateless systems
- You have demonstrable work with concurrent/distributed systems
- You can comfortably articulate data flow considerations to other engineers
- You have experience with container technology and orchestration
- You are very curious and won't stop searching until you find the answer
- You are fluent in English, and know how to order a pastel de nata in Portuguese (bonus points for other pastries)
- You're curious about machine learning, both the technology and its applications