Research Engineers

We help accelerate the delivery of data-driven products and services by providing the expertise, standards, and solutions that let us all conduct R&D effectively, while preserving high quality and trust in the results we produce.

Responsibilities

  • We implement, evaluate, and improve prototype statistical models and machine learning algorithms (classic and deep neural networks)
  • We design, implement, operationalise, and monitor production-level machine learning algorithms as a service
  • We proactively evaluate promising technologies (e.g. frameworks, libraries) worth adopting into our workflow
  • We serve as a bridge between our AI team and other engineering teams

Tools

  • We mainly work on Amazon Web Services (EC2, Athena, Glue, EMR, SageMaker, Step Functions, etc.), and for some projects we use their Google Cloud equivalents

  • We use Databricks as our platform for data science and ML engineering

  • We use Apache Spark for distributed data processing and computation

  • We use Apache Airflow for workflow automation

  • Our main programming language is Python, along with the most common libraries and frameworks for ML engineering (e.g. TensorFlow, Keras)

  • We also work a lot with debuggers, profilers, and analyzers
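Since Python is the team's main language, profiling often comes down to the standard library's `cProfile` and `pstats`; a minimal sketch (the workload and function names are purely illustrative):

```python
import cProfile
import pstats

def expensive_transform(rows):
    # Illustrative hot spot: stands in for a real feature transform
    return [r * r for r in rows]

def run_pipeline(n=100_000):
    data = list(range(n))
    return expensive_transform(data)

# Profile the pipeline and report the most time-consuming calls
profiler = cProfile.Profile()
profiler.enable()
result = run_pipeline()
profiler.disable()

stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats(5)
```

The same report can be produced without touching the code via `python -m cProfile -s cumulative script.py`.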

Example projects

Introduce software engineering standards and best practices

  • Incorporate code reviews and pair programming into the development process

  • Adjust the branching strategy to the project's needs

  • Provide QA techniques to improve code quality

  • Integrate projects into CI environments

  • Show the beauty of software craftsmanship
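The source does not name a particular CI system, but as an illustration of the "integrate into CI" step, a minimal GitHub Actions workflow that lints and tests a Python project on every push might look like this (all file names and tool choices are assumptions):

```yaml
# .github/workflows/ci.yml — hypothetical workflow; tool names are assumptions
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: flake8 .
      - run: pytest
```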

R&D process improvements and tools

  • Create an abstraction layer for tracking experiments

  • Provide tooling and know-how for distributed training / distributed data processing

  • Automate workflows
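The experiment-tracking abstraction layer mentioned above could look roughly like this minimal sketch (all names are hypothetical; a production backend might delegate to MLflow or a similar tracker instead of storing values in memory):

```python
from abc import ABC, abstractmethod

class ExperimentTracker(ABC):
    """Backend-agnostic interface: swap tracking backends without touching training code."""

    @abstractmethod
    def log_param(self, name, value): ...

    @abstractmethod
    def log_metric(self, name, value, step=0): ...

class InMemoryTracker(ExperimentTracker):
    """Trivial backend, useful for tests; a real one could wrap an external tracking service."""

    def __init__(self):
        self.params, self.metrics = {}, []

    def log_param(self, name, value):
        self.params[name] = value

    def log_metric(self, name, value, step=0):
        self.metrics.append((name, value, step))

# Training code depends only on the interface, not the backend
tracker = InMemoryTracker()
tracker.log_param("learning_rate", 0.01)
tracker.log_metric("loss", 0.42, step=1)
```

The point of the layer is decoupling: experiments keep the same logging calls while the team changes or compares tracking backends underneath.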

Deliver proofs of concept

  • Compare Databricks with Elastic MapReduce (EMR)

  • Explore dockerization for model evaluation

  • Try out new solutions - the right tool for the right job
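As an illustration of the dockerization idea, a model-evaluation image might be built from a Dockerfile along these lines (the file names and the entrypoint script are assumptions, not the team's actual setup):

```dockerfile
# Hypothetical Dockerfile for reproducible model evaluation
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Evaluate a model artifact mounted into the container at runtime
ENTRYPOINT ["python", "evaluate.py", "--model-path", "/models/model.pkl"]
```

Packaging evaluation this way pins the dependencies, so results are reproducible across laptops and CI runners.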