We accelerate the delivery of data-driven products and services by providing expertise, standards, and solutions that help us all conduct R&D efficiently while preserving high quality and trust in the results.
We mainly work on Amazon Web Services (EC2, Athena, Glue, EMR, SageMaker, Step Functions, etc.), and for some projects we use the Google Cloud equivalents
We use Databricks as our platform for data science and ML engineering
We use Apache Spark for distributed data processing and computation
We use Apache Airflow for workflow automation
Our main programming language is Python, along with the most common libraries and frameworks for ML engineering (e.g., TensorFlow, Keras)
We also work a lot with debuggers, profilers, and analyzers