Kuala Lumpur, Federal Territory of Kuala Lumpur, MY, 50470
ML Ops Data Engineer
This role bridges the gap between data science and IT operations, enabling seamless model lifecycle management and scalable ML infrastructure. It is also responsible for designing, developing, and optimizing data pipelines, ensuring reliable and efficient workflows for structured and unstructured data across Boost’s use cases, and in particular for supporting Boost in adopting unified data modelling across AI, machine learning, and analytics projects. The role collaborates with cross-functional teams, including data scientists, analysts, software engineers, and DevOps, to optimize the production and deployment of machine learning solutions and to support and enhance data-driven decision-making and analytics.
SCOPE & AUTHORITY
- Develop and automate deployment pipelines for machine learning models, ensuring a smooth transition from development to production.
- Implement monitoring systems to track model performance and stability, and address model drift or data drift issues.
- Design, build, manage, and maintain the end-to-end model lifecycle process, including but not limited to: pipelines specific to machine learning models, model CI/CD, model performance evaluation and monitoring, model experiments, model observability, and model deployment, versioning, and serving (see the MLflow sketch after this list).
- Manage and optimize ML infrastructure and cloud platforms (e.g., AWS, GCP, Azure, Databricks, Snowflake).
- Work with data scientists and DevOps to support model scalability, reliability, and reproducibility in production.
- Maintain documentation for model pipelines, infrastructure, and deployment procedures to ensure consistency and compliance.
- Design, develop, manage, and maintain scalable ETL/ELT data pipelines to support ingestion, transformation, and integration of data from multiple sources (see the Airflow sketch after this list).
- Design and implement robust data architectures, data models, and storage solutions to ensure efficient data processing and accessibility.
- Optimize and manage data warehouses, ensuring high availability, reliability, and performance.
- Implement data quality checks, monitoring, and governance protocols to maintain the accuracy, consistency, and integrity of data, including access management.
- Identify and address performance bottlenecks in implemented solutions.
- Maintain thorough documentation of data pipelines, catalogs, workflows and lineage for transparency and reproducibility.
- Support the ongoing Centralized Data Platform initiative between Boost entities, aiming to create a single view of Boost users and merchants.
- Automate data pipelines to reduce manual effort and improve overall efficiency.
- Conform to best practices when designing or implementing solutions.
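To illustrate the model lifecycle responsibilities above, here is a minimal sketch of experiment tracking and model versioning, assuming MLflow (listed under Qualifications below); the experiment name, model name, and toy training data are hypothetical stand-ins, not Boost specifics.

```python
# Minimal MLflow sketch: track a training run and register a versioned model.
# Assumptions: a local SQLite backend stands in for a shared tracking server;
# "churn-model" and the toy dataset are hypothetical.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

mlflow.set_tracking_uri("sqlite:///mlflow.db")  # in production, a shared tracking server
mlflow.set_experiment("churn-model")            # hypothetical experiment name

# Toy data standing in for a real feature pipeline.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X, y)
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Registering the model assigns it a version that downstream CI/CD,
    # monitoring, and serving stages can promote or roll back.
    mlflow.sklearn.log_model(model, "model", registered_model_name="churn-model")
```

Registering a model on each training run is what makes the versioned promotion and rollback described in the CI/CD and serving duties possible.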
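Likewise, a minimal sketch of a scheduled ETL/ELT pipeline, assuming Apache Airflow as the orchestrator; the DAG id, task callables, and data below are hypothetical stubs.

```python
# Minimal Airflow sketch: a daily extract -> load pipeline.
# Assumptions: Airflow 2.x; extract_orders/load_orders are hypothetical stubs.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_orders(**context):
    # Pull raw records from a source system (stubbed here); the return
    # value is pushed to XCom automatically for downstream tasks.
    return [{"order_id": 1, "amount": 120.50}]

def load_orders(**context):
    # Read the upstream output from XCom and load it into the warehouse (stubbed).
    rows = context["ti"].xcom_pull(task_ids="extract_orders")
    print(f"Loading {len(rows)} rows into the warehouse")

with DAG(
    dag_id="orders_daily_etl",      # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",              # Airflow 2.4+; use schedule_interval on older versions
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_orders", python_callable=load_orders)
    extract >> load  # run extract before load
```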
QUALIFICATIONS
- Bachelor’s Degree in Computer Science, Data Engineering, Machine Learning, or a related field, with a minimum of 3 years of experience in an MLOps/Data Engineering role designing, developing, and maintaining large-scale data warehouse and analytics projects.
- Strong problem-solving and collaboration skills, adaptability to evolving technology, commitment to process improvement, attention to detail, and the ability to communicate technical concepts effectively to non-technical stakeholders.
- Strong knowledge of cloud platforms for data solutions (AWS/Azure/GCP/Databricks/Snowflake).
- Strong knowledge of ETL/ELT tools (e.g., Apache Airflow, AWS Glue, Databricks Jobs/Pipelines).
- Proficiency in data modeling and schema design.
- Proficiency in programming and scripting languages such as Python and Bash.
- Proficiency in SQL and data warehouses (e.g., Redshift, BigQuery, Databricks, Snowflake, or similar).
- Familiarity with ML/MLOps frameworks (e.g., MLflow, TensorFlow, PyTorch, scikit-learn).
- Familiarity with data governance frameworks and best practices.
- Familiarity with data lake architectures.
- Familiarity with big data processing frameworks (e.g., Apache Spark, Hadoop).
- Familiarity with Infrastructure as Code (IaC) tools (e.g., Terraform, Terragrunt, Serverless Framework).
- Experience with CI/CD tools (e.g., Jenkins, GitLab CI/CD).