Position Summary:
We are seeking a Senior Staff Engineer with 12–15+ years of experience to lead the design and development of self-serve platforms for real-time ML deployment and advanced data engineering workflows. The role demands deep expertise in cloud-native platform engineering, scalable infrastructure development, and end-to-end deployment of machine learning solutions that support business-critical applications.
Responsibilities:
- Lead the design and development of scalable microservices-based platforms using Kubernetes, Docker, and Python (FastAPI) to manage ML workflows and data pipelines (see the FastAPI sketch after this list).
- Architect and manage real-time ML inference platforms on services such as AWS SageMaker and Databricks, ensuring model versioning, monitoring, and lifecycle management (see the SageMaker sketch after this list).
- Build and optimize large-scale ETL/ELT pipelines using PySpark and Pandas, and manage feature stores that deliver consistent, high-quality data (see the PySpark sketch after this list).
- Design and implement distributed data pipelines integrating data stores such as DynamoDB, PostgreSQL, MongoDB, and MariaDB.
- Oversee the creation of data lakes and data warehouses to support advanced analytics and ML workflows.
- Architect and maintain robust CI/CD pipelines with tools such as Jenkins and GitHub Actions for automated testing and deployment.
- Drive automation of data validation and monitoring to ensure data consistency and quality (see the validation sketch after this list).
- Collaborate with cross-functional teams to understand business needs and translate them into technical solutions.
- Provide thought leadership, mentor junior engineers, and contribute to technical decision-making across the organization.
- Document and maintain comprehensive technical artifacts for architecture and workflows.
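For illustration only: a minimal sketch of the kind of FastAPI-based inference service referenced in the first bullet. The /predict route, request schema, and scoring logic are hypothetical placeholders, not details from this posting.

```python
# Minimal FastAPI inference-service sketch. The route, schema, and
# scoring logic below are hypothetical placeholders.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    features: list[float]

class PredictResponse(BaseModel):
    score: float

@app.post("/predict", response_model=PredictResponse)
def predict(req: PredictRequest) -> PredictResponse:
    # Placeholder logic; a real service would score with a versioned model.
    score = sum(req.features) / max(len(req.features), 1)
    return PredictResponse(score=score)
```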
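Likewise, real-time inference against a managed endpoint (second bullet) often reduces to a call of this shape; the endpoint name and JSON payload are assumptions for illustration.

```python
# Hedged sketch of invoking a real-time SageMaker endpoint via boto3.
# The endpoint name and payload shape are illustrative assumptions.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def invoke(endpoint_name: str, features: list[float]) -> dict:
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,  # e.g. a versioned model endpoint
        ContentType="application/json",
        Body=json.dumps({"features": features}),
    )
    return json.loads(response["Body"].read())
```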
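For the ETL/ELT bullet, a representative PySpark aggregation might look like the following; all paths, tables, and column names are hypothetical.

```python
# Minimal PySpark ETL sketch; paths and column names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example-etl").getOrCreate()

# Read raw events from a hypothetical location.
events = spark.read.parquet("s3://example-bucket/raw/events/")

# Aggregate per-user daily purchase features.
daily_features = (
    events
    .filter(F.col("event_type") == "purchase")
    .groupBy("user_id", F.to_date("event_ts").alias("event_date"))
    .agg(
        F.count("*").alias("purchase_count"),
        F.sum("amount").alias("total_spend"),
    )
)

# Write to a hypothetical feature location.
daily_features.write.mode("overwrite").parquet("s3://example-bucket/features/daily/")
```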
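Finally, the data-validation bullet typically implies automated checks of this shape; the columns and rules below are invented for illustration.

```python
# Hedged sketch of an automated data-quality check; the column names
# and rules are hypothetical, not taken from this posting.
import pandas as pd

def validate_events(df: pd.DataFrame) -> list[str]:
    """Return human-readable data-quality failures for an events frame."""
    failures = []
    if df["user_id"].isna().any():
        failures.append("user_id contains nulls")
    if (df["amount"] < 0).any():
        failures.append("amount contains negative values")
    if df.duplicated(subset=["event_id"]).any():
        failures.append("duplicate event_id rows")
    return failures
```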
Required Skills & Qualifications:
- 12–15+ years of experience in platform engineering, data engineering, or DevOps roles with a focus on machine learning.
- Advanced proficiency in Python, PySpark, FastAPI, Kubernetes, and Docker.
- Deep expertise in AWS services, including SageMaker, Lambda, DynamoDB, EC2, and S3.
- Extensive experience designing and optimizing distributed data pipelines with Databricks, PostgreSQL, and other database solutions.
- Proven experience building and managing CI/CD pipelines with tools like Jenkins and GitHub Actions, and troubleshooting with observability tools such as New Relic.
- Strong leadership skills with a track record of mentoring and guiding engineering teams.
- Exceptional communication skills, with the ability to explain complex technical solutions to stakeholders.