Machine Learning Engineer – MLOps Lead

New Jersey, NJ | Remote | US | via direct
// Job Type
Full Time
// Salary
Not disclosed
// Posted
4 months ago
// Work Mode
remote

About the Role

<span><span><b>Job Title: Machine Learning Engineer – MLOps Lead<br /> Duration: Contract role<br /> Location: Remote, United States</b></span></span><br /> <br /> <span><span><b>Role Mission</b></span></span><br /> <span><span>You are being hired to productionize machine learning at scale: eliminating fragile pilot models, building hardened MLOps pipelines, and delivering compliant, monitored, and continuously improving ML systems that directly support business operations.</span></span><br /> <span><span>Your success is measured not by “knowing tools,” but by deploying, stabilizing, and scaling real ML systems in production.</span></span><br /> <br /> <span><span><b>First-Year Outcomes (What You Must Deliver)</b></span></span><br /> <span><span><b>Within the First 30 Days</b></span></span> <ul> <li class="MsoNoSpacing"><span><span>Fully assess the current ML pipelines, data flows, and deployment architecture</span></span></li> <li class="MsoNoSpacing"><span><span>Identify the top 3 reliability, security, and performance risks in the current ML lifecycle</span></span></li> <li class="MsoNoSpacing"><span><span>Produce a documented MLOps modernization roadmap</span></span></li> </ul> <br /> <span><span><b>Within 90 Days</b></span></span><br /> <span><span><b>You will:</b></span></span> <ul> <li class="MsoNoSpacing"><span><span>Stand up standardized CI/CD pipelines for model training, validation, and deployment</span></span></li> <li class="MsoNoSpacing"><span><span>Implement automated monitoring, alerting, and versioning across active production models</span></span></li> <li class="MsoNoSpacing"><span><span>Deploy at least one business-critical ML model into hardened production pipelines</span></span></li> <li class="MsoNoSpacing"><span><span>Establish security, audit, and compliance controls for model governance</span></span></li> <li class="MsoNoSpacing"><span><span>Reduce model deployment cycle time by 30–50%</span></span></li> </ul> <br /> <span><span><b>Within 180
Days</b></span></span><br /> <span><span><b>You will:</b></span></span> <ul> <li class="MsoNoSpacing"><span><span>Operate a fully standardized enterprise MLOps framework (MLflow/Kubeflow/Airflow-based)</span></span></li> <li class="MsoNoSpacing"><span><span>Enable continuous retraining and automated rollback</span></span></li> <li class="MsoNoSpacing"><span><span>Achieve ≥ 99.5% model uptime</span></span></li> <li class="MsoNoSpacing"><span><span>Establish a retraining cadence that improves model accuracy and reliability quarter over quarter</span></span></li> <li class="MsoNoSpacing"><span><span>Mentor junior engineers and codify ML engineering standards</span></span></li> </ul> <br /> <span><span><b>Ongoing Success Metrics</b></span></span> <table class="Table"> <thead> <tr> <td><span><span><b>Metric</b></span></span></td> <td><span><span><b>Target</b></span></span></td> </tr> </thead> <tbody> <tr> <td><span><span>Production model uptime</span></span></td> <td><span><span>≥ 99.5%</span></span></td> </tr> <tr> <td><span><span>Model deployment cycle time</span></span></td> <td><span><span>30–50% reduction</span></span></td> </tr> <tr> <td><span><span>Automated pipeline coverage</span></span></td> <td><span><span>100%</span></span></td> </tr> <tr> <td><span><span>Compliance audit readiness</span></span></td> <td><span><span>Continuous</span></span></td> </tr> <tr> <td><span><span>Model accuracy improvement</span></span></td> <td><span><span>Measurable quarter-over-quarter gains</span></span></td> </tr> </tbody> </table> <br /> <span><span><b>What You Will
Build</b></span></span> <ul> <li class="MsoNoSpacing"><span><span>End-to-end MLOps pipelines (data → training → testing → deployment → monitoring → retraining)</span></span></li> <li class="MsoNoSpacing"><span><span>Kubernetes-based model serving platforms</span></span></li> <li class="MsoNoSpacing"><span><span>Cloud ML platforms (Vertex AI / SageMaker / Azure ML)</span></span></li> <li class="MsoNoSpacing"><span><span>CI/CD automation for ML systems</span></span></li> <li class="MsoNoSpacing"><span><span>Model observability and alerting using Prometheus / Grafana</span></span></li> <li class="MsoNoSpacing"><span><span>Secure, version-controlled ML governance frameworks</span></span></li> </ul> <br /> <span><span><b>Required Experience (Performance Evidence)</b></span></span><br /> <span><span><b>You must have:</b></span></span> <ul> <li class="MsoNoSpacing"><span><span>Delivered production ML pipelines (not just experiments)</span></span></li> <li class="MsoNoSpacing"><span><span>Built CI/CD for ML models in Kubernetes environments</span></span></li> <li class="MsoNoSpacing"><span><span>Implemented monitoring, retraining, and version governance</span></span></li> <li class="MsoNoSpacing"><span><span>Delivered at least one enterprise-scale ML deployment</span></span></li> <li class="MsoNoSpacing"><span><span>Worked hands-on with MLflow / Kubeflow / Airflow</span></span></li> <li class="MsoNoSpacing"><span><span>Deployed ML to production in the cloud (AWS, GCP, or Azure)</span></span></li> <li class="MsoNoSpacing"><span><span>Demonstrated a strong Python engineering background</span></span></li> </ul>

Use our AI to tailor your resume for this Machine Learning Engineer – MLOps Lead position at CoSourcing Partners.