Data, Deployment, And Modeling

I just came across this year-old tweet by my former Udacity colleague, Mat Leonard.

When I read that, I thought it was ironic, because so much of how we taught machine learning at Udacity focused on modeling.

And sure enough, the next tweet in the thread:

I think a lot of labeling and cleaning is outsourced, either to specialty companies or to specialty teams within a larger ML organization. Perhaps there’s an opportunity for ML engineers to learn more about data labeling and cleaning.

Mat, by the way, now leads the education team at OpenMined.

  1. I think it’s safe to say that the emphasis on models (in the ML “lore” writ large) contributes to a number of hiring managers’ inflated expectations and lack of appreciation for the extremely outsized role that data plays in the process. I speak from experience.


