I just came across this year-old tweet by my former Udacity colleague, Mat Leonard.
When I read that, I thought, that’s ironic, because so much of how we taught machine learning at Udacity focused on modeling.
And sure enough, the next tweet in the thread:
I think a lot of labeling and cleaning is outsourced, either to specialty companies or to specialty teams within a larger ML organization. Perhaps there’s an opportunity for ML engineers to learn more about data labeling and cleaning.
Mat, by the way, now leads the education team at OpenMined.
I think it’s safe to say that the emphasis on models (in the ML “lore” writ large) contributes to many hiring managers’ inflated expectations, and to their lack of appreciation for the outsized role that data plays in the process. I speak from experience.