
Totally agree; everybody in the industry knows about it. However, if you look at https://www.google.com/trends/explore?date=2014-01-01%202017... nobody outside seems to know. I might be wrong, but a lot of people outside the ML community seem hesitant to use ML because they think they don't have enough data. I'm trying to dispel that misconception, if it exists.


Google Trends can tell you a lot of things :)

https://www.google.com/trends/explore?date=2014-01-01%202017...

More seriously though: others have pointed out that finetuning is pretty popular in some subfields, but it's just one hammer in a whole toolbox of techniques necessary to make neural nets train well (even when you have a tonne of data). Standardisation, choice of initialisation, and choice of learning rate schedule all come to mind as other factors which seem simple, but which can have a huge impact in practice.
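Two of those "simple" factors can be sketched concretely. This is a minimal illustration (not from the parent comment): per-feature standardisation to zero mean and unit variance, and a step learning-rate schedule; the `step=30, factor=0.1` constants are arbitrary choices for the example.

```python
import numpy as np

def standardise(x, eps=1e-8):
    # Zero-mean, unit-variance per feature (column); eps avoids
    # division by zero for constant features.
    return (x - x.mean(axis=0)) / (x.std(axis=0) + eps)

def step_lr(base_lr, epoch, step=30, factor=0.1):
    # Decay the learning rate by `factor` every `step` epochs.
    return base_lr * (factor ** (epoch // step))

x = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
xs = standardise(x)          # each column now has ~0 mean, ~1 std
lr = step_lr(0.1, 65)        # two decays have happened by epoch 65
```

Neither step changes what the model can represent; they just make optimisation far better behaved, which is exactly why they're easy to underestimate.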

Of course, each tool has its limitations. The most obvious limitation of finetuning is that you need a network that's already been trained on vaguely similar data. Pretraining on ImageNet is probably not going to help you solve problems where the absolute size of objects matters, for example, because models trained on ImageNet tend to benefit from (and learn) scale invariance.

I wish you luck with nanonets.ai, but I think it's irresponsible to market this as the "1 weird trick" to bring data efficiency to neural nets.


That graph might be more because it's not as exciting. Google searches are probably dominated by hobbyists and casually interested people. If they're not trying to achieve a specific goal, then they might prefer to work out the basics and make it themselves instead of just taking part of someone else's work and reusing it. If you were going to do that for fun, why not go the whole hog and reuse an entire pretrained network?

Personally, I'm a hobbyist and I don't want to know about these shortcuts until I start to need them - which is a stage I might never reach. People who've progressed far enough to need them are probably far fewer than those who are just curious what these words mean.

Another possibility is that the words "transfer learning" are more generally meaningful outside the ML field than the other search terms on the graph, so many of the searches for it may really come from schoolteachers or people in other fields entirely.


Spot on. Taking that argument a step further: the average developer who is not a data scientist or ML researcher might not know about it. Which implies that a dead-simple technique used by most researchers isn't reaching the common developer, even though it's easy enough for them to use.



