
Just curious, do you have direct experience with big companies using R, Python, etc. in production? My sense from working with people from those companies (and a few internships at those companies) is that you might use something like R, MATLAB, or scikit-learn on your own workstation with a tiny data sample to explore the data, but for production you then do the crunching by translating that program into Java or C++ code (sometimes using specialist libraries) and running it in parallel.

Do people actually skip that step and directly deploy their R? That seems really scary.



Having worked with several large (Fortune 500) data science teams, I've found they generally do their development in Python and then throw their models over the fence for us to productionize with Java.

The major difference I've seen between most of these companies is whether they've embraced Java 8 yet.


This is my experience as well. The data science people, who need to do proofs of concept, use Python. The people who have to write large, complicated codebases that scale well and accept years of modification use statically-typed languages like Java and C#.


To be honest, until recently I'd have said that everything would be rewritten in Java, mostly since there were already deployment pipelines for it.

With the latest container-based toolchains I think it's not a bad idea to deploy some R-based services for certain types of tasks.


With the work Microsoft have been doing with the former Revolution Analytics, you absolutely can go into production on R if you want to.



