Hacker News

It handles millions, but it can be a lot faster to just pipe output from tar through the ssh connection.
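A minimal sketch of the tar-pipe idea: stream one archive through a pipe instead of negotiating each file individually. This version runs the pipe locally so it's self-contained; over the network you'd insert ssh in the middle, e.g. `tar -C src -cf - . | ssh user@host 'tar -C /dest -xf -'` (user@host and /dest are placeholders).

```shell
#!/bin/sh
set -e

# Demo source tree with a couple of files (stand-ins for millions of small ones).
mkdir -p src dst
echo hello > src/a.txt
echo world > src/b.txt

# Create the archive on stdout and unpack it from stdin on the other side.
# tar -C changes directory before archiving/extracting, so paths stay relative.
tar -C src -cf - . | tar -C dst -xf -
```

The win is that the receiving side just writes a sequential stream; there's no per-file metadata round trip the way there is with rsync's delta protocol.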


Got any benchmarks/write-ups on the subject? I did a bit of testing myself a long time ago and the answer basically ended up being to use rsync, because any differences were marginal. That said, I didn't test with millions of files.


I think perhaps this was a bigger issue back in the day when we were using rotating hard disks. In those days a seek was a lot slower than a sequential write.

Today seeks are mostly instant, so maybe my experience isn't valid anymore.


Your experience is valid today, but for a different reason: if you're comparing millions of tiny files, there's a lot of per-file back and forth. If you're streaming a single archive, rsync only has to check whether that one file has changed.

Like everyone here I've got no benchmarks, but I have been burned trying to rsync around too many small files.


I ran into the "lots of small files" case, and I ended up generating a file list, splitting it, and feeding multiple rsync instances via xargs.



