Tuesday, March 15, 2005

So you want to be a consultant

I saw "So you want to be a consultant?" long ago. I'm looking at it again since someone on a mailing list pointed to it.

I'm learning again. Some of those lessons I haven't internalized yet. It'll take a while, a lot of things do. I'll get there yet :).

Friday, March 11, 2005

rsync and compressed files

I did some testing and found that it is generally better to rsync uncompressed files than the corresponding compressed files or archives. At any rate, tar.gz archives are bad for rsync. Plain tar files are OK.


  1. I took a directory of source code and test data, around 9MB.

  2. copied it to a remote box.

  3. ran tar cvzf on both sides to make one archive, and tar cvf to make another.

  4. on the source box, edited one source file, inserting only one line.

  5. ran tar cvzf and tar cvf again on the source box. The source box should now have sources, a tar and a .tgz which each differ from the remote copies by only one line in only one internal file.

  6. rsync'd each of the three to the remote box. rsync of the source gives a speedup of 450 (14K sent, 94 received), rsync of the tar file gives a speedup of 85000+ (20 bytes sent, 78 bytes received), and rsync of the .tgz gives a speedup of 1.48 (2.4MB sent, 12K or so received).
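For reference, the speedup figure rsync prints at the end of a run is the total size of the files divided by the bytes sent plus the bytes received. A small Python sketch of that arithmetic follows. The function names are made up for illustration and are not part of rsync.

```python
def speedup(total_size: int, sent: int, received: int) -> float:
    """Total file size divided by the bytes that actually crossed the wire."""
    return total_size / (sent + received)


def implied_total_size(reported_speedup: float, sent: int, received: int) -> float:
    """Work backwards from a reported speedup to the total size rsync covered."""
    return reported_speedup * (sent + received)


# The tar run from step 6: 20 bytes sent, 78 bytes received, speedup 85000+.
# 85000 is a lower bound, so the implied size is a lower bound too.
tar_total = implied_total_size(85000, 20, 78)
print(tar_total)
```

For the tar run that works out to a bit over 8.3MB, which is in the same range as the roughly 9MB directory.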

So rsync of a tar file is best (because only one file needs to be analyzed to see where the differences are). rsync of a compressed file (a .tgz at any rate, but probably the output of any compressor) is bad. I'm not sure why, but I wouldn't be surprised if the compressed representation of a piece of data depends on what has come before it, and there may be other effects like that which confound the difference finder, since too much is found to be different.
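Here is a minimal sketch of that suspicion in Python. It uses zlib as a stand-in for gzip and a simplified imitation of rsync's block matching (exact byte comparison at every offset instead of rolling and strong checksums). The block size of 700, the fake source lines and all the function names are assumptions for illustration only.

```python
import random
import zlib

BLOCK = 700  # illustrative block size, an assumption for this sketch


def make_lines(count=4000, seed=0):
    """Generate fake 'source code' lines so the sketch needs no real files."""
    rng = random.Random(seed)
    vocab = ["int", "char", "return", "if", "else", "while",
             "count", "buf", "len", "=", "+", "0", "1"]
    return [" ".join(rng.choice(vocab) for _ in range(8)) + ";"
            for _ in range(count)]


def reusable_fraction(old: bytes, new: bytes, size: int = BLOCK) -> float:
    """Fraction of `new` that can be rebuilt from fixed-size blocks of `old`.

    `old` is cut into blocks at multiples of `size`; `new` is scanned at
    every byte offset for a window identical to one of those blocks.
    """
    old_blocks = {old[i:i + size] for i in range(0, len(old), size)}
    matched = 0
    i = 0
    while i < len(new):
        window = new[i:i + size]
        if window in old_blocks:
            matched += len(window)  # whole window is already on the far side
            i += len(window)
        else:
            i += 1  # slide one byte and try again
    return matched / len(new)


lines = make_lines()
# Insert a single line near the start, as in step 4 of the test.
edited = lines[:10] + ["// one inserted line"] + lines[10:]

old_plain = "\n".join(lines).encode()
new_plain = "\n".join(edited).encode()

plain_fraction = reusable_fraction(old_plain, new_plain)
gz_fraction = reusable_fraction(zlib.compress(old_plain),
                                zlib.compress(new_plain))

print("reusable, uncompressed:", round(plain_fraction, 3))
print("reusable, compressed:  ", round(gz_fraction, 3))
```

The uncompressed fraction should come out close to 1, since everything after the inserted line is just shifted along and still matches. The compressed fraction should come out far lower, because the compressed bytes after the edit no longer line up with anything in the old file.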

Friday, March 04, 2005

Corporate peer-to-peer

is almost always a mistake. At any rate, it is a mistake wherever bandwidth is expensive.

I was just talking to someone at a company I do some consulting for. I was working remotely, and the link was ridiculously slow. Ping times were at around 1 second, and sometimes 1.5 seconds. I could still work (I've got some techniques involving rsync for very bandwidth-starved links, and I just type ahead), but I could work better if the link weren't so slow.


So I talked about the serious need for corporations to take steps to block p2p. Since it's probably impossible to block it completely, they should also do as much as they can to monitor p2p, and have a policy about p2p use (probably that it is not allowed at all, that it will be blocked and monitored, and that violations will affect performance reviews).

That may sound draconian, but it's necessary.


  1. Bandwidth costs money. Even if it were cheap, if peer to peer didn't soak up bandwidth the company wouldn't need as much of it and could contract for less, thus paying less every month. That's money that goes straight to the bottom line.

  2. The company I'm using for my example runs its own publicly accessible mail and web servers, and therefore all of its bandwidth is fixed IP. That's a bit of a bug on the part of IT management. They could go with 80% dynamic IP bandwidth and 20% fixed IP for mail and web. They would save quite a bit of money right there, since fixed IP bandwidth carries a very high premium in the Philippines: dynamic bandwidth would cover staff time-wasting surfing, and fixed IP bandwidth would be bought only for those services that require it.

  3. In a litigious world, it's for the company's good that peer to peer is blocked and violations monitored and punished. The same company has received a warning letter from a RIAA/MPAA related agency. Apparently someone had left their BitTorrent client on and had been downloading and serving enough files to attract someone's notice.


Naturally, this sort of thinking won't sit well with employees. But frankly, I don't think it matters. The staff aren't being monitored for wasteful surfing (perhaps half of all surfing at the office is wasteful, either not work related or only very peripherally so), so their surfing for entertainment is a free benefit of employment. It's only fair that those online activities which might be damaging to the company be disabled so that other online activities of neutral or only mildly negative value may be allowed.