Wednesday, April 25, 2007

graphicsmagick and photobucket

I just discovered photobucket and have been uploading pictures there. I like that I can't even find (in some desultory FAQ searching) what their bandwidth or space limits are. And I really like that I can display those images from elsewhere (can't do that with flickr, need to surf to the flickr link).

I have no idea how these guys make money, but there must be money being made somehow.

When uploading images though, I've just been pushing up 2.5MB images (because my camera is set to take pictures at a very high resolution, have 1GB, will use it all ;-). Even at that high resolution I rarely ever run out of space on the memory stick. I've run out of battery 3 times more than I've run out of memory.

But it's impolite to push up images that large. I don't even dare show them on my other me because they're just too big. It took me all of five days to get up the initiative to install graphicsmagick. I used to use imagemagick and graphicsmagick is a fork of that project. Imagemagick is still around, but I'm just testing stuff out and decided to download graphicsmagick instead.

So now it's easy to increase the jpeg compression while at the same time scaling images to a lower resolution. I haven't decided yet what resolutions to use for thumbnail size and medium size, I'll look around and figure that out tomorrow. I'll ask around too, for what people think is a good "standard" thumbnail size.

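mkdir medium
cd medium
for fn in ../*.jpg
do
    # scale down and recompress each original into medium/
    gm convert -scale 800x600 -quality 75 "$fn" "$(basename "$fn")"
done

cd ..
mkdir thumbnails
cd thumbnails
for fn in ../*.jpg
do
    # [xw]x[yh] is a placeholder, the thumbnail size isn't decided yet
    gm convert -scale "[xw]x[yh]" -quality 75 "$fn" "$(basename "$fn")"
done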

And I'll just have a script to rotate images left or right.
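A rotate script could be as small as the following sketch. It's written in Python rather than shell, and it assumes `gm` is on the PATH. The function names and the left/right mapping are mine, not from any existing script. `gm convert -rotate` takes degrees, with positive values turning the image clockwise.

```python
import subprocess

# Assumption: "right" means 90 degrees clockwise, "left" means 270 degrees
# clockwise (the same as 90 degrees counter-clockwise).
ROTATIONS = {"left": "270", "right": "90"}


def rotate_command(direction, src, dst):
    """Build the gm command line that rotates src into dst."""
    return ["gm", "convert", "-rotate", ROTATIONS[direction], src, dst]


def rotate(direction, src, dst):
    """Run the rotation; raises CalledProcessError if gm fails."""
    subprocess.run(rotate_command(direction, src, dst), check=True)
```

Calling `rotate("left", "img_0001.jpg", "rotated/img_0001.jpg")` would write the rotated copy and leave the original alone (the file names are made up).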

Tuesday, April 24, 2007

counting your money

I see that US$20 bills have RFID. And so do some euro banknotes.

I don't care that much since I don't intend to visit the U.S. or use U.S. currency for much of anything (if I do, it'll be to receive payment and convert it immediately). And I doubt if I'll ever have euros in any interesting amounts.

What I do wonder is, is it going to be possible for someone walking by (or someone with a long-range RFID gun) to count how much money is in someone's wallet? Or if not how much exactly, at least count the number of bills there?

Do all the bills radiate the same signal? If they do, then what help would that be with respect to counterfeiting? It shouldn't be hard to create a counterfeit that would radiate the same signal (it might be too expensive today, but I bet 10 years from now it'll cost nothing). And if the bills don't all radiate the same signal (e.g., they radiate something correlated to the bill's denomination and serial number), then it might be possible to deduce how much money is in one's wallet. If it's encrypted, that won't matter much. Banks will need to know how to decrypt it, and once it's in a few hundred bank branches, one of those decrypting boxes is going to get stolen (might even get stolen during a bank robbery) and reverse engineered.

Of course, though, it wouldn't be bad, in the eyes of any government, if people were to shy away from cash transactions and go with electronic transactions instead. Taxes are easier to collect that way.

Wednesday, April 18, 2007

I get pissed off when I look at old code like this:

1. No vertical whitespace between the pg_exec and the if.
Alright, the developer was just ignorant and was following the
bad standards in his environment.

if (pg_numrows($rs)==0) {
} else { # if (pg_numrows($rs)==0) {

(dammit, I want to change my title because of idiots like this).

DID NO ONE TEACH YOU != OR <> (yech, but at least it does
exist in PHP).

That code is wrong anyway, pg_numrows will return -1 on error
(e.g., network goes down, db server reboots, db server daemon
dies, connection disconnected because the server restarts because
someone did a kill -9 on a client).

3. I'm sorry. K&R indentation is dumb. Yes, it saves lines, but
we're not in the age of teletypes and 300baud modems anymore,
there's no great need to save screen refreshes. Vertical space
is valuable, sure, but clarity and maintainability are more
valuable.

if (pg_numrows($rs)>0)
--- read the rows and do something with them

and suddenly there's no else to clutter things up.
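To spell out the three cases (error, no rows, some rows), here is a rough sketch in Python rather than PHP. `num_rows` stands in for whatever pg_numrows returned, and `fetch_rows` is a hypothetical callable that reads the result set. Neither name comes from the original program.

```python
def handle_result(num_rows, fetch_rows):
    """Check the error sentinel first, then only act when there are rows."""
    if num_rows < 0:
        # -1 means the query itself failed, which is not the same as "empty"
        raise RuntimeError("query failed, row count unavailable")
    if num_rows > 0:
        return list(fetch_rows())
    # zero rows: nothing to do, and no else branch is needed anywhere
    return []
```

The point is just that "== 0 ... else" lumps the error case in with the rows case, while testing "> 0" keeps them apart.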

There are a lot more stupidities in that program, but this scheisse makes me tired. If that developer were still around I'd fire his ass. Yes, I'm looking at you, A.M.

gmail problem

It's close to 8AM in the Philippines now, so around midnight UTC. We just had a short brownout. It's probably because of summer airconditioning power demand. Last night we had a short (less than 10 minutes) brownout too. This is better than when I came back to the Philippines in 1993 or so when there were 8-12 hour brownouts. Summer nights were terrible then.

I was in the middle of drafting an important email (Timmy's baptism requirements) on gmail when the power cut out. Now that it's back (again, less than 10 minutes), I can't get to gmail. Other sites, and blogger here, are fine. Gmail just isn't replying though. It isn't a DNS issue, the domain resolves. It just isn't replying yet. Ah well, I'll send myself the incomplete draft from one of my other web-based free email addresses.

Friday, April 13, 2007

pgsql COPY doesn't fire rules

I've got a very large database at $DAYJOB. Two and a half years worth of data takes up around 500GB. The largest table is around 80GB. I use postgresql, and with the right indexes, performance is very good. I've sort of become a specialist at database performance tuning since I've never had the benefit of working with anyone else who could avoid sequential scans on large tables.

So raw query or insert/update/delete performance isn't a problem. However, database management *is* a problem. On rare occasions, I need to vacuum a table, or many tables. That's very hard to do when the table is so large (I stopped the vacuum after 2 hours). It's the same thing when I have to back up a table. pg_dump -t takes a very long time to scan a very large table.

So I decided that I needed to partition the larger tables, perhaps into per-month tables. My program to load data into the database uses COPY because that's the fastest way to load data into a table. Unfortunately, COPY doesn't fire rules (it does fire triggers and check constraints, but not rules), so the ON INSERT DO INSTEAD rules didn't fire and the data was still going into the base (parent) tables.

Just yesterday I was scanning through the pgsql-general list and saw Tom Lane say that COPY doesn't fire rules. It didn't strike me as significant at the time since I was thinking of something else. Tonight though, as I was working on it again and, again, finding that rows were going into the base table (or the whole transaction failing, since I decided to add a constraint on the parent so that no rows could insert into it at all), I finally put it together and knew why COPY wasn't doing what I wanted.

It was a simple thing to replace the COPY code with insert statements. Inserts will run more slowly now (although I might get some performance gain by upgrading to 8.2 and using the new multi-row insert syntax), but the database will be much more maintainable. Queries will probably run faster too, since most of them will be against the most recent months, so indexes and whole tables for the most recent months will be more likely to fit in the OS cache/buffers and postgres' shared memory.
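For the multi-row insert idea, a loader could batch rows roughly like this. It's only a sketch in Python: the table name `events`, its columns, and the batch size are made up, and the `%s` placeholders assume a DB-API driver such as psycopg2. The inserts still go to the parent table, so the ON INSERT DO INSTEAD rules get a chance to route them.

```python
def batches(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def multirow_insert(table, columns, rows):
    """Build one INSERT ... VALUES (...), (...) statement plus its parameters."""
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
        + ", ".join([one_row] * len(rows))
    )
    # flatten the rows so the parameters line up with the placeholders
    params = [value for row in rows for value in row]
    return sql, params
```

Each `(sql, params)` pair would then go to something like `cursor.execute(sql, params)`, one call per batch instead of one per row.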

In any case, I'll keep the old non-partitioned database and the new partitioned database, inserting data into both. When I prove that the new partitioned database is stable and faster, I'll retire the old non-partitioned database.

My last concern is aesthetic. With new rules being added every month (and 2 or more years of data being loaded) a \d on a base table will yield a very ugly list of ON INSERT DO INSTEAD RULES. There's nothing to be done about that though, unless maybe I move the parent tables into a schema and replace them in the public schema with views :-).
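Since a new child table and rule are needed every month, generating the DDL is an obvious thing to script. This sketch assumes the usual inheritance-plus-rule layout. The parent table `events`, the date column `logged_on`, and the `<parent>_YYYY_MM` naming scheme are all hypothetical, not the names used at $DAYJOB.

```python
from datetime import date


def month_bounds(year, month):
    """First day of the month and first day of the following month."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def partition_ddl(parent, date_column, columns, year, month):
    """Return the CREATE TABLE and CREATE RULE statements for one month."""
    start, end = month_bounds(year, month)
    child = f"{parent}_{year:04d}_{month:02d}"
    condition = (
        f"{date_column} >= DATE '{start.isoformat()}' AND "
        f"{date_column} < DATE '{end.isoformat()}'"
    )
    new_values = ", ".join(f"NEW.{column}" for column in columns)
    create_table = (
        f"CREATE TABLE {child} (CHECK ({condition})) INHERITS ({parent});"
    )
    create_rule = (
        f"CREATE RULE {child}_insert AS ON INSERT TO {parent} "
        f"WHERE ({condition}) DO INSTEAD "
        f"INSERT INTO {child} ({', '.join(columns)}) VALUES ({new_values});"
    )
    return [create_table, create_rule]
```

Run once a month (or once per month of backlog), this prints the two statements to feed to psql. It does nothing about the ugly \d listing, of course.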

Thursday, April 12, 2007

gmailfs slow to connect - fix

I installed gmailfs using apt-get (on my Ubuntu Edgy laptop) because I thought it'd be interesting to see what file format gmail stores emails in. And I thought I'd download all my spam too, to see if my bogofilter classifies it correctly, and to train bogofilter with the ones it gets wrong.

I thought there was something wrong because it just wasn't mounting. A look at gmailfs.log shows:

04/12/07 19:28:15 ERROR gmailfs did not connect in less than 12 seconds, aborting...
04/12/07 19:28:15 WARNING Child process 24621 received SIGHUP, exiting...
04/12/07 19:28:15 INFO Successfully reaped child 24621

Clearly it was timing out. I tried a few more times and finally edited /usr/bin/mount.gmailfs. There's a define right there for the GMAILFS_MOUNTING_MAX_DELAY so I just set that to something higher (64, from the default of 12). Running mount.gmailfs under time shows that it's taking around 30s to complete.

Ah, except the filesystem looks empty. I can't see my email there. Copying files there makes email appear in my gmail Inbox, but I can't get my spam. Ah well. It's still cool, but I'm not sure I can see what use it's going to be to me :).

Tuesday, April 10, 2007

Who would win?

Somewhere at around 1/6th of the way down this Slashdot article Neal Stephenson describes a battle between himself and William Gibson.


In the first instance,

You don't have to settle for mere idle speculation. Let me tell you how it came out on the three occasions when we did fight.

The first time was a year or two after SNOW CRASH came out. I was doing a reading/signing at White Dwarf Books in Vancouver. Gibson stopped by to say hello and extended his hand as if to shake. But I remembered something Bruce Sterling had told me. For, at the time, Sterling and I had formed a pact to fight Gibson. Gibson had been regrown in a vat from scraps of DNA after Sterling had crashed an LNG tanker into Gibson's Stealth pleasure barge in the Straits of Juan de Fuca. During the regeneration process, telescoping Carbonite stilettos had been incorporated into Gibson's arms. Remembering this in the nick of time, I grabbed the signing table and flipped it up between us. Of course the Carbonite stilettos pierced it as if it were cork board, but this spoiled his aim long enough for me to whip my wakizashi out from between my shoulder blades and swing at his head. He deflected the blow with a force blast that sprained my wrist. The falling table knocked over a space heater and set fire to the store. Everyone else fled. Gibson and I dueled among blazing stacks of books for a while. Slowly I gained the upper hand, for, on defense, his Praying Mantis style was no match for my Flying Cloud technique. But I lost him behind a cloud of smoke. Then I had to get out of the place. The streets were crowded with his black-suited minions and I had to turn into a swarm of locusts and fly back to Seattle.

and, yes, there's quite a lot more ;-)

I haven't read all of Stephenson, although I read Snow Crash (+5) and Cryptonomicon (-3) several times. I might eventually read all of Stephenson's work, but I'm in no hurry. Jerry gave up on Quicksilver, I think, at some point, with some disparaging comments on pace, continuity and point. Which is why there's no hurry. I don't agree with some things Jerry says. For instance, he says that U.S. nuclear waste should be dropped down into the subduction zone in the Surigao deep, without compensation to the Philippine government, or even coordination or permission. That's more of the same thinking which brought about the Iraq tragedy (where U.S. intervention has killed more than half a million people and led to an exodus of the most productive parts of the Iraqi population). I'm aware that he thinks Iraq is a mistake, but that's just because the U.S. can't win in Iraq. In the Philippines he calculates that the U.S. can bully the government without much of a substantive cost, so he's willing to drop the nuclear waste in Philippine territory without payment.

Well, Hell no. If the U.S. wants to drop nuclear waste here, LET THEM PAY FOR IT. And ask for the privilege. It may be completely safe, but we don't care. PAY, or keep the poison on your own land.

Crap, this post got away from me and turned into an anti-US screed again. Ah well, never mind. The U.S. certainly deserves it.

Saturday, April 07, 2007

Greylisting -- bad

There's a short article discussing why greylisting is a bad thing.

I've always thought it wasn't much more than a stopgap. Marco says that his beef is the fact that email is no longer instantaneous because of the delays that SMTP greylisting injects. I agree that's important, but it's not that important to me since I never send anything urgent in email (I'm aware of greylisting and delays) and I usually let my mail sit and ferment a few hours before reading it. Immediacy is important to other people, but it's not a big deal for me.

The reason it's a stopgap is that eventually botnet spammer software and viruses will be modified to retry. When that happens, we'll be back where we started, with spam classifiers.
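To see why retrying defeats it, here is roughly what a greylisting check does, as a Python sketch. The triplet of client IP, envelope sender and envelope recipient is the usual key. The 300 second minimum delay and the function name are assumptions, and a real implementation would also expire old entries.

```python
def greylist_check(seen, triplet, now, min_delay=300):
    """Return "tempfail" for new or too-early senders, "accept" otherwise.

    seen maps (client_ip, sender, recipient) to the time first seen.
    """
    first_seen = seen.get(triplet)
    if first_seen is None:
        # never seen before: remember it and ask the sender to come back
        seen[triplet] = now
        return "tempfail"
    if now - first_seen < min_delay:
        # retried too quickly: still deferred
        return "tempfail"
    # a retry after the delay gets through, whoever sent it
    return "accept"
```

Nothing in there looks at the message itself. Any sender that queues and retries after the delay gets accepted, spammer or not.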

Likely, then, there'll be some new mode of speeding things up, perhaps by putting fast but not so good classifiers in the front of the queue, passing through anything that clearly isn't spam, and passing the possibles to the heavyweight classifiers like spamassassin (which I can't stand because it's so heavy, I use bogofilter instead, it's simple and I don't mind training it from the command line, but that really doesn't scale and there should be an easy way to train so that non-geeks can use it without trembling).
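The two-stage idea might look something like this sketch. `fast_score` and `heavy_is_spam` are stand-ins for a cheap classifier and a heavyweight one (they are not bogofilter's or spamassassin's actual interfaces), and the 0.2 cutoff is arbitrary.

```python
def classify(message, fast_score, heavy_is_spam, clearly_ham_below=0.2):
    """Deliver obvious ham straight away, send the rest to the slow classifier.

    fast_score(message) returns a spam probability between 0 and 1.
    heavy_is_spam(message) returns True or False and is assumed to be expensive.
    """
    if fast_score(message) < clearly_ham_below:
        # clearly not spam: skip the heavyweight check entirely
        return "deliver"
    return "reject" if heavy_is_spam(message) else "deliver"
```

The saving depends entirely on how much mail the fast stage can wave through. If most messages land in the "possible" range, the heavy classifier still sees nearly everything.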

Friday, April 06, 2007

Advice for young programmers

Jeremy Allison has good advice for his younger self, if he could travel back in time.

  • If it's not what you love, don't do it

  • Learn the architecture of the machine

  • Reputation is important

  • Proprietary environments are a trap

  • The network really *is* the computer

  • The community is more important than your employer

On an unrelated note, and a bit late for April 1, I was reading about how J Striegel fails the Turing Test. That's amusing in itself, but I was fantasizing about modifying gaim so that whenever it sees "cutie" or "sexy" it replaces the words with "moron" or similar. After all, what kind of people have cutie or sexy in their logins? This would be for my own amusement only. Hmmm, might be interesting to modify evolution the same way. If I receive mail from anyone with cutie or sexy in their email address, it's not like I'd actually want to reply to them, after all. And if I did, it would be a good thing if my mailer were to make any such attempts fail anyway.