Wednesday, January 26, 2005

Normal distribution

I had to write a program that needed to simulate random data that was normally distributed (bell curve distribution).

Fortunately, I was able to grab a copy of "Numerical Recipes in C" and found some code on p. 217. The code produces a normal distribution with a mean of 0 and a standard deviation of 1.
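For reference, here is a minimal sketch of the polar (Marsaglia) form of the Box-Muller transform, a common way to turn uniform random numbers into normally distributed ones. It is written in Python, it is not copied from the book, and I am only assuming it resembles the routine there. The name `standard_normal` and the `rng` parameter are made up for this sketch.

```python
import math
import random

def standard_normal(rng=random):
    """Return one normally distributed value with mean 0 and standard deviation 1.

    Hypothetical helper: `rng` is anything with a random() method
    that returns uniform floats in [0, 1).
    """
    while True:
        # Pick a point uniformly in the square [-1, 1) x [-1, 1).
        v1 = 2.0 * rng.random() - 1.0
        v2 = 2.0 * rng.random() - 1.0
        r = v1 * v1 + v2 * v2
        # Keep the point only if it is inside the unit circle and not the origin.
        if 0.0 < r < 1.0:
            break
    factor = math.sqrt(-2.0 * math.log(r) / r)
    # v1 * factor would be a second, independent value; this sketch discards it.
    return v2 * factor
```

Only the intermediate values `v1` and `v2` lie between -1.0 and 1.0. The returned value has no fixed bounds, although about 68% of results land within one standard deviation of 0.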

My program was in PHP, so I translated it from C (K&R C! Boy, that's old :)

The PHP source is here

Unfortunately, I'm not a mathematician and I'll need to find ways to adapt that code or find some other code where I can adjust kurtosis, standard deviation, etc.
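The mean and standard deviation, at least, turn out to be easy: multiply a mean-0, standard-deviation-1 value by the standard deviation you want and add the mean you want. Here is a small sketch in Python, using the standard library's `random.gauss` as the source of standard normal values. The function name `normal` and its parameters are made up for illustration.

```python
import random

def normal(mean, std_dev, rng=random):
    """Scale and shift a standard normal value to the requested mean and standard deviation."""
    # random.gauss(0.0, 1.0) returns a value with mean 0 and standard deviation 1.
    z = rng.gauss(0.0, 1.0)
    return mean + std_dev * z

# Example: simulated test scores centered on 100 with a standard deviation of 15.
scores = [normal(100.0, 15.0) for _ in range(5)]
```

Kurtosis is a different matter. It is the same for every normal distribution, so changing it would mean sampling from some other distribution rather than rescaling this one.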

I'd look in Knuth's Seminumerical Algorithms, but it might not be there. And anyway, my copy is in Mindanao, so I won't be able to refer to it until I go there in May.

Friday, January 21, 2005

Email from idiots is spam

I use gmail, and its spam marking feature is very nice, mostly because it's so easy to use.

Every once in a while I see vacation messages posted to mailing lists. Every single one of those I mark as spam in gmail. Partly I do that because people who don't know enough to set selective filters on their vacation messages are too dumb to listen to.

The gmail filter will learn from vacation messages which words score high as spam, and perhaps future vacation messages will be marked as spam so I'll see fewer of them. Also, the authors may start to score higher as spammers. That's a good thing too, for me. I'll see less of their mail, since it will automatically go to the spam mailbox. When I go in there to confirm which emails are spam, I get a chance to despam the ones that are important.

I don't think I've seen gmail do that yet, though (filter mainly on the sender's email address). I've seen Bob Reyes' spam about his hosting service end up in the spam mailbox, but that's just because the emails were spam, not because they were from him.