Thursday, August 31, 2006

Erlang concurrency discussion

Joe Armstrong has a good and interesting discussion of concurrency in Erlang. One point being that if concurrent processes don't share memory but only send messages to each other, concurrency becomes easy and robust. Certainly, it's race conditions and incorrect locking of shared resources that make concurrency difficult in languages that do share memory.
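Just to make the share-nothing idea concrete to myself: even in PHP (of all things) you can fake the style with separate processes and a message queue. This is only a sketch, assuming the pcntl and sysvmsg extensions on the CLI, and it's obviously nothing like real Erlang, but the parent and child share no memory at all; they only pass messages.

    <?php
    // Share-nothing concurrency sketch: two processes, one message queue.
    // Requires the pcntl and sysvmsg extensions (CLI only).
    $queue = msg_get_queue(ftok(__FILE__, 'q'));

    $pid = pcntl_fork();
    if ($pid == 0) {
        // Child process: shares no memory with the parent.
        msg_send($queue, 1, "hello from child " . getmypid());
        exit(0);
    }

    // Parent: block until a type-1 message arrives.
    msg_receive($queue, 1, $msgtype, 1024, $message);
    echo "parent received: $message\n";

    pcntl_waitpid($pid, $status);
    msg_remove_queue($queue);
    ?>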

Recently too, Joel had a discussion of MapReduce which, apparently, is something Google uses to parallelize massively. That's related to Armstrong's article, since the concurrent processes don't share memory either; they just work independently, allowing massive concurrency that scales pretty much linearly with the resources available.
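The canonical example is counting words. Here's a toy sketch of the map/reduce shape in PHP (nothing like Google's actual C++ API, and the function names are made up; the shuffle/group-by-key step is folded into the reduce here). The point is that each map call shares nothing with the others, so every one of them could run on a different machine.

    <?php
    // Map: turn one document into a list of (word, 1) pairs.
    function map_words($document)
    {
        $pairs = array();
        $words = preg_split('/\W+/', strtolower($document), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $word) {
            $pairs[] = array($word, 1);
        }
        return $pairs;
    }

    // Reduce: sum the counts per word (grouping folded in for brevity).
    function reduce_counts($pairs)
    {
        $counts = array();
        foreach ($pairs as $pair) {
            list($word, $n) = $pair;
            $counts[$word] = isset($counts[$word]) ? $counts[$word] + $n : $n;
        }
        return $counts;
    }

    $pairs = array_merge(map_words("the quick fox"), map_words("the lazy dog"));
    print_r(reduce_counts($pairs));
    ?>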

I saw a blog post that said that Google doesn't work that way (and that discussed how MapReduce actually works in terms of, I think, the C++ API). I've lost that link though. I don't know if I ever bookmarked it, and if I did, well, I work on around 5 computers. So it's somewhere, but I don't know where. I could use Google Browser Sync, I guess. But I'm not convinced. I'm a bit paranoid about it (mainly about keeping passwords in there). I'm told that that's optional (a feature I'd expect from such an excellent company as Google, certainly). But I'm still thinking about it. Oh, I'm sure I'll capitulate, eventually. But there's going to be at least one good night's sleep before I do so.

Alright, and that's quite enough randomness already.

Monday, August 28, 2006

Libraries, books, Wow

I don't quite care for the title, but yeah, those *are* lovely pictures of books and libraries and stacks. I would love to visit each of those at least once. For a year each :-).

Sunday, August 27, 2006

per-site color preferences in mozilla would be nice

I was surfing over to phpPatterns and I couldn't read it. It was in some sort of white-on-green color scheme that was hard for me to read. At least it wasn't white (or light yellow) on black. But it's still not pleasant to read.

On the other hand, I *really* want to be visiting there regularly. I didn't want to have a separate firefox profile just to view that one site (or those sites which are hard to read due to their color schemes). I might still do that, but I'd rather avoid it if possible. So I thought I'd just go into Preferences|Content|Colors and force all pages to display in black on white.

That works OK. It messes with the tabbed style on, for instance, blogger, so that the line between the editor form and the Edit Html and Compose tabs isn't there. I'm sure it messes with lots of other stylish pages too. I think I'm liking this though. I'll stick with it. If I start missing the styles on other pages I may go back, and then I guess I'll read phpPatterns in my RSS reader instead. I don't think that works either, though, since I'd need to follow links, and those links wouldn't be in bloglines but would go straight to phpPatterns, and the problem re-emerges.

It would be nice if site color preferences in firefox were configurable per-site. So that when someone's page goes unreadable, I can force the colors, but just for that site.
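Come to think of it, something like this might already be possible with a user stylesheet. I haven't tried it, and I'm going from half-remembered docs here, but if the @-moz-document rule works in current Gecko builds the way I think it does, a few lines in the profile's chrome/userContent.css should force colors for just the one domain:

    /* In <profile>/chrome/userContent.css. Assumes @-moz-document is
       supported by this build; forces readable colors on one site only. */
    @-moz-document domain(phppatterns.com) {
        body, p, div, td {
            background-color: white !important;
            color: black !important;
        }
    }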

Thursday, August 24, 2006

Re-engineering: Assertions versus Unit Tests

I'm working on re-engineering a whole lot of badly designed code at work. The code has been mostly working for 4 to 5 years (some of it seems to have been working since 2000 even, if I can go by the author:date comment at the top). Part of the problem is that many things have changed in the business since any particular piece of code was written. Another part is that the database design was optimized for uploading (easy to identify all the new records for the day to be uploaded), but it is incredibly anti-optimized for querying. There are around 24 basic transactional tables (it could probably be whittled down to 12-15, but that's another issue; related things should be together in one table), and every day an additional 24 are created with the date (in mmddyyyy format, not even yyyymmdd) tacked on.

So creating a multi-day report involves running the same query over multiple tables and processing the data on the client side, or it involves a union over many tables.
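To give the flavor of it, here's roughly what the union approach looks like. This is only a sketch; the base table name and the helper function are made up, but the mmddyyyy suffix scheme is the real one:

    <?php
    // Build a UNION ALL over per-day tables, e.g. sales_08212006,
    // sales_08222006, ... (hypothetical base name, real suffix scheme).
    function multi_day_query($base, $columns, $start_ts, $end_ts)
    {
        $parts = array();
        for ($t = $start_ts; $t <= $end_ts; $t += 86400) {
            $table = $base . '_' . date('mdY', $t);  // mmddyyyy suffix
            $parts[] = "SELECT $columns FROM $table";
        }
        return implode("\nUNION ALL\n", $parts);
    }

    $sql = multi_day_query('sales', 'txn_id, amount',
                           strtotime('2006-08-21'), strtotime('2006-08-25'));
    ?>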

The first part of the re-engineering involves creating suitable abstractions over this mess (and no, it's not fixing the table structures to be more sane, that comes later; there's too much other code that depends on the insanity). The abstractions will allow us to provide a more reasonable view of the data (hiding the database details) to higher (reporting) levels. This comes at some cost in CPU processing though. We use PHP, and it's just not very efficient when shoveling a lot of data around while transforming it.

I'm working on basic classes that will interact with the database (everything else interacts with these classes; no higher-level code will touch the database directly). I'm finding a problem though. The code aggressively checks parameters and, for anything complex, invariants inside the methods. Checking is done through assertions. At the same time though, I developed a UnitTest class (a simpler replacement for PHPUnit and similar). And when I develop unit tests, I find myself wanting to write tests that make sure the assertions are working. But the assertions can't be tested if assert.bail=1, and the assertions would get much longer if I wanted them to return failure codes to the unit testing framework (assertions should be simple and just make the program fail immediately).
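For concreteness, the style I mean is something like this (the class and its checks are made up for illustration; the real classes are messier). With assert.bail=1, the first check that fails kills the whole test run:

    <?php
    // Paranoid parameter checking on method entry (illustrative only).
    // String-form assertions let an assert callback report the expression.
    class DailyTable
    {
        function fetchDay($date)
        {
            assert('is_string($date)');
            assert('preg_match("/^\d{8}$/", $date)');  // mmddyyyy
            // ... build and run the query against that day's table ...
        }
    }
    ?>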

I would also like to know *which* assertion failed (my parameter checking is pretty paranoid, so there are many checks upon method entry even if there's only one parameter; there may even be checks for object validity when there are no parameters at all).
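One thing I may try: PHP lets you register an assertion callback, and when the assertion is given as a string, the callback is told the file, the line, and the expression that failed. Something like this (the handler name is mine) would at least answer the *which* question:

    <?php
    // Report exactly which assertion failed, and where.
    function assert_reporter($file, $line, $code)
    {
        echo "Assertion failed at $file:$line: $code\n";
    }
    assert_options(ASSERT_CALLBACK, 'assert_reporter');
    ?>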

ifdef-ing (well, if(defined(...))-ing) the assertions when unit testing doesn't seem to be the right thing to do either. But I haven't really thought that through yet. Maybe I'll try that and see what falls out.
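If I do go down that road, it would probably look less like wrapping every assert and more like having the test harness flip the assert options globally. A rough sketch (UNIT_TESTING is a hypothetical flag my test runner would define):

    <?php
    // Hypothetical: the unit test runner defines UNIT_TESTING, and we
    // stop bailing (or disable the checks entirely) for the test run.
    if (defined('UNIT_TESTING')) {
        assert_options(ASSERT_BAIL, 0);    // keep going after a failure
        // assert_options(ASSERT_ACTIVE, 0); // or turn the checks off
    }
    ?>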

I wonder how others have resolved assert versus unit test (actually, assert versus the goal of complete test coverage, or close to it). Hmmm, I'll google that over the weekend, maybe.

Saturday, August 19, 2006

Jerry Pournelle's Byte column -- moved

It's too bad that Jerry Pournelle's contract with Byte was not renewed. I've always enjoyed his columns (well, back when I could read them in dead-tree Byte, and for a while, in free online Byte). It's good to see that he's keeping the format though, and maintaining the site for himself at Chaos Manor Reviews.

I'd like to be able to subscribe, but that's not possible (the subscription fee is not inconsequential in the third world, even on my pretty good IT salary). I don't have a credit card, and even if I did, I'm not sure the transaction would be honored. Philippine credit card transactions are tricky.

I do hope that he keeps enough subscribers to make maintaining the whole Chaos Manor system worth his while. There are always good discussions there. I don't go there as often as I used to (due to Digg and Reddit), but it's always educational to swing by every few weeks to see what's being discussed.

Sunday, August 13, 2006

ubuntu wifi

I'm at Bo's coffee club in the Robinson's Galleria mall. I would put links here but frankly, I can't find the official websites with some very half-hearted googling. I'm running ubuntu on a toshiba laptop with an Atheros AR5212 802.11abg wifi card, and ubuntu detected the wifi card flawlessly.

I also installed ubuntu on a winbook laptop with an Intel wifi chipset (it was a 1.5GHz Centrino system) and that was detected directly too. I'm not sure how ubuntu did that (with the Intel system), since the Intel firmware is required. I wasn't paying very much attention, just installing stuff and downloading packages from the network. Maybe it auto-downloaded the firmware upon detection. Or maybe it came with EasyUbuntu.

That probably cements my transition to ubuntu from Mandriva. The main factor in the transition is the rate of development. Ubuntu had working svk packages long before Mandriva did (I never could get svk working correctly with Mandriva, although the packages are there). And I never could get wifi working with Mandriva (although, to be fair, that was me being lazy; if I were younger and had more time and patience, I'd have figured it out long ago).

I never have been a linux fanboy. I'm a pragmatist. Whatever works better, that's what I use. Windows doesn't work for me since it's impossible to keep secure and, frankly, I can't afford the MS-Office license (windows itself I could afford, barely). Now, I *do* get a valid and legal MS-Office license with the laptops I purchase from the U.S., but then they have that pesky "legal only in the U.S.A." clause, so even if they're legal there, I can't use them in the rest of the world. At one point RedHat was the distribution to use (and before that, slackware). But now RedHat is moving in the direction of enterprise systems and I don't want to be in Fedora (where I would need to be if I wanted svk, etc).

I moved to Mandrake/Mandriva because of the superior package management, but now the speed of development isn't quite right for me. Ubuntu feels right. Although I wouldn't be surprised if I were to move on to something a bit faster later on.

Ah, but this post was not supposed to be about reminiscence. Instead, I was going to celebrate finally getting wifi working. I'm sitting in Bo's coffee club and for the price of a Cafe Latte I'm able to connect to Robinson's free wifi. The DNS is very slow (one reason why I don't have links above, the other being mentioned already), but surfing to this or that site, once it's been resolved, is pretty good. The DHCP server provides two DNS server IPs (the first of which does not reply to ping). That's a bit incompetent. But removing it from the list doesn't speed up DNS queries much (or at all, as far as I can tell).

Connection to my office VPN is pretty fast (faster than at home), so I guess they use either meridian or PLDT (my home connection via Destiny Cable Internet takes a roundtrip overseas, so the VPN connection is reasonable, but not this fast).

OK. nmap -v says that the first DNS server *is* up, just not replying to ping.

I'm very happy to get wifi up and running, finally. Previously I didn't care very much. I still don't, really, since fast internet is available at work and at home. But it's a great convenience to be able to connect to wifi networks.

Next thing to learn, I guess, is how to connect to closed networks. It's been 4 years since I last had anything to do with that, and it was a pain then (even with windows). I'm sure it's much easier now. I haven't done it at all yet, though. So there's going to be some learning involved.