Monday, July 26, 2010

JMeter Drupal Proxy URLS to exclude

I often use JMeter to load test Drupal websites. One of the first things I need to do is capture a sample browsing session on the site using the JMeter proxy.

When I'm capturing a sample browsing session I usually don't want to grab all the embedded files, since that makes for a very large set of HTTP requests in the thread group. At this point I want the thread group to contain just the top-level URLs I actually clicked on, but with "Retrieve All Embedded Resources" checked on each individual entry.

That will increase the CPU load on the JMeter instances at runtime (they need to parse each downloaded page to extract the resources). I'm happy to make that trade for now. If it becomes a problem I'll switch to having the embedded resources pre-extracted at proxy capture time, but for most JMeter jobs I've done I haven't had to worry much about test-time CPU load.

I always forget what the URL exclude patterns should look like. This is posted so I'll find it later.

Drupal sometimes adds GET parameters to URLs even for "static" resources such as css or png files. I haven't gone through to figure out which resources can have GET parameters added to them. Instead, when excluding embedded/static resources I just treat them all the same way, allowing an optional query string after the extension:

.*\.gif(\?.*|)
.*\.jpg(\?.*|)
.*\.png(\?.*|)
.*\.css(\?.*|)
.*\.js(\?.*|)

etc.
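
Before pasting the patterns into the proxy I find it handy to sanity-check them against a few URLs. The following is only a sketch: it uses Python's re module rather than JMeter's own regex engine, assumes the pattern has to match the whole URL (hence fullmatch), and the example.com URLs and the is_excluded helper are made up for illustration.

```python
import re

# The exclude patterns from the list above, copied unchanged.
EXCLUDE_PATTERNS = [
    r".*\.gif(\?.*|)",
    r".*\.jpg(\?.*|)",
    r".*\.png(\?.*|)",
    r".*\.css(\?.*|)",
    r".*\.js(\?.*|)",
]

_COMPILED = [re.compile(p) for p in EXCLUDE_PATTERNS]


def is_excluded(url):
    """Return True if any exclude pattern matches the whole URL.

    Assumption: the proxy applies each pattern to the complete URL,
    which is why fullmatch is used instead of search.
    """
    return any(p.fullmatch(url) is not None for p in _COMPILED)


if __name__ == "__main__":
    # Hypothetical URLs, only for illustration.
    for url in [
        "http://example.com/node/1",
        "http://example.com/misc/drupal.js?3",
        "http://example.com/sites/default/files/logo.png",
    ]:
        print(url, "->", "excluded" if is_excluded(url) else "kept")
```

The "(\?.*|)" tail is what lets a trailing query string through: it matches either "?" followed by anything, or nothing at all.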

Thursday, July 22, 2010

CTEs for clarity (no efficiency gain here)

Some messages are sent to two kannels. I've got the essential data in a PostgreSQL table, and I wanted to find the messages which were sent to both kannels within 5 seconds of each other (most such duplicated messages are sent within the same second, or within 1 second of each other).

The query could have been done without CTEs (using subqueries) but I prefer the CTEs since they move the subqueries "out" of the select statement, making the select much easier to read.

/* set up the CTEs although they're not really common except in the sense that they're the same statement, I'm just using them as *table*expressions* :-) */
WITH lhs AS
(
select id,kannel,tstamp,dest,msg_text from decmtmo WHERE mt_mo='mt'
), rhs as
(
select id,kannel,tstamp,dest,msg_text from decmtmo WHERE mt_mo='mt'
)
SELECT lhs.id lid, rhs.id rid, abs(extract('epoch' from lhs.tstamp - rhs.tstamp)),
       lhs.kannel lk, rhs.kannel rk, rhs.dest, trim(rhs.msg_text)
FROM lhs, rhs /* this is what improved, otherwise we'd have the subselects here */
WHERE lhs.id <> rhs.id /* make sure we don't look at the same row on both sides */
  AND lhs.dest = rhs.dest AND lhs.msg_text = rhs.msg_text /* MT identity */
  AND lhs.kannel <> rhs.kannel /* but different kannels */
  AND lhs.id > rhs.id /* avoid showing two copies of the same row, with lhs and rhs swapped */
  AND 5 > abs(extract('epoch' from lhs.tstamp - rhs.tstamp)) /* within 5 seconds of each other */
ORDER BY lhs.id, rhs.id
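
To try the shape of this query without a PostgreSQL instance, here is a rough sketch using Python's built-in sqlite3 module. Everything in it beyond the query structure is an assumption: SQLite stands in for PostgreSQL, tstamp is stored as integer epoch seconds (so a plain subtraction replaces the extract('epoch' ...) expression), the sample rows are invented, and find_duplicates is a hypothetical helper name. The lhs.id <> rhs.id condition is left out because lhs.id > rhs.id already implies it.

```python
import sqlite3

# Invented sample rows: (id, kannel, tstamp, dest, msg_text, mt_mo).
# tstamp is integer epoch seconds here, unlike the real table.
SAMPLE_ROWS = [
    (1, "kannel_a", 1000, "5550001", "hello", "mt"),
    (2, "kannel_b", 1001, "5550001", "hello", "mt"),  # duplicate of 1, 1s later
    (3, "kannel_a", 1002, "5550002", "hello", "mt"),  # different destination
    (4, "kannel_b", 1100, "5550001", "hello", "mt"),  # same message, 100s later
    (5, "kannel_a", 1001, "5550001", "hello", "mo"),  # not an MT, filtered out
    (6, "kannel_a", 1003, "5550001", "hello", "mt"),  # duplicate of 2, 2s later
]

QUERY = """
WITH lhs AS (
    SELECT id, kannel, tstamp, dest, msg_text FROM decmtmo WHERE mt_mo = 'mt'
), rhs AS (
    SELECT id, kannel, tstamp, dest, msg_text FROM decmtmo WHERE mt_mo = 'mt'
)
SELECT lhs.id, rhs.id, abs(lhs.tstamp - rhs.tstamp)
FROM lhs, rhs
WHERE lhs.dest = rhs.dest AND lhs.msg_text = rhs.msg_text  -- MT identity
  AND lhs.kannel <> rhs.kannel                             -- different kannels
  AND lhs.id > rhs.id                                      -- one row per pair
  AND abs(lhs.tstamp - rhs.tstamp) < ?                     -- within the window
ORDER BY lhs.id, rhs.id
"""


def find_duplicates(rows, window_secs=5):
    """Load rows into an in-memory table and return the duplicated pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE decmtmo (id INTEGER PRIMARY KEY, kannel TEXT, "
            "tstamp INTEGER, dest TEXT, msg_text TEXT, mt_mo TEXT)"
        )
        conn.executemany("INSERT INTO decmtmo VALUES (?, ?, ?, ?, ?, ?)", rows)
        return conn.execute(QUERY, (window_secs,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for lid, rid, delta in find_duplicates(SAMPLE_ROWS):
        print(f"ids {lid} and {rid} are {delta}s apart")
```

With these sample rows the pairs found are (2, 1) and (6, 2); row 4 is too far away in time, row 3 goes to a different destination, and row 5 is not an MT.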