http://www.hughes-family.org/bugzilla/show_bug.cgi?id=1785
------- Additional Comments From ***@pathname.com 2003-04-15 19:57 -------
Subject: Re: [SAdev] make DNS cache generic, change MX query tests to use DNS cache
I should note that I do have a local DNS cache (bind 9), but I suspect
there are a couple of issues:
1. It seems like some lame (or down?) server failures don't get cached at
all; I'm not sure why this is the case. For example: "host -v
208.229.236.14" *always* takes 10 seconds on my machine.
2. Some DNS blacklist lookups seem to take longer than others, and the
delay seems to depend more on the IP address being looked up than on
the DNS blacklist server.
I set up a small test bench of 250 recent spam and 250 recent ham
messages. I repeated the following test three times (once to prime my
cache and then twice to see how the cached version worked).
./mass-check --net -j 8 -f corpus.small
and I logged the number of seconds per RBL query in the SA DNS code.
Then I averaged the lookup time per IP address and also per RBL. The
per-RBL averages were fine, ranging from about 1 to 3 seconds.
However, in the per-IP-address version, a few addresses had VERY high
averages relative to the others.
Here are the top 20 lookup times (total time, number of lookups, average
time, *reversed* IP address):
run two:
5960 234 25.47 203.226.26.203
4012 3312 1.21 249.183.17.209
2173 1282 1.70 146.77.235.207
2081 1352 1.54 206.250.35.66
1989 78 25.50 248.226.26.203
1989 78 25.50 241.226.26.203
1989 78 25.50 178.226.26.203
975 39 25.00 79.216.44.207
751 504 1.49 173.76.146.129
688 351 1.96 12.179.185.208
626 585 1.07 15.35.17.212
514 351 1.46 19.134.144.129
499 473 1.05 207.163.71.64
402 273 1.47 43.98.18.192
376 195 1.93 44.240.36.66
370 234 1.58 181.89.141.203
340 216 1.57 51.14.68.9
323 180 1.79 240.211.103.216
312 273 1.14 26.229.108.195
303 195 1.55 45.1.146.129
run three:
5938 234 25.38 203.226.26.203
4243 3312 1.28 249.183.17.209
2094 1282 1.63 146.77.235.207
1989 78 25.50 241.226.26.203
1977 78 25.35 178.226.26.203
1971 78 25.27 248.226.26.203
1775 1352 1.31 206.250.35.66
975 39 25.00 79.216.44.207
654 585 1.12 15.35.17.212
653 504 1.30 173.76.146.129
615 351 1.75 12.179.185.208
556 473 1.18 207.163.71.64
461 351 1.31 19.134.144.129
423 234 1.81 181.89.141.203
392 216 1.81 51.14.68.9
391 180 2.17 9.99.120.67
390 156 2.50 240.28.13.206
367 220 1.67 2.29.209.63
337 216 1.56 55.183.195.128
328 273 1.20 43.98.18.192
Note that there are currently 39 blacklists, since a ton are being tested.
The really bad addresses, the ones with 25-second averages, consistently
take about 25 seconds per lookup.
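
The per-address and per-RBL totals above could be produced from a timing log along the following lines. This is only a sketch: the log format ("<seconds> <reversed-ip> <rbl-zone>" per line) and the names summarize() and format_rows() are assumptions for illustration, not what the SA DNS code actually writes.

```python
from collections import defaultdict


def summarize(log_lines, key_index):
    """Aggregate lookup times by one column of a hypothetical timing log.

    Each line is assumed to look like "<seconds> <reversed-ip> <rbl-zone>".
    key_index selects the grouping column: 1 for address, 2 for RBL zone.
    Returns (total, count, average, key) tuples sorted by total, descending.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [total seconds, lookups]
    for line in log_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip malformed lines
        try:
            seconds = float(fields[0])
        except ValueError:
            continue  # skip lines whose first field is not a number
        entry = totals[fields[key_index]]
        entry[0] += seconds
        entry[1] += 1
    rows = [(total, count, total / count, key)
            for key, (total, count) in totals.items()]
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


def format_rows(rows, limit=20):
    """Render rows as: total time, number of lookups, average time, key."""
    return [f"{total:.0f} {count} {avg:.2f} {key}"
            for total, count, avg, key in rows[:limit]]
```

Calling summarize(lines, 1) groups by reversed address and summarize(lines, 2) groups by blacklist zone, which is the comparison that exposes whether slowness follows the address or the blacklist.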
b***@hughes-family.org wrote:
>> This is somewhat contradictory with the idea that we need backgrounded
>> DNS lookups. I'm not sure what to think, but I assume we had a
>> foregrounded version at one point, right?
>
> We *did* a long time ago, but Marc Merlin added the bgsend() code -- i.e.
> we already have backgrounded lookups, as far as I know. For RBL tests
> anyway.

I *meant* to question whether backgrounded lookups really help performance
or hurt it. I suspect they help, but I'm not 100% sure about that. Maybe
the code just needs some refinement.
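
To make the foreground/background question concrete, here is a small sketch that compares the two styles. It uses a Python thread pool as a stand-in and is not the bgsend() code itself; the function names and the injected resolve callable are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def lookup_sequential(names, resolve):
    """Foregrounded style: each query blocks until its answer arrives."""
    return {name: resolve(name) for name in names}


def lookup_backgrounded(names, resolve, max_workers=8):
    """Backgrounded style: start every query, then collect the answers.

    resolve is any callable taking a query name; a thread pool is used here
    only as a stand-in for asynchronous DNS sends.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(resolve, name) for name in names}
        # Other (non-network) work could run here while queries are in flight.
        return {name: future.result() for name, future in futures.items()}
```

With independent queries, the backgrounded version should take roughly as long as the slowest query rather than the sum of all of them, so a single 25-second lookup still sets the floor unless a timeout cuts it off.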

> Not sure if we need bg lookups for lookup_ptr(), lookup_mx etc., since I
> would guess any decent nameserver will already have looked up a lot of
> that data anyway and loaded the glue records, so they should be quite
> fast. Caching that data is probably helpful though.

I've done a few simple experiments with caching A and TXT records in the
current code and it doesn't seem to help at all, although my experiments
only cached successful lookups, not failed ones.

> Big spamd machines may gain a benefit from a local cache of DNS results.
> But big spamd machines should have a local DNS cache running anyway,
> in the form of a caching nameserver, I should think! So I'm unsure
> if there's really a need to add more code to SpamAssassin there...

I doubt we want to add our own DNS cache, but I'm wondering about some
short-term caching of failed lookups now.
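
A minimal sketch of what short-term caching of failed lookups might look like follows. The class name, the 300-second default, and the choice to treat OSError as "lookup failed" are all assumptions for illustration; nothing here reflects existing SpamAssassin code.

```python
import time


class NegativeCache:
    """Remember query names that recently failed, for a short time only."""

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self.ttl = ttl        # seconds to remember a failure (assumed value)
        self.clock = clock    # injectable so the expiry logic can be tested
        self._failed = {}     # query name -> expiry timestamp

    def is_known_failure(self, name):
        expiry = self._failed.get(name)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._failed[name]  # entry expired; allow a retry
            return False
        return True

    def record_failure(self, name):
        self._failed[name] = self.clock() + self.ttl


def cached_lookup(name, resolve, cache):
    """Call resolve(name) unless name failed recently; return None on failure."""
    if cache.is_known_failure(name):
        return None  # skip the slow query entirely
    try:
        return resolve(name)
    except OSError:  # socket timeouts and resolver errors derive from OSError
        cache.record_failure(name)
        return None
```

The point of the short expiry is that an address that costs 25 seconds per query would be paid for once per interval instead of once per blacklist per message, while a server that recovers is retried soon afterwards.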