Chrooting Squid, Apache and Perl

Chrooting Squid, Apache and Perl is fairly straightforward.
You will need to be able to use the following commands with some confidence:
ldd
strace
rsync
cp
Tips: when copying files, make sure your umask is set to 022 and alias cp as follows, so that permissions, ownership and timestamps are preserved:
alias cp="cp -p"
If you are copying over any Perl XS files, i.e. *.so files, make sure you also run ldd on these. As an example, the PostgreSQL driver requires:
ldd usr/lib/perl5/auto/DBD/Pg/Pg.so
libpq.so.3 => /usr/lib/libpq.so.3 (0xb7fbf000)
libc.so.6 => /lib/tls/libc.so.6 (0xb7e89000)
libssl.so.0.9.7 => /usr/lib/i686/cmov/libssl.so.0.9.7 (0xb7e58000)
libcrypto.so.0.9.7 => /usr/lib/i686/cmov/libcrypto.so.0.9.7 (0xb7d59000)
libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0xb7cf1000)
libcrypt.so.1 => /lib/tls/libcrypt.so.1 (0xb7cc4000)
libresolv.so.2 => /lib/tls/libresolv.so.2 (0xb7cb2000)
libnsl.so.1 => /lib/tls/libnsl.so.1 (0xb7c9d000)
libpthread.so.0 => /lib/tls/libpthread.so.0 (0xb7c8e000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000)
libdl.so.2 => /lib/tls/libdl.so.2 (0xb7c8b000)
libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0xb7c68000)
libcom_err.so.2 => /lib/libcom_err.so.2 (0xb7c65000)
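Output like the above can be turned into a list of files to copy. The following is only a rough sketch: the function names `ldd_paths` and `copy_libs` are made up for illustration, and it assumes ldd prints the `name => /path (address)` layout shown above.

```shell
#!/bin/sh
# ldd_paths: read ldd-style output on stdin and print each absolute
# library path once. Entries with no path (e.g. "not found") are skipped.
ldd_paths() {
    awk '
        $2 == "=>" && $3 ~ /^\// { print $3 }   # name => /path (addr)
        $2 != "=>" && $1 ~ /^\// { print $1 }   # /path (addr), e.g. the loader
    ' | sort -u
}

# copy_libs FILE CHROOT_DIR (hypothetical helper): copy every library
# that FILE needs into CHROOT_DIR, keeping the directory layout and
# using cp -p to preserve permissions.
copy_libs() {
    file=$1
    chroot_dir=$2
    ldd "$file" | ldd_paths | while read -r lib; do
        mkdir -p "$chroot_dir$(dirname "$lib")"
        cp -p "$lib" "$chroot_dir$lib"
    done
}
```

Something like `copy_libs usr/lib/perl5/auto/DBD/Pg/Pg.so /chroot_directory_name` would then pull in the libraries listed above. Check the result by hand, as ldd only reports libraries linked at build time and not anything loaded later at run time.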
A quick way to find your shared object files is as follows.
find /chroot_directory_name/usr/ | grep perl | grep '.*\.so$'
You will already have copied most of the shared object files over while copying Squid and Apache, but there are most likely a few extra ones you are going to need, in particular if you are using the DBI.
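To see which of those extra libraries are still missing, the find and ldd steps can be combined. This is a sketch under assumptions: `perl_so_files` and `missing_in_chroot` are hypothetical names, and `/chroot_directory_name` is the same placeholder used above.

```shell
#!/bin/sh
# perl_so_files CHROOT_DIR: list Perl shared objects below CHROOT_DIR/usr
perl_so_files() {
    find "$1/usr" -path '*perl*' -name '*.so' -type f
}

# missing_in_chroot CHROOT_DIR: read absolute paths on stdin and print
# the ones that do not yet exist inside CHROOT_DIR
missing_in_chroot() {
    chroot_dir=$1
    while read -r path; do
        [ -e "$chroot_dir$path" ] || printf '%s\n' "$path"
    done
}

# Example run (left commented out so nothing happens on load):
# perl_so_files /chroot_directory_name | while read -r so; do
#     ldd "$so" | awk '$3 ~ /^\// { print $3 }'
# done | sort -u | missing_in_chroot /chroot_directory_name
```

Anything this prints is a library that some Perl module links against but that has not been copied into the chroot yet.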

perl -d:DProf

I have been running a simple search engine tool on UKlug and I have noticed that things are getting a bit sluggish due to the number of jobs in the database (300K+). It’s not an astronomical number, but the method I am using is starting to strain against the hardware. I am going to rewrite it (article for another day), but for now is there anything I could do to speed things up?
When something just isn’t running as fast as expected, it’s time to break out the Perl profiler. The search engine has a mod_perl front end, which is the first pain in the ass. I am fully conversant with the mod_perl performance tuning guide, but trying to profile mod_perl is not as straightforward as the guide suggests.
Luckily I always use modules for the bulk of the work in any CGI scripts, so I created a mock script to call out to the modules and then ran the profiler against this as a standalone program.

$ perl -d:DProf mock_script.pl

This confirmed my suspicion that the main problem was database access. There are a couple of Perl functions that could be faster, but tuning these when the database is such a bottleneck would be an exercise in futility. I know I have tuned the database to a point where it is not going to get any faster, so everything is pointing at either a faster machine or a rewrite.
It just so happens that I have a faster machine to hand, so I ran the offending SQL on both with timing switched on and got the following times.
Slow machine:
Time: 3003.434 ms
Fast machine:
Time: 1683.190 ms
This is a marked improvement over the slower machine, but it is still a hellish time to wait for results that have yet to be displayed. So how can I reduce the time taken to retrieve the results? More to follow.

Who’s searching on what

I noticed that someone had a look for gimpy on my blog today, and I was wondering what terms people are finding my site with, so I ran the following over my logs:
perl -ne 'print "$1\n" if /google.*?&q=([^&"]+)/' *.log | sort | uniq
I am sure there is a shorter and better way to do it but this was more than enough to have a quick look.
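One possible variation, sketched here with standard tools, also counts how often each term turns up. The function name `search_terms` is made up for illustration, and it assumes the Google referrer URL appears in each log line with the query in a `q=` parameter.

```shell
#!/bin/sh
# search_terms: read access-log lines on stdin and print each Google
# search term with the number of times it appears, most frequent first.
search_terms() {
    grep 'google' \
        | sed -n 's/.*[?&]q=\([^&" ]\{1,\}\).*/\1/p' \
        | sort | uniq -c | sort -rn
}

# Example (commented out): cat *.log | search_terms | head -20
```

The terms come out still URL-encoded (spaces as `+` and so on), which is fine for a quick look.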

Tie::RDBM::Cached

I wrote a Perl module several months ago. It’s entirely based on one of Lincoln D. Stein’s modules, though I doubt my code is as neat as his would have been. I have to admit I am glad he did all the heavy lifting, otherwise it might never have been released (not sure if this is a good or a bad thing).
Anyway, I completely forgot I had written the damned thing until Jenny mentioned it the other night. This was partly due to me moving PCs and having several other things that have occupied me for a while (several months). After a whole ten seconds wondering if I had actually checked it before release, I decided to upload it, otherwise I would have left it another several months. There are errors in it, of that I am sure, but it’s out there and if one person finds it useful I’ll be happy.