Why blog

When asked why he wanted to climb Mount Everest, the climber George Mallory answered:

Because it’s there.

This was a fantastic answer and I wish I had an equally original one-liner to answer the same question. My paltry answer is:
I blog because I blog.

Google Premium Adsense

I currently use Google AdSense and was wondering just how many hits a second I would need to be getting in order to apply for their Premium service.
You need to be getting 10 million content views per month. This is not the same as hits, because images, spider hits and so on don’t count.
Last month I got approximately 1,020,358 hits, which sounds impressive but isn’t that great, because it only translated into 102,636 content views according to Google AdSense’s own stats. So roughly every ten hits gives me one content view.
Using some advanced mathematics based on these very accurate assumptions 😉 that means I need approximately 100 million hits a month to achieve 10 million content views. This is, of course, pie in the sky.
That works out to about 40 hits per second. I doubt very much that my little server could withstand that kind of load 😉 Then again, if I were getting that many hits a month I could probably afford a whole rack full of equipment.
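For anyone who wants to check the back-of-an-envelope sums, here they are as a quick shell sketch. The one-content-view-per-ten-hits ratio and the 30-day month are my own rough assumptions, not anything official from Google.

# rough sketch of the sums above (assumes ~10 hits per content view and a 30-day month)
views_needed=10000000
hits_per_view=10
hits_needed=$((views_needed * hits_per_view))
echo "hits needed per month: $hits_needed"                        # 100000000
echo "hits needed per second: $((hits_needed / (30*24*3600)))"    # about 38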

Emotive entries

Is it just me, or do more emotional entries on blogs get more hits? I ask because of an entry I made just recently.
I’ve had more comments on that one topic than any other. In fact, it’s been the most popular topic on my blog for a couple of months. Admittedly this is not a huge achievement, but I have found that the more controversial or emotional the entry, the more comments it seems to collect.

FT Search Engine Spam

Just noticed the following on Jeremy Zawodny’s blog. This is a pretty shit way to make money.
Basically, moneysupermarket.com have paid the Financial Times (FT) to add some hidden links to their website. Of course, when Google finds the links it will assign a hefty amount of PageRank to the pages they point to (the FT has a PageRank of 8). I wonder how much they ask for this type of service.
Google’s Quality Guidelines specifically state:

Quality Guidelines – Specific recommendations:
* Avoid hidden text or hidden links.

I wonder what the penalty is (if any) for this sort of search engine spam. I don’t do it for fear of losing what little ranking I have with Google. I suppose I could report them, but I doubt anything would be done to the FT.

Original Story

Upgrading Woody to Sarge

Debian have just released sarge as the new stable distribution, which means I need to start upgrading my machines. Before upgrading the main machine, though, I decided to run through the process on a UML (User Mode Linux) machine first to see where the gotchas are.
Things to watch out for
1. The sshd config file gets an extra parameter added to it, i.e.
UsePAM yes
When I tested ssh after the upgrade I was unable to log into the machine, and I found that I had to remove this option or set it to “no” to get ssh working again (the one-liner I used is sketched just after this list). This surprised me because I was not expecting Debian to modify a config file without my knowledge.
2. The RECORD option is no longer valid
xinetd[5691]: Bad log_on_failure flag: RECORD [file=/etc/xinetd.conf] [line=13]
xinetd[5691]: A fatal error was encountered while parsing the default section. xinetd will exit.
xinetd[5691]: Exiting…
This prevents xinetd from starting up, which I also noticed while testing the machine after the upgrade. The fix is simply to drop RECORD from the log_on_failure line (included in the sketch after this list).
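For reference, here is roughly what I ran on the test machine to get both services going again. This is only a sketch that assumes the stock Debian paths, so back the files up before letting sed loose on them.

# fix 1: set UsePAM back to "no" so logins work again (keeps a .bak copy)
sed -i.bak 's/^UsePAM yes/UsePAM no/' /etc/ssh/sshd_config
/etc/init.d/ssh restart

# fix 2: drop the no-longer-valid RECORD flag from the log_on_failure line
sed -i.bak 's/\(log_on_failure.*\) RECORD/\1/' /etc/xinetd.conf
/etc/init.d/xinetd restart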
What follows is roughly the list of files the upgrade offered to update to new versions. Since I had modified most of these I selected the default option, which is “N”, i.e. do not upgrade to the maintainer’s version. This may have been the reason the ssh upgrade wasn’t too smooth. However, I would not recommend taking the maintainer’s version if you have customized the files (a quick way to check which files you have actually changed is sketched just after the list).
Configuration file `/etc/pam.d/login'
Configuration file `/etc/securetty'
Configuration file `/etc/pam.d/passwd'
Configuration file `/etc/bash.bashrc'
Configuration file `/etc/init.d/sysklogd'
Configuration file `/etc/services'
Configuration file `/etc/init.d/bind9'
Configuration file `/etc/bind/named.conf'
Configuration file `/etc/bind/db.root'
Configuration file `/etc/init.d/xinetd'
Configuration file `/etc/xinetd.conf'
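As promised, here is a quick way to see up front which conffiles you have actually changed, and therefore which prompts to expect. This is just a sketch and assumes the debsums package is installed; anything it prints is a file dpkg may stop and ask about.

# sketch: list modified conffiles before upgrading (assumes debsums is available)
apt-get install debsums
debsums -ce        # -c report changed files only, -e check conffiles only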
During the upgrade you may also be asked to add any users and groups that are missing from the default Debian set. The following bits are just the output of some other configuration options and warnings about things that have changed between woody and sarge.
Configuring ssh
Environment options on keys have been deprecated. This version of OpenSSH
disables the environment option for public keys by default, in order to avoid
certain attacks (for example, LD_PRELOAD). If you are using this option in an
authorized_keys file, beware that the keys in question will no longer work
until the option is removed. To re-enable this option, set
“PermitUserEnvironment yes” in /etc/ssh/sshd_config after the upgrade is
complete, taking note of the warning in the sshd_config(5) manual page.

Configuring man-db
This version of man-db is incompatible with your existing database of manual
page descriptions, so that database needs to be rebuilt. This may take some
time, depending on how many pages you have installed; it will happen in the
background, possibly slowing down the installation of other packages. If you do
not build the database now, it will be built the next time
/etc/cron.weekly/mandb runs, or you can do it yourself using ‘mandb -c’ as user
‘man’. In the meantime, the ‘whatis’ and ‘apropos’ commands will not be able to
display any output. Incompatible changes like this should happen rarely. Should
mandb build its database now?

So far I have upgraded two machines and had the same trouble with ssh and xinetd both times. If I encounter any more trouble I will add it here.

Blogs SpamAssassin and Trackbacks

I disabled the trackback facility on my blog months ago because I was getting a lot of trackback spam. Around the same time I wrote a SpamAssassin plugin for Movable Type: I effectively took MT-Blacklist’s regex database, converted it into a form compatible with SpamAssassin, and then wrote the plugin around it. Of course, at the time I had disabled trackbacks, so I only wrote it to handle comments, and it has been doing a great job; I get virtually zero blog spam now that the database is trained.
Now that I have turned the trackback facility back on, I have trackback spam to deal with again. This time I am not going to forget about it, so await an update: I will release another version that handles trackbacks as well.
As promised, this is the extended entry. I have now added trackback handling to the SpamAssassin plugin. It was easier than I thought; it took two hours to finish, and now all I need is someone to test it. As soon as I have packaged it up into a tarball I will release it.
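If anyone fancies kicking the tyres before the tarball appears, you can get a rough idea of what your local SpamAssassin setup makes of a trackback ping without going anywhere near Movable Type. This is only a sketch: trackback.txt is a made-up filename for a saved ping body, and the two headers are just padding to give spamassassin a mail-shaped input.

# sketch: score a saved trackback body against the local SpamAssassin rules
# (trackback.txt is a hypothetical file; the headers are fabricated padding)
( printf 'From: test@example.com\nSubject: trackback test\n\n'
  cat trackback.txt ) | spamassassin -t 2>/dev/null | grep '^X-Spam-Status:'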

df reports wrong size

I had a weird problem the other day when I seemed to be getting inconsistencies between du and df. The two commands were in disagreement about what the disk usage was on my box.
thing:/# du -hax --max-depth=1 /
104M total
thing:/# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 3.7G 3.4G 118M 97% /
The first thing I thought was that there was corruption in the partition table or some other terrible bug. I asked around and got no answers, so I went Googling. I came up with nothing, although there seemed to be plenty of people with a similar problem.
It then dawned on me what I might have done. I had originally created the Postgres database under /var/lib/postgres on the root file system. As the database got bigger and bigger I had to move it onto its own file system and mount it there. What I had forgotten to do was remove the old files from the root file system once I had confirmed the move was successful. This meant that 2.4GB of disk space had never been freed on the root file system. Because the new file system is mounted over /var/lib/postgres, the old files are hidden underneath the mount point: du only adds up the sizes of the files it can see, whereas df reports usage at the device level. So there were no bugs in this case, just simple human error.
thingthong:/home/harry# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 3.7G 1.1G 2.6G 31% /
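For what it’s worth, there is a way to spot files hiding underneath a mount point without unmounting anything: bind-mount the root file system somewhere temporary and run du against that view, since a plain bind mount does not carry the submounts with it. A quick sketch (/mnt/rootview is just a scratch directory I made up):

# sketch: peek underneath a mount point for forgotten files
mkdir -p /mnt/rootview
mount --bind / /mnt/rootview            # the bind mount does not include submounts
du -sh /mnt/rootview/var/lib/postgres   # the old files that df was counting show up here
umount /mnt/rootview
rmdir /mnt/rootview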

Spell Counterfeiting

Having read Bruce’s article on Belgium’s latest way of trying to reduce counterfeiting, I got the impression he’s not particularly impressed, and I don’t blame him.
Professionals will spot the spelling mistakes, particularly now that they know they should be looking for them. They might be thinking that using several foreign languages will fool the bilingual counterfeiter, but it’s hardly rocket science for the more accomplished crook to refer to Babelfish, a foreign dictionary or some other external reference… oh… or perhaps a real card!
It would appear that Belgium may be assuming its counterfeit operations are run by a bunch of half-wits who only speak “da wocal wanguage”, have no access to the internet and are not particularly well educated.
I suppose if every card had a semi-random spelling on it then it might introduce a bit more labour on the part of the counterfeiter, but that only works until they figure out how to scan it, and of course you’re then back to the same gimpy arms race.
I would love to see some stats on the success rate of this new idea.