SCSI or SATA

There seems to be a bit of confusion about which type of hard drive people should be using.
Some will disagree with this and say they use SATA in a server or SCSI at home. If that is you then you probably know what you are doing, so reading this is a bit moot.
If you are a home user then use SATA. Spending megabucks on SCSI would be a waste of your money and you are quite unlikely to see the benefits of it over SATA. SATA is fast enough for everything I have needed at home, and more.
If you are running a server that is on 24/7, you expect it to remain that way for a very long time and you don’t want to be called out at 4am to replace a disk, use SCSI.

ia_archiver DOS attack

I had to send Alexa the following message tonight after suffering a denial of service attack from their crawler. They have been hitting me about 5 times a second for quite a while now and it does not look as if it is going to ease up.

Your ia_archiver robot is hitting my website 5 times a second.
www.uklug.co.uk
This is crazy, please stop it now. I have now added it to the robots.txt file. Don’t you know that what you are doing to my server is against the law? I can hardly use it at all. Get someone in the tech department to fire the idiot who wrote/runs that spider, they’re clueless.

I blocked them using mod_rewrite and my advice to anyone who has more than a few pages would be to do the same. They are not playing the same ball game as Google and Yahoo; they obviously don’t give a damn about the people on the receiving end of these attacks. Put the following in your .htaccess file.
RewriteCond %{HTTP_USER_AGENT} ^.*ia_archiver.* [NC]
RewriteRule ^.*$ - [F]
They recommend adding the following to your robots.txt file.
User-agent: ia_archiver
Disallow: /
I have heard that it is sometimes ignored though, so my advice is to block them for good using mod_rewrite. They are only leeching the data for their own gain anyway. At least Google forwards you some traffic.
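Before blocking a crawler it is worth confirming the hit rate from the access log. The following is only a rough sketch: it assumes an Apache log in the common or combined format (timestamp in the fourth whitespace-separated field), and the function name hits_per_second is made up for illustration.

```shell
#!/bin/sh
# hits_per_second AGENT LOGFILE
# Hypothetical helper: prints "COUNT TIMESTAMP" for every second in which
# AGENT made at least one request, busiest second first.
# Assumes the timestamp is field 4 and looks like "[10/Oct/2004:13:55:36".
hits_per_second() {
    agent=$1
    logfile=$2
    # -F: treat the agent name as a fixed string, -i: ignore case
    grep -i -F -- "$agent" "$logfile" \
        | awk '{ count[substr($4, 2)]++ }
               END { for (t in count) print count[t], t }' \
        | sort -rn
}
```

It could be run as "hits_per_second ia_archiver /var/log/apache/access.log", where the log path is an assumption and will differ between systems.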

lamicrogroup rip off

I just came across www.lamicrogroup.co.uk tonight and had a look at their prices. It is more expensive to buy from them than it is to buy from the Dell Outlet.
Much to my astonishment I then came across www.lamicrogroup.com which has the exact same products and the exact same website except the prices are in dollars. What knocked me for six is that all the servers are half the price of the UK models.
This raises the question: are they ripping off UK customers? It certainly looks like it to me.

Rack Space and Servers

Are a pain in the ass to find. Actually I should narrow that down a bit. It’s hard to find cheap rack space and cheap servers that are in or close to London and from a company or someone you can trust. There are various offerings on the market, most of which are downright rip offs. It’s amazing how the prices can differ between 2 suppliers in RedBus or how the prices of 2 servers can differ.
Out of curiosity I priced up a quarter rack ages ago (Aug 2003) to see what I could get and the cheapest was £3300 with a 1Mb connection and power. This also came with a £350 setup fee. I have had a look around a few forums and the cheapest I could see tonight was around £3000 with a similar setup fee. It would seem that rack space is holding its price regardless of how many people seem to be in on it.
I only need 1u or 2u to start with and the prices are quite expensive. I am currently looking at UKFSN because I know people with a box there and the guy who runs it (Jason Clifford) has a very good reputation. The proceeds also go to help the free software movement which is something I am interested in. I have seen a few cheaper than this but none by any great margin, at least not yet.
I also need to get a nice 1u, possibly 2u, rack mount server. I like Dell because parts are fairly cheap and common on eBay but I also like HP, which are not cheap and parts are not as common on eBay. If the HP goes duff there is a good chance you will need to go back to the manufacturer, and when you do make sure you have visited Boots for some KY Jelly.
As for which one is more reliable, I would need to speak to several unbiased sysadmins who had been using both for years. There are good and bad reports for both online.
Given a choice between the 2 I would pick a Dell simply because there are more spare parts floating around, although like HP, if you buy direct don’t forget to visit Boots.

Backup to CDRW

For too long now I have been lazy with my backup procedures, which normally meant a quick rsync to a different hard drive. This is hardly ideal and up until now I have been quite lucky. I also tend to have a blast every so often and burn some bits to CD but I have been doing a lot of work lately and I know what it’s like to lose a few days worth of it.
For those with the luxury of a large LaCie drive or a decent tape drive, deciding what to backup is relatively easy. I decided to limit myself to a single CD-RW which is a measly 650MB. I did this because I am a hoarder and it was about time I cleaned house. Besides, if push comes to shove I have a 20GB drive that is not plugged in due to noise, so if the backup takes more than 650MB then I will use it rather than risk losing the data.
I have four users on this machine that I use to do all my work. All other users are used by the system and for the most part I am not worried about the data they generate. I cannot backup each user’s entire directory because there is too much data, so I limited myself to certain directories in each user’s home directory.
First thing I did was create an area where the backups are going to take place. I have 19GB free in one partition on a SATA drive so that’s where it’s going.
The basic idea is as follows.
1. Determine which directories in each user’s home directory are important.
2. Determine if there are any files outside of these directories that are important.
3. Move those files inside one of the directories.
4. Weed any crap from the directories, either delete it or move it out to another area on disk.
5. Write the backup script.
For step 2 there are some files outside these directories that I would like backed up. Some examples are.
/boot
/etc/
/var/spool/cron
/var/spool/mail
The backup script itself.
The backup script does the following.
rsync -av user1/dir1 rsync_dir/user1/
rsync -av user1/dir2 rsync_dir/user1/
rsync -av user1/dir3 rsync_dir/user1/
tar -czvf user1.tar.gz tar_dir/user1/
………
……… DO THE SAME FOR ALL USERS
………
rsync -av /var/spool/cron/crontabs rsync_dir/system/
rsync -av /var/spool/mail rsync_dir/system/
……… rsync each system directory
tar -czvf system.tar.gz tar_dir/system/
I have deliberately created separate tar.gz files because it’s easier and faster to extract them on an individual basis when we just want a couple of files out of the backup. One thing to note is that when you tar the files up you want the paths stored in the archive to match the paths of the original files on disk, not the rsynced copies that we just made. This is for sanity checking later.
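One way to control the paths stored in the archive is tar's -C option, which changes directory before adding files so that the staging prefix never ends up in the member names. This is a minimal sketch rather than the actual script; the function name archive_tree and its arguments are invented for illustration.

```shell
#!/bin/sh
# archive_tree STAGING NAME OUTFILE
# Hypothetical helper: archives STAGING/NAME into OUTFILE, storing members
# as "NAME/..." (relative paths) instead of "STAGING/NAME/...".
archive_tree() {
    staging=$1
    name=$2
    outfile=$3
    # -C switches into the staging area first, so only NAME/... is recorded
    tar -czf "$outfile" -C "$staging" "$name"
}
```

An archive made with "archive_tree rsync_dir user1 tar_dir/user1.tar.gz" would then list its members as user1/dir1/... and can be compared against any directory that has the same layout underneath it.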
To create an image of the tar files we need to make an ISO image of them as follows.
mkisofs -r -J -l -o backup.iso tar_dir/
Once the ISO has been written we need to make sure that it is not too big (under 650MB, check with "ls -lah") and then we can burn it to our CD-RW as follows.
cdrecord -v blank=fast
cdrecord -v speed=8 dev=1,5,0 backup.iso
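Since the script runs from cron, the size check before burning is better done by the script than by eye. A rough sketch follows; it assumes a limit of 650 * 1024 * 1024 bytes, which is an approximation of what a 650MB disc holds, and the function name fits_on_cd is made up for illustration.

```shell
#!/bin/sh
# fits_on_cd IMAGE [LIMIT_MB]
# Hypothetical helper: returns 0 if IMAGE is no larger than LIMIT_MB
# (default 650, an assumed figure), 1 if it is too big, 2 if it is missing.
fits_on_cd() {
    image=$1
    limit_mb=${2:-650}
    [ -f "$image" ] || return 2
    # wc -c prints the size in bytes; tr strips any padding some systems add
    size=$(wc -c < "$image" | tr -d ' ')
    [ "$size" -le $((limit_mb * 1024 * 1024)) ]
}
```

In the backup script this could guard the burn step, for example "fits_on_cd backup.iso || exit 1" placed before the cdrecord commands.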
Note that I am blanking the CD-RW before I write to it. The above script is now being managed by cron so I no longer need to worry too much about the backup as long as I have a CD-RW in the drive. The next step is to check that the backup actually worked.
The best way to do this is to use tar.
From the man page:
-d, --diff, --compare
find differences between archive and file system
cd rsync_dir/user1/
mount /cdrom/
tar -zdf /cdrom/user1.tar.gz
If you don't see anything then the backup matches the file system. If you are unsure whether anything happened, edit a file on the file system and try it again. You would get a message similar to this if all you do is change the mod time of the file:
bin/document_parser.pl: Mod time differs
bin/indexer.pl: Mod time differs
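The same check can be scripted so that a bad backup gets noticed without reading the output. This sketch assumes GNU tar, which exits with a non-zero status when --diff finds differences; the function name verify_archive is invented for illustration.

```shell
#!/bin/sh
# verify_archive ARCHIVE ROOT
# Hypothetical helper: compares a gzipped tar archive against the files
# under ROOT. Assumes GNU tar, whose -d (--diff) mode prints each
# difference and exits non-zero if any file differs.
verify_archive() {
    archive=$1
    root=$2
    # -C makes tar resolve the archived relative paths against ROOT
    tar -dzf "$archive" -C "$root"
}
```

A cron job could then run something like "verify_archive /cdrom/user1.tar.gz rsync_dir/user1 || echo backup differs" and let cron mail the result. The second argument has to be whichever directory the archive's paths are relative to.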
Next thing we should do is start rotating the media, so I am off into town tomorrow during my lunch break to get a couple more rewritable CDs.
The above is a very simplified version of what I have done. There are lots of options to rsync and tar that can make things much easier so go have a look. I also have some websites not on the local machine that I am doing manually.
I also need to get myself another large disk for the machine. I would ideally like to use SCSI but it's very expensive. Another big SATA drive may be just what I am after, or perhaps one of those LaCie drives...

More RSS Job Feeds

I managed to find another 3 RSS job feeds today for the RSS Jobs website.
That takes me back up to 19 job feeds in total. I had to take 3 of the feeds off earlier in the week because they were just duplicate jobs so I decided to go looking for some more.
As usual, if you find an RSS job feed please let me know and I will add it to the database.