PhpMyLibrary

I had a look at Koha as an open source library system we might use at work, and I promised I was going to look at phpMyLibrary the next day. Well, I didn’t have time then, but I did manage to look at it just recently and here is what I found.
First off, it installed very easily, which was nice. We got it up and running with some problems, i.e. we had to turn Globals on, and under PHP this is normally considered a no-no. This rang alarm bells in my head but I continued on.
The next thing I noticed was the code. It might be because I am used to Perl, but the code just looked messy. That is no reason to judge it, so I had a look at the main feature, i.e. loading and understanding MARC21.
I could have saved myself a lot of time if I had noticed that they only support USMARC. I left a message on one of their mailing lists asking about the possibility of using MARC21 but heard nothing, which was another bad sign: from what I can tell it’s not a very active project.
The next thing I will be looking at is CDS/ISIS, which is a suite of tools written by the United Nations Educational, Scientific and Cultural Organization (UNESCO).

Swoogle

It would appear that Swoogle is not being very polite to web servers. It seems to hit me 2 to 4 times a second. I am probably going to ban it from UKlug because it’s just not very nice to hammer someone’s server as hard as that. At least Google is useful, i.e. people find my site via Google and it still manages to be polite about it. You would think that people doing research would be trying to be a bit more polite about what they are doing.
I have sent a couple of emails to the technical contacts and the people running Swoogle.
If I don’t get a reply I will ban their entire subnet, because they appear to be using different IP addresses to spider from:
130.85.95.109
130.85.95.23
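
To put a number on "2 to 4 times a second", something like the following could be run over the logs. It is only a rough sketch: it assumes the usual Apache common/combined log layout (client IP in field 1, timestamp starting with `[` in field 4), and `busiest_seconds` is a name made up for this example.

```shell
# busiest_seconds IP [LOGFILE...]
# Hypothetical helper: prints "hits timestamp" for the five seconds in
# which the given IP made the most requests. Reads stdin if no file is given.
# Assumes field 1 is the client IP and field 4 looks like "[10/Nov/2004:13:55:36".
busiest_seconds() {
    ip=$1
    shift
    awk -v ip="$ip" '
        $1 == ip { sub(/^\[/, "", $4); hits[$4]++ }   # tally hits per one-second timestamp
        END { for (t in hits) print hits[t], t }
    ' "$@" | sort -rn | head -5
}
```

For example, `busiest_seconds 130.85.95.109 uk*.log` would list the worst seconds for the first address above.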

Unique IP address parser

Dean sent me the following to parse the logs and see how many unique IP addresses I was getting on a monthly basis:
grep 'Nov/2004' uk*.log | awk '{ print $1 }' | sort | uniq | wc
I wrote the following in Perl, which does the same thing, but I think I prefer Dean’s:
perl -ne '/^(.*?)\s/; $a{$1}++;} END{for (keys %a){$c++;print "$_ == $a{$_}\n"} print "$c\n";' ukl*ss.log
Maybe we could:
perl -ane '$a{$F[0]}++};END{for (keys %a){$c++;print "$_ == $a{$_}\n"} print "$c\n";' ukl*ss.log
or maybe even:
perl -ane '$a{$F[0]}++} END{for(keys %a){$c++;} print "$c\n";' ukl*ss.log
Or we could just:
perl -ane '$a{$F[0]}++;END{print keys(%a)."\n";}' ukl*ss.log
Bollocks to this. I am sure there is some even cleverer one-liner in Perl to do this, but I hardly ever use them, so I will leave it to the reader to beat it 😉
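
The one-liners above all answer the question for one month at a time. As a hedged sketch, the same tally could be done for every month in a single pass. It assumes the common/combined log layout (IP in field 1, `[dd/Mon/yyyy:hh:mm:ss` in field 4), and `unique_ips_by_month` is a hypothetical name.

```shell
# unique_ips_by_month [LOGFILE...]
# Hypothetical helper: prints "yyyy-Mon count" with the number of distinct
# client IPs seen in each month. Reads stdin if no file is given.
unique_ips_by_month() {
    awk '{
        split($4, d, "[/:]")              # d[2] = month name, d[3] = year
        key = d[3] "-" d[2]
        if (!((key, $1) in seen)) {       # first time this IP shows up in this month
            seen[key, $1] = 1
            count[key]++
        }
    }
    END { for (k in count) print k, count[k] }' "$@" | sort
}
```

Note that the final `sort` orders the months alphabetically by name, not chronologically.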

Who’s searching on what

I noticed that someone had a look for gimpy on my blog today and I was wondering what terms people are finding my site with, so I ran the following over my logs:
perl -ne 'print "$1\n" if /google.*?[?&]q=([^&"]+)/;' *.log | sort | uniq
I am sure there is a shorter and better way to do it but this was more than enough to have a quick look.
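
One possible variation, sketched here under assumptions rather than taken from what I actually ran, is to count how often each term turns up. It assumes the Google referrer URL appears in the log line with the search in a `q=` parameter; `top_search_terms` is a made-up name, and only `+` is turned back into a space (any `%XX` escapes are left as they are).

```shell
# top_search_terms [LOGFILE...]
# Hypothetical helper: prints the 20 most frequent Google search terms
# found in referrer URLs, most frequent first. Reads stdin if no file is given.
top_search_terms() {
    sed -n 's/.*google[^ "]*[?&]q=\([^&" ]*\).*/\1/p' "$@" |   # keep only the q= value
        tr '+' ' ' |                                           # "+" encodes a space
        sort | uniq -c | sort -rn | head -20
}
```

Something like `top_search_terms *.log` would then show the count next to each term.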

Movable Type SpamAssassin Plugin

I have just finished the beta release of MT-SpamAssassin and so far so good. I have removed MT-Blacklist and everything is fine. I have not built the Bayesian database up completely yet since I don’t have that many comments. If you want to try it you can download it here.
MT-SpamAssassin Download
Please leave me some decent comments so I can seed the database 😉

Tools for manipulating Images

Occasionally at work we need to do some simple task that involves converting images or finding their sizes etc. The problem with tasks that are “occasional” is that you can never remember how you did it the last time.
What size is that jpeg, gif or png?
How can I resize that image?
You can’t be bothered firing up gimp or some other tool, so what can you do...?

@debian:$ identify truman.gif
truman.gif GIF 258x333 258x333+0+0 PseudoClass 32c 24kb 0.000u 0:01

That was easy, wasn’t it? What if we needed much more info than this? Well, that’s much harder; we need to do the following:

@debian:$ identify -verbose truman.gif

The hard part is the extra typing. I will leave it to the reader to try that one (there is too much output for here).
What about those times when you just wish one of your images was half the size? Well, here comes another great tool to the rescue:

@debian:$ convert -sample 50%x50% truman.jpg truman_half.jpg

For those that are after a little bit more info on these handy little tools, head on over to IBM developerWorks.
Even the article above only scratches the surface of what convert can do.
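
For the times when a whole folder needs halving, the `convert` call above could be wrapped in a small loop. This is a minimal sketch: `half_name` and `halve_all` are made-up names, the `_half` suffix simply follows the example above, and it assumes every file name has an extension and that ImageMagick's `convert` is on the path.

```shell
# half_name FILE
# Hypothetical helper: truman.jpg -> truman_half.jpg
# (assumes the file name contains an extension).
half_name() {
    printf '%s_half.%s\n' "${1%.*}" "${1##*.}"
}

# halve_all FILE...
# Writes a half-size copy of each image next to the original,
# using the same convert options as the single-file example.
halve_all() {
    for img in "$@"; do
        convert -sample 50%x50% "$img" "$(half_name "$img")"
    done
}
```

Running `halve_all *.jpg` would then leave the originals alone and write the `_half` copies alongside them.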

SpamAssassin Plugin for Movable Type

I asked on the Movable Type support forum if anyone would be interested in a plugin that uses SpamAssassin. There were no replies to the post, so it looks like it is either no longer such an issue in the blogging world or maybe it’s already been done and I have not found the link. Perhaps I posted it to the wrong forum 😉 I would have thought that there would have been some interest in it but I was mistaken.
I wrote the plugin on Saturday and it is almost finished except for the pretty GUI. The Bayesian filtering is also working on it and I have tested it by scripting a few thousand spam entries into it and seeing if it would start spotting them and it did.
Thanks to the pluggable nature of Movable Type the plugin sits quite unobtrusively in it. I was after a much simpler solution than Blacklist without the separate GUI and management facilities etc and I think I could achieve this.
I intend to keep working at it and eventually use it on this blog so if you would like to try it contact me.

Yahoo and Nutch

It’s very true that you learn something new every day, and today I learned that Yahoo are using Nutch in a research capacity.

Welcome to the Yahoo! Research Labs implementation of the Nutch open source search engine (www.nutch.org). This search engine is intended as a demonstration platform for a number of search related technologies

I found it purely by chance. If you don’t believe me then have a look at Yahoo’s install of Nutch. I think that it’s a smart move on their part because they get to see how it does its stuff and assess it. They may even be able to incorporate some of it into their own products.

Marketing a simple website

I have spent a fair bit of time working on another website that had some of the most horrible HTML I have ever seen. I managed to actually upload the site last night and it is now live. I didn’t design the site; I just converted it from some Dreamweaver mess to HTML Transitional that validates.
I have already made a few entries about this in my blog so here’s the link.
Aerospace NDT
The people at Aerospace NDT realised they were not getting enough from their website so they contacted me to see if I could do something with it. I had a look at their site, wrote up what I thought of it and gave them some advice as to what I thought could be done to improve its visibility etc. They seemed to like what I said because I got the job.
I am basically tasked with getting their site up the Google ranks, which I have already done and quite substantially. I was very lucky, and they were unlucky, in that the single greatest change required to the site so far has been the removal of the splash screen. They were unlucky because their last developer had left them with a site that could not be seen by the search engines: there was not a single link off the splash screen. This also meant that in certain browsers without Flash, visitors could not actually see the website.
I have made some fundamental changes to their site during the conversion from the old one, so we should see an overall increase in the Google ranks, but time will tell. I am keeping a tally for certain search terms to make sure that what we do has a positive effect on the site, so watch this space.

Open source tools for MARC Library records

I am no librarian but today I got to put on my glasses and tell everyone to be quiet because I was investigating open source library systems. The first one I had to look at is
Koha

Koha is the world’s first free Open Source Library System. Made in New Zealand by the Horowhenua Library Trust and Katipo Communications Ltd, the Koha system is a full catalogue, opac, circulation, member management and acquisitions package. To our knowledge Koha is used by public libraries, private collectors, university faculties, not for profit organizations, churches, schools and corporates. People from as far afield as Australia, USA, Canada, Estonia, India, Nigeria and Poland have installed Koha.
Key features

This is apparently used by a lot of people and does MARC records searches etc etc.
The install was very swish (my idea of swish is not some flash GUI, a simple command line install is fine for me) which gave me the warm and fuzzies. It also came with some sample data which was nice. Different ports are used for different things which was a bit confusing because I initially went to the admin screen and was wondering where all the library data was meant to go when I discovered I needed to go to a different port number to actually use the library system.
I can’t say I was too impressed with the interface. First off, it’s not very intuitive. This might be because I am not a librarian and don’t really understand what all these funny numbers are for, but I still couldn’t get used to the look and feel of it. I suppose this could be customized with a little CSS.
The other thing I tried was to load a Z39.50 MARC record into the database from one of the online servers. This failed miserably and gave some very cryptic pop up boxes telling me I had not filled in some mandatory fields. It took me 40 minutes to realize that there are some mandatory fields that are not marked as mandatory on another screen. On filling in this it still refused to work. On hunting around the logs I noticed that when you carried out a Z39.50 search the log would be hit every second or two until you closed the search window. I can only assume this is a bug because I cannot think why you would want to do it otherwise.
One thing in its favor is that it’s written in Perl, so if we do decide to run with it I should be able to patch or add things that don’t work or that don’t suit our install. Tomorrow I am going to be looking at phpMyLibrary, which from what I have read of it is quite nice.
