Diligent Editing of HTML

I am a fan of standards, i.e. XHTML Transitional/Strict and so on. To this end I try to keep my own sites reasonably compliant. Sites I build commercially are always 100% compliant, but that’s because I insist on it and the clients have placed their trust in me.
Just recently I had to convert a really bad site to XHTML Transitional, and if you had seen the markup you would have realized how big this task was. Going through it by hand would have been an enormous job, and quite frankly I would have been unable to do it at the price I quoted without the following tools:
1. Vim (Bram Moolenaar)
2. Template Toolkit, TT2 (Andy Wardley)
3. HTML Tidy (Dave Raggett)
4. W3C Validator (The W3C Validator Team)
The first tool (Vim) could really be any good text editor, e.g. Emacs, ed, or any of the vi children. I just happen to use Vim; once you have learned the basics it is a joy to use and makes editing text almost an art.
TT2, the second tool, is slightly more specialized and less well known, but it is just as easy to use and deserves a big mention. TT2 is a templating system. Most people won’t understand, or even need to know, what its advantages are until they have to edit a 10+ page website and someone wants to change a font on some item on all the pages. This could of course be done using server-side includes or some other method, but TT makes it easy and also exposes a programmatic API, which makes its functionality and versatility as wide as the programmer’s skills. This only scratches the surface of what TT can actually do for you.
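To make the "change one font on every page" point concrete, here is a rough sketch of the idea. It is not TT2 and does not use TT2 syntax; it uses `string.Template` from Python's standard library as a stand-in, and the layout, page names and `render_site` function are all made up for illustration.

```python
from string import Template

# Hypothetical shared layout: every page is rendered through this one template,
# so the heading font is defined in exactly one place.
LAYOUT = Template(
    "<html><head><title>$title</title>\n"
    '<style type="text/css">h1 { font-family: $heading_font; }</style></head>\n'
    "<body><h1>$title</h1>\n$body\n</body></html>"
)

# Hypothetical page data; a real site would have ten or more entries.
PAGES = {
    "index.html": {"title": "Home", "body": "<p>Welcome.</p>"},
    "contact.html": {"title": "Contact", "body": "<p>Get in touch.</p>"},
}


def render_site(heading_font):
    """Render every page through the shared layout with one font setting."""
    return {
        name: LAYOUT.substitute(heading_font=heading_font, **page)
        for name, page in PAGES.items()
    }


# Changing the font for the whole site is a single argument change.
site = render_site("Georgia, serif")
```

Whatever the templating system, the pay-off is the same: the thing the client wants changed lives in one file rather than in every page.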
The third tool is Dave Raggett’s HTML Tidy. This one tool is what saved me from going stark raving mad this weekend. Visually selecting an area in Vim and then running
:'<,'>!tidy -asxhtml -icbq -wrap 100
was what kept me sane. This single command will take ANY HTML fragment and sanitize it for you. It adds a lot of guff that you may not want, but you can remove that and you are left with a sanitized version complete with CSS.
I just wanted the formatting, indenting and validation. I weeded out the CSS and was left with a nice plain HTML document that I could actually understand, rather than a debauched mess the devil himself would not have started with.
Using Tidy this way is a great way to get a clear place to start when converting a messy HTML page.
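If you would rather run whole files through Tidy than visually select regions in an editor, the same filter can be scripted. The following is only a sketch: it assumes a `tidy` executable is on your PATH, and the function names `build_tidy_command` and `tidy_fragment` are my own invention rather than part of any Tidy API. The flags are the ones from the Vim command above.

```python
import subprocess


def build_tidy_command(wrap=100):
    """Return the argument list matching the Vim filter command above."""
    return ["tidy", "-asxhtml", "-icbq", "-wrap", str(wrap)]


def tidy_fragment(html, wrap=100):
    """Pipe an HTML fragment through tidy and return the cleaned XHTML."""
    result = subprocess.run(
        build_tidy_command(wrap),
        input=html,
        capture_output=True,
        text=True,
    )
    # Tidy exits with 0 on success, 1 when it only had warnings, 2 on errors,
    # so only an exit status of 2 or more is treated as a failure here.
    if result.returncode >= 2:
        raise RuntimeError(result.stderr)
    return result.stdout
```

Calling `tidy_fragment("<p>some <b>messy markup")` should hand back the same sort of output you get from the Vim filter, guff and all.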
Last but not least are the W3C’s validator pages for both CSS and XHTML. After all the grunt work is over it’s time to check the pages, and using the methods above I managed to come in with:
Out of 29 pages:
20 HTML errors
2 CSS errors
This took me about 30 minutes to fix!

Web Standards CSS and HTML

Before I start I just need to say that I am an advocate of standards but I am not a religious bigot about them.
My problem has never been the technicalities of CSS or HTML. Let’s face it, if you have figured out Perl to a reasonable degree and know your way around half a dozen programming languages, then adding some markup to your toolbox is hardly going to kill you. Or at least you wouldn’t think so.
Unlike programming languages and their platform-specific problems, browsers display your work as they see fit. The user at the other end may see a work of art or a bloody debauchery and judge you and your ability accordingly.
The browser doesn’t even have the common decency to open a window informing the user that it has made a complete horlicks of rendering our page, nor should it! It’s our problem! If we want people to read the stuff, the least we can do is make it as legible as possible for them. It’s not their fault the browser has a few quirks.
The internet, for all intents and purposes, is a new method of communication, and like all previous methods of communication it comes with its fair share of problems: it’s new, it’s evolving, and it hasn’t reached any form of maturity yet. As an analogy:
Have you ever phoned someone and had a “BAD LINE”? This prompts you into very articulated speech where all the words are formed “just so”, and you might even talk a bit louder. Meanwhile, in the back of your mind you are cursing the weather or blaming the old exchange down the road for the bad reception.
Without even realising it you are compensating for the deficiencies of the system you are using with your loud speech and long drawn-out words. You might not like it, but it’s an accepted part of the system and out of your control, so you just get on with it and do your best. Web development (development in general) is like this.
I am all for standards, but I am not one to take a pop at something because it has not been done “just so”. Implementing standards is a process and there is no need to get all religious about it.