Thread ID: 113561	2010-10-25 23:14:00	need program or script, that can fix html layout, and bulk update html tags to xhtml	Morgenmuffel (187)	Press F1

Post ID	Timestamp	Content	User
1147643	2010-10-25 23:14:00	Hi all. Problem 1) I regularly have to deal with generated HTML files that may have missing or incorrectly nested tags, and often the entire site is generated on one line. What I need is some function or program that will just lay the code out, indented etc., so it's readable, but without altering any tags, as I need to find the errors to track them back. Currently I use a few simple find-and-replaces, e.g. find </tr><tr> and replace with </tr>\n<tr> (with "use regex" ticked), to get readability, but surely something this simple must be automated somewhere. Problem 2) I also have a large site with a huge variety of tags, some HTML and some XHTML, and I just want to change them all to XHTML. Again, find and replace works fine for <br> to <br />, but converting image tags and input tags is far more difficult, especially finding a tool that alters pages in bulk. Any suggestions would be appreciated.	Morgenmuffel (187)
1147644	2010-10-25 23:36:00	I feel your pain... I have tried a few different methods, including Dreamweaver's cleanup functions. I have found that most introduce more problems. I usually end up resorting to doing it by hand and using find/replace. It's the only method that works 100% :p	SoniKalien (792)
1147645	2010-10-25 23:39:00	I used to use an online HTML validation site (can't recall the name) for my sites/blogs, and it would show the corrections required. I also did the HTML/XHTML tutorials by HTML expert Jennifer Kyrnin at About.com. They might have a solution, e.g. for HTML to XHTML conversions. (webdesign.about.com)	kahawai chaser (3545)
1147646	2010-10-26 04:39:00	Problem 1) Try Tidy (en.wikipedia.org). In my experience it does a great job, and can fix almost anything you throw at it. Problem 2) Again, Tidy is the go-to tool for this job; it'll make light work of the task. You could also use a regex search/replace tool, although this will take you a fair bit longer than Tidy will.	Erayd (23)
1147647	2010-10-27 04:27:00	Can Tidy do a fix as a batch process? And does it have a pretty interface somewhere? Last time I checked it was command line and I ended up utterly lost. Also, on a different tack, is there a tool that can scan sections of a live site and flag those pages that have missing tags, specifically div tags? This site seems to have an abundance of <div> tags, but they don't seem to have bothered with </div> tags as much. I guess when the site was table-based it wasn't an issue, but now that I am trying to move it towards CSS, it's playing havoc. Alternatively I can scan the pages on my PC, which is probably better.	Morgenmuffel (187)
1147648	2010-10-27 04:39:00	There's the web developer's toolbar (www.snapfiles.com), which I used a while back for live sites, and an add-on for Firefox. I don't know about the current version.	kahawai chaser (3545)
1147649	2010-10-27 05:42:00	"Can tidy do a fix as a batch process?" If you script it, yes. If I'm doing batch repairs on a ton of files I usually pipe the output of find into a tidy loop. Something like this would batch-convert all html files in or below the current directory into valid xhtml: find . -type f -name "*.html" | while IFS= read -r f; do tidy -q -m -e -asxml "$f"; done Edit: If you only need to run one command on the file (i.e. tidy), the above can be expressed more succinctly as: find . -type f -name "*.html" -exec tidy -q -m -e -asxml {} \; "...and does it have pretty interface somewhere as last time i checked it was command line and i ended up utterly lost" Tidy itself is a CLI program, but there are several GUI frontends available for it; a few are listed on Tidy's SF project page (http://tidy.sourceforge.net/). "...is there a tool that can scan sections... and flag those pages that have missing tags, specifically div tags... on my pc, which is probably better" Tidy can fix this for you, just point it at the offending files. If you're happy to have a lot of stuff flagged (not just missing </div> tags), the W3 Markup Validator (http://validator.w3.org/) will do this for you. I don't know of any tool that will only flag missing </div> tags, but if you really need this rather than the in-place fix that Tidy provides or W3's error flagging, let me know and I'll write one for you.	Erayd (23)
1147650	2010-10-27 07:09:00	Can it replace font tags consistently? What I mean is, from testing with a few front ends, it will replace <font size="1">Hello darkness my old friend</font> <font size="3">Watch out where the Huskies go</font> with span.c2 {font-size: 80%} span.c1 {font-size: 70%} <span class="c1">Hello darkness my old friend</span> <span class="c2">Watch out where the Huskies go</span> But on another page that has, say, <font size="7">Yellow Matter Custard</font> <font size="1">Hello darkness my old friend</font> <font size="3">Watch out where the Huskies go</font> when I process it, it comes up with span.c3 {font-size: 80%} span.c2 {font-size: 70%} span.c1 {font-size: larger} <span class="c1">Yellow Matter Custard</span> <span class="c2">Hello darkness my old friend</span> <span class="c3">Watch out where the Huskies go</span> Now I want to link it to an external CSS file, but if the styles for each page are different, then I am stuffed. Is there a way for me to tell it that it needs to be consistent across pages? Edit: I haven't tried batch processing yet.	Morgenmuffel (187)
1147651	2010-10-27 07:26:00	My understanding is that it will name styles in the order they are required. I'm not sure whether this will work, but have you tried specifying multiple input files for a single run of Tidy? Something like this: find . -type f -name "*.html" -exec tidy -q -m -e -asxml {} +	Erayd (23)
1147652	2010-10-28 02:05:00	I tried batching it, but it seems to work on a per-document basis, so in the end I just did a find and replace on each tag.	Morgenmuffel (187)
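
The thread never produced the tool Erayd offered for flagging only missing </div> tags. A rough approximation, assuming plain POSIX sh with grep, wc and tr, is to count opening and closing div tags per file and report files where the two numbers differ. The function names count_matches and check_divs are made up for this sketch, and the count is only a heuristic: it does not parse the HTML, so div text inside comments or scripts is counted too, and a file with equal counts can still be badly nested.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Tidy or any tool named in the thread.

count_matches() {
    # $1 = extended regex, $2 = file.
    # Prints the number of matches (not lines), so single-line files work.
    # "|| true" keeps grep's "no match" exit status from aborting strict shells.
    { grep -oiE "$1" "$2" || true; } | wc -l | tr -d ' '
}

check_divs() {
    # $1 = HTML file to inspect.
    # Prints one line only when the opening and closing counts differ.
    opens=$(count_matches '<div([[:space:]>]|$)' "$1")
    closes=$(count_matches '</div[[:space:]]*>' "$1")
    if [ "$opens" -ne "$closes" ]; then
        printf '%s: %s <div>, %s </div>\n' "$1" "$opens" "$closes"
    fi
}
```

With the functions loaded, the same find pattern used earlier in the thread can drive it over a local copy of the site: find . -type f -name "*.html" | while IFS= read -r f; do check_divs "$f"; done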
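
The per-tag find and replace that Morgenmuffel settled on can be scripted so that every page gets the same class names, which is what an external stylesheet needs. The following is a minimal sketch using sed, not what was actually run in the thread. The function name fix_tags and the class names fs1 to fs7 are assumptions made up for illustration. It only handles lowercase <font size="N"> tags with no other attributes, and it rewrites every </font>, so pages that also contain font tags with color or face attributes would end up with mismatched tags and need separate handling.

```shell
#!/bin/sh
# Hypothetical filter: reads HTML on stdin, writes rewritten HTML to stdout.
# Class names fs1..fs7 are invented for this sketch; define them once in the
# external stylesheet so they mean the same thing on every page.

fix_tags() {
    # 1. <font size="N"> becomes <span class="fsN"> (N from 1 to 7)
    # 2. every </font> becomes </span>
    # 3. bare <br> becomes the XHTML form <br />
    sed -E \
        -e 's#<font size="([1-7])">#<span class="fs\1">#g' \
        -e 's#</font>#</span>#g' \
        -e 's#<br>#<br />#g'
}
```

Because the mapping depends only on the size value, size="1" becomes fs1 on every page regardless of which sizes appear first. To apply it to a file, writing to a temporary file and renaming avoids the differences between sed implementations' in-place options: fix_tags < page.html > page.html.new && mv page.html.new page.html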