Press F1
 
Thread ID: 77697 2007-03-19 02:41:00 regular expressions - how to use them for searching Morgenmuffel (187) Press F1
Post ID Timestamp Content User
534239 2007-03-19 02:41:00 Hi all

The issue I have is that I have around 3,000 to 5,000 pages that I need to remove the Dreamweaver template markup from. Rather stupidly, you can't do this easily in Dreamweaver (you have to do it page by page, there is no batch process!), and the boss isn't willing to pay me to take that long.

There are about 16 different templates used throughout the site (that I have found so far, probably more), and I found what looked to be the perfect solution (thedesignspace.net).

Unfortunately my Mac is currently in storage (autumn house cleaning), and I cannot get the grep (regular expression) strings shown in that example to work in any of the Windows text editors I have, apart from, of course, Dreamweaver, which blocks me from deleting the markup.

If worst comes to worst I can dig out the Mac and use it, but that will take a few hours and I would prefer to do this in Windows, as this looks to be a job I will be doing a lot of (this is only one small section of the site).

Cheers


The template code is the bits that contain "Instance":


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "www.w3.org
<html xmlns="www.w3.org lang="en" xml:lang="en">
<!-- InstanceBegin template="/Templates/nigel-test.dwt" codeOutsideHTMLIsLocked="false" -->
<head>
<!-- InstanceBeginEditable name="doc-head" -->
<title>the title bit</title>
<!-- InstanceEndEditable -->
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<link rel="stylesheet" type="text/css" index.css" />
</head>

<body>
<!-- InstanceBeginEditable name="content" -->
content bla bla
<!-- InstanceEndEditable -->
</body>
Morgenmuffel (187)
534240 2007-03-19 07:36:00 Just a quick stab at this, but have you tried using HomeSite? It comes with Dreamweaver (mine did, anyway). Also, Visual Studio Express may be able to do this; it's free to download. eldarcolonel (7392)
534241 2007-03-19 09:30:00 This Ruby script will strip out all HTML comments from your code (including Dreamweaver instructions). Download the one-click Ruby installer for Windows from rubyforge.org, install it, then copy this code:


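require 'fileutils'   # replaces the old 'ftools' library, which was removed in Ruby 1.9

while filename = ARGV.shift
  # Read raw bytes so pages that are not UTF-8 (e.g. iso-8859-1) are handled as-is
  content = File.binread(filename)
  # Strip every HTML comment; /m lets . match newlines, so multi-line comments go too
  content.gsub!(/<!--.*?-->/m, '')
  # Keep the original as a .bkp backup, then write the stripped version in its place
  FileUtils.mv filename, "#{filename}.bkp"
  File.binwrite(filename, content)
end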

Save it as, for example, "strip.rb" and run it like this:

ruby strip.rb index.html about.html file3.html

All files will be backed up to a .bkp copy and the original will be replaced with a version devoid of HTML comments.
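
For anyone wondering why the pattern in the script is written the way it is, here is a small sketch of what the `?` and the `/m` are doing. The sample string is made up purely for illustration.

```ruby
# Made-up sample: one single-line comment and one comment spanning two lines.
html = "<p>a</p><!-- one -->\n<p>b</p><!-- two\nlines -->\n<p>c</p>"

# Lazy (.*?) with /m: each comment is matched and removed on its own.
lazy = html.gsub(/<!--.*?-->/m, '')

# Greedy (.*) with /m: one match runs from the first <!-- to the last -->,
# so the markup between the two comments is deleted as well.
greedy = html.gsub(/<!--.*-->/m, '')

# Lazy without /m: . does not match a newline, so the two-line comment survives.
no_m = html.gsub(/<!--.*?-->/, '')

puts lazy
```

So dropping either the `?` or the `m` would delete too much or too little.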

P.S. You need to run it at the command line. You can't easily provide parameters to a script from the Windows GUI :)
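
Note that the script above removes every HTML comment, not only the template ones. If ordinary comments need to survive, a narrower pattern might look like the sketch below. It assumes, as the original post says, that the template markup is exactly the comments containing "Instance". `strip_template_markup` is a made-up helper name, and the files are read as raw bytes because the sample page declares iso-8859-1.

```ruby
require 'fileutils'

# Matches an HTML comment whose text starts with "Instance", e.g. InstanceBegin,
# InstanceEnd, InstanceBeginEditable, InstanceEndEditable.
# Assumption: no ordinary comment on the site starts with that word.
TEMPLATE_COMMENT = /<!--\s*Instance.*?-->/m

# Hypothetical helper: returns the HTML with only the template comments removed.
def strip_template_markup(html)
  html.gsub(TEMPLATE_COMMENT, '')
end

ARGV.each do |filename|
  content = File.binread(filename)            # raw bytes, no encoding checks
  FileUtils.cp(filename, "#{filename}.bkp")   # backup copy, as in the script above
  File.binwrite(filename, strip_template_markup(content))
end
```

It is run the same way as strip.rb. The lines that held the template comments are left behind as blank lines, which is harmless in HTML.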
TGoddard (7263)
534242 2007-03-20 03:48:00 Or just make a shortcut to 'ruby strip.rb', and then drag all the files you want to alter onto the shortcut... Erayd (23)
534243 2007-03-21 02:24:00 An update

I have downloaded a trial version of PowerGREP and it seems to find the template markup fairly easily, so I should be able to eliminate it just as easily once I finish downloading the site. I'm probably about 30% of the way through the download at the minute and have 1200 files in 220 folders so far (I'm just glad I don't have to download the associated images as well), and I have just been given another, slightly smaller section of the site to do.

Anyway, thanks for all the help. If PowerGREP hadn't worked I would be using the Ruby script.
Morgenmuffel (187)
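
With the files spread over a couple of hundred folders, naming each one on the command line gets tedious. Below is a rough sketch of the Ruby approach that walks a whole folder tree instead. `strip_tree` is a made-up name, the .htm/.html extensions are an assumption about the site, and the pattern only targets comments starting with "Instance".

```ruby
require 'fileutils'

# Hypothetical helper: strips Dreamweaver "Instance..." comments from every
# .htm/.html file under root and returns the list of files it changed.
def strip_tree(root)
  changed = []
  Dir.glob(File.join(root, '**', '*.{htm,html}')).sort.each do |path|
    next unless File.file?(path)
    content = File.binread(path)              # raw bytes, safe for iso-8859-1 pages
    stripped = content.gsub(/<!--\s*Instance.*?-->/m, '')
    next if stripped == content               # nothing to remove, leave the file alone
    FileUtils.cp(path, "#{path}.bkp")         # backup next to the original
    File.binwrite(path, stripped)
    changed << path
  end
  changed
end

# Assumption: the root folder of the downloaded site is the first argument.
puts strip_tree(ARGV[0]) if ARGV[0]
```

Saved as, say, strip_tree.rb, it would be run as `ruby strip_tree.rb C:/path/to/site`. Forward slashes are the safer choice in the path, since Dir.glob treats a backslash as an escape character.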