Thread ID: 109215	2010-04-29 01:59:00	Regular Expression help - only preserve contents of Anchor tags	Morgenmuffel (187)	Press F1

Post ID	Timestamp	Content	User
880861	2010-04-29 01:59:00	Hi all, I am currently trying to get info out of a former FrontPage site which is a mess, to put it bluntly. Basically all I want out of the pages is the anchor links, like the one below: <a href="/files/sopwith/camel.htm">Biggles and Algie</a> While finding them should be easy, the sheer amount of extraneous tags is making the going painful, and I can't get my regular expressions working.	Morgenmuffel (187)
880862	2010-04-29 02:09:00	This works to find the links, but what I want it to do is remove everything else, and I am blowed if I can figure it out. I also can't get the code below to work in Notepad++, but it works in an elderly version of Dreamweaver: <a\b[^>]*>(.*?)</a>	Morgenmuffel (187)
880863	2010-04-29 02:40:00	I take that back: the above code is only finding some of the links, as it isn't finding any that have line breaks in them, e.g. <a href="/files/sopwith/camel.htm">Biggles and Algie </a> Dammit, my brain is now officially hurting.	Morgenmuffel (187)
880864	2010-04-29 03:17:00	Eureka-ish: <a\b[^>]*>([\s\S]+?)</a> Probably not the most elegant, and I still can't work out how to get rid of all the other text on the page, or pipe the result into a new file on Windows.	Morgenmuffel (187)
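
For the two open questions in the last post (dropping everything except the links, and getting the result into a new file), one possible approach is to stop thinking of it as search-and-replace and instead collect the matches and write only those out. The sketch below does that in Python rather than in an editor, which is an assumption and not something the thread uses. The names extract_anchors, write_anchors and ANCHOR_RE are made up for illustration, the cp1252 encoding is a guess, and the pattern differs slightly from the one posted (see the comments). It is a regex-based sketch, so badly malformed markup may still trip it up.

```python
import re
from pathlib import Path

# Same idea as the last pattern in the thread, with two assumed tweaks:
# *? instead of +? so empty anchors such as <a name="top"></a> match on their
# own, and IGNORECASE because old FrontPage markup often uses <A HREF=...>.
# [\s\S] matches any character including newlines, which a plain '.' does not.
ANCHOR_RE = re.compile(r"<a\b[^>]*>[\s\S]*?</a>", re.IGNORECASE)


def extract_anchors(html: str) -> list[str]:
    """Return each complete <a ...>...</a> element, whitespace collapsed."""
    # With no capture groups, findall returns the whole match for each hit.
    return [" ".join(match.split()) for match in ANCHOR_RE.findall(html)]


def write_anchors(source: Path, target: Path) -> int:
    """Write the anchors found in source to target, one per line."""
    # cp1252 is a guess at the encoding of a Windows-era FrontPage file.
    html = source.read_text(encoding="cp1252", errors="replace")
    anchors = extract_anchors(html)
    target.write_text("\n".join(anchors) + "\n", encoding="utf-8")
    return len(anchors)


# Small in-memory sample: one link with a line break in it, wrapped in
# extraneous tags, plus an empty upper-case named anchor.
sample = (
    '<p><font size="2"><a href="/files/sopwith/camel.htm">Biggles and Algie\n'
    '</a></font></p><A NAME="top"></A>'
)
anchors = extract_anchors(sample)
for anchor in anchors:
    print(anchor)
```

To process a real page, something like write_anchors(Path("camel.htm"), Path("links.txt")) should work, where both file names are placeholders. Collapsing whitespace is a design choice so that each link ends up on a single line in the output file; drop the " ".join(...) step if the original line breaks matter.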