I'm working on a function to strip HTML but preserve things like paragraph spacing. Removing HTML tags sometimes leaves you with lots of blank lines. Here's a quick regular expression that replaces any run of consecutive line breaks with just one \n character:
<cfset content = ReReplace(content, "[\r\n]+", "#Chr(10)#", "ALL")>
Comments
Will, for that, try something like this as the pattern: [\r\n]+\t*[\r\n]*
That will only match one line with tabs at a time, so \n\t\n\t\n\t will be converted to \n\t\n, not just \n. You'll need [\r\n][\r\n\t]*[\r\n] or something like it.
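To compare the two patterns from the comments above side by side, here is a minimal sketch. The variable names (sample, partial, collapsed) and the sample string are made up for illustration, and it assumes ReReplace handles \r, \n and \t inside a character class the same way the pattern in the post does:

```cfml
<!--- Hypothetical sample input: "one" and "two" separated by line breaks and tab-only lines --->
<cfset sample = "one" & Chr(10) & Chr(9) & Chr(10) & Chr(9) & Chr(10) & "two">

<!--- First suggested pattern: each match stops at the next tab-only line --->
<cfset partial = ReReplace(sample, "[\r\n]+\t*[\r\n]*", Chr(10), "ALL")>

<!--- Second suggested pattern: a line break, any mix of line breaks and tabs, then a closing line break --->
<cfset collapsed = ReReplace(sample, "[\r\n][\r\n\t]*[\r\n]", Chr(10), "ALL")>

<!--- Print the lengths so the difference is visible without showing whitespace --->
<cfoutput>partial: #Len(partial)# characters, collapsed: #Len(collapsed)# characters</cfoutput>
```

With this input, the first result should still contain a tab-only line between the two words, while the second should leave a single line break. Note that the second pattern needs a line break at both ends of the match, so a lone \n is left alone and tabs after the last line break (like the trailing \t in \n\t\n\t\n\t) are not removed.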
By the way, I have a handy tool that you can use to play with Regex: http://www.petefreitag.com/tools/find_replace/
That's great! I've been looking for a function/cfc that does a good job of stripping out the HTML from form posts (before they get INSERT'd into the db)... preferably one with a list of optional "okay" HTML tags that can be passed in... I've never been good with RegEx, though.