Regex to Replace Multiple Blank Lines with One

Published on May 19, 2005
By Pete Freitag

I'm working on a function to strip HTML, but preserve things like paragraph spacing. In removing HTML tags, sometimes you end up with lots of blank lines... Here's a quick regular expression to replace multiple blank lines with just one \n character:

<cfset content = ReReplace(content, "[\r\n]+", "#Chr(10)#", "ALL")>
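To see what this pattern does on sample input, here is a minimal sketch in Python rather than ColdFusion (the character class is written the same way in both). The sample string and variable names are made up for illustration.

```python
import re

# Hypothetical sample text: Windows and Unix line endings mixed,
# with runs of blank lines between paragraphs.
sample = "First paragraph.\r\n\r\n\r\nSecond paragraph.\n\n\n\nThird paragraph."

# Same pattern as in the ReReplace call: one or more CR/LF characters in a row,
# replaced by a single line feed.
collapsed = re.sub(r"[\r\n]+", "\n", sample)

print(repr(collapsed))
```

Note that every run of line breaks, including a lone \r\n, ends up as a single \n, so no empty lines remain between the paragraphs in the result.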

Will, for that, try something like this as the pattern:

by Pete Freitag on 05/19/2005 at 1:54:56 PM UTC
That will only match one line with tabs at a time. So \n\t\n\t\n\t will be converted to \n\t\n, not just \n. You'll need [\r\n][\r\n\t]*[\r\n] or something like it.
by Barney on 05/19/2005 at 2:57:32 PM UTC
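A quick check of the [\r\n][\r\n\t]*[\r\n] pattern from the comment above, sketched in Python rather than ColdFusion. The inputs and variable names are illustrative assumptions, not from the original thread.

```python
import re

# Pattern suggested in the comment: a line break, then any mix of
# line breaks and tabs, then a closing line break.
pattern = r"[\r\n][\r\n\t]*[\r\n]"

# The input from the comment: three line breaks, each followed by a tab.
tabs_only = re.sub(pattern, "\n", "\n\t\n\t\n\t")

# Hypothetical text with tab-only lines between two words.
between_text = re.sub(pattern, "\n", "a\n\t\n\t\nb")

print(repr(tabs_only))
print(repr(between_text))
```

Because the pattern has to end on a line break, a tab after the last line break is left in place, and a single line break on its own is not matched at all. Spaces are not in the character class, so lines containing only spaces would still need separate handling.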
By the way, I have a handy tool that you can use to play with Regex:
by Pete Freitag on 05/19/2005 at 3:02:20 PM UTC
that's great! been looking for a function/cfc that does a good job of stripping out the HTML from form posts (before they get INSERT'd into the db)... preferably one with a list of optional "okay" HTML tags that can be passed in...

never been good with RegEx, though.
by forgetfoo on 05/20/2005 at 12:45:12 PM UTC
