Remove blank lines when there are more than one of them
Moderators: Dorian (MJT support), JRL
Hello,
I would like a way to reduce multiple blank lines between text so that only one blank line is left. The number of blank lines is irregular: sometimes there is 1 and no change is needed, sometimes there are several and I need to reduce them to 1. I have found many ways to remove all blank lines, but I am not astute enough to figure out how to spare 1.
For example, turn this:
Paragraph 1
blank line 1
blank line 2
blank line 3
Paragraph 2
into this:
Paragraph 1
blank line 1
Paragraph 2
without affecting this:
Paragraph 3
blank line 1
Paragraph 4
Thanks for any advice! MS saves my life.
-
- Newbie
- Posts: 19
- Joined: Mon Oct 14, 2019 6:23 am
Re: Remove blank lines when there are more than one of them
This might not be the best way, but should do what you're after:
Code: Select all
//Variable to remove extra lines from
Let>TEST=Paragraph1%CRLF%%CRLF%%CRLF%%CRLF%Paragraph2%CRLF%Paragraph3.
//Separate using CRLF
Separate>TEST,%CRLF%,TEST_ARR
//Result will be the new variable without extra breaks - concat the first break
Let>Result=TEST_ARR_1
ConCat>Result,%CRLF%
//Loop to go through the array and concat everything together without extra breaks (x=1 instead of 0 as we've already set the first row)
Let>x=1
Repeat>x
//increase the loop
Add>x,1
//If the current array value is blank (%CRLF%) then no action, don't concat this row
If>TEST_ARR_%x%={""}
Goto>Endx
ELSE
//If there is some sort of value, then concat that row and add a break afterwards
ConCat>Result,TEST_ARR_%x%
ConCat>Result,%CRLF%
EndIf>
Label>Endx
Until>x=Test_ARR_COUNT
//Check that it worked
MDL>Result
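Reading the loop above, every empty array entry is skipped, so it looks as though all blank lines are dropped rather than one being kept. Below is a minimal sketch of the same split-and-rebuild idea that keeps a single blank line. It is written in Python only so it can be run outside Macro Scheduler; the function name `squeeze_blank_lines` is hypothetical and the sample string mirrors the TEST variable.

```python
def squeeze_blank_lines(text, newline="\r\n"):
    """Rebuild text, keeping at most one empty line in a row."""
    kept = []
    previous_blank = False
    for line in text.split(newline):
        is_blank = (line == "")
        # Skip a blank line only when the line before it was also blank.
        if is_blank and previous_blank:
            continue
        kept.append(line)
        previous_blank = is_blank
    return newline.join(kept)


# Same shape as the TEST variable in the script above.
sample = "Paragraph1\r\n\r\n\r\n\r\nParagraph2\r\nParagraph3."
result = squeeze_blank_lines(sample)
```

The only change from the posted loop is the extra "was the previous line blank too?" check, which should translate to a flag variable in a Macro Scheduler loop.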
Re: Remove blank lines when there are more than one of them
Hi, just an example how to solve it using Regex>
You treat the text as a single string and look for two or more consecutive line breaks (\R in regex language). If found, replace them with just two, %CRLF%%CRLF%, which leaves exactly one blank line.
Code: Select all
LabelToVar>Text,strText
Let>tmp0=(?s)(\R){2,}
RegEx>tmp0,strText,0,m,nm,1,%CRLF%%CRLF%,strRes
MDL>strText
MDL>strRes
/*
Text:
Paragraph 1
Paragraph 2
Paragraph 3
Paragraph4
Paragraph5
Paragraph6
*/
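To check the pattern logic outside Macro Scheduler, here is a rough Python equivalent. Python's `re` module is not PCRE and has no `\R`, so the line-break alternatives are spelled out by hand; that substitution and the name `collapse_line_breaks` are my own assumptions, not part of the script above.

```python
import re

# Stand-in for \R: CRLF, a lone CR (not followed by LF), or a lone LF.
# The lookahead stops a single CRLF from being counted as two breaks.
LINE_BREAK_RUN = re.compile(r"(?:\r\n|\r(?!\n)|\n){2,}")


def collapse_line_breaks(text, newline="\r\n"):
    """Replace each run of two or more line breaks with exactly two."""
    return LINE_BREAK_RUN.sub(newline * 2, text)


collapsed = collapse_line_breaks(
    "Paragraph 1\r\n\r\n\r\n\r\nParagraph 2\r\nParagraph 3"
)
```

Two line breaks in a row are what a single blank line looks like in the raw text, which is why the replacement is two breaks and not one.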
Re: Remove blank lines when there are more than one of them
Thanks, these did the trick. I have started learning Regex and was wondering exactly which flavor MS uses? The syntax seems to vary across different tutorials.
- Dorian (MJT support)
- Automation Wizard
- Posts: 1390
- Joined: Sun Nov 03, 2002 3:19 am
Re: Remove blank lines when there are more than one of them
Yes, we have a Custom Scripting Service. Message me or go here
Re: Remove blank lines when there are more than one of them
One last semi-related question,
I am having some serious problems with simple regexes working in RegexBuddy and with various online regex checkers, yet they fail or give different results in MS.
For example, if my text is the following and I just wish to match the blocks of text starting with POSITIVE, ^POS.*$ matches all of the text in MS, whereas RegexBuddy matches only the lines I want (i.e. everything from POSITIVE to Willow and from POSITIVE to Fumigatus, excluding the words POLLENS and MOLDS and the NEGATIVE...Alternaria line).
The text example:
POLLENS
POSITIVE to: Ragweed, Burweed Marsh Elder, Cocklebur, Golden Rod, Kochia, Lambs Quarter, Mugwort, Nettle, Pigweed, Plantain, Russian Thistle, Sheep Sorrel, Yellow Dock, Brome Grass, Grass, Johnson Grass, June Grass, Meadow Fescue, Red Top, Orchard Grass, Rye Grass, Sweet Vernal, Ash, Timothy Grass, Birch, Black Walnut (pollen), Cottonwood, Elm, Hickory, Mulberry, Maple, Poplar, Oak, Privet, Red Cedar, Sycamore, White Pine, Other, Willow
MOLDS
POSITIVE to: Stemphylium, Aureobasidium, Bipolaris, Gibberella, Epicoccum, Sarocladium, Penicillium, Mucor, Cladosporium , Botrytis, Alternaria, Aspergillus Fumigatus
NEGATIVE to: Stemphylium, Aureobasidium, Bipolaris, Gibberella, Sarocladium, Penicillium, Epicoccum, Aspergillus Fumigatus, Mucor, Cladosporium , Botrytis, Alternaria
====
Is there a Regex guide for use in MS other than the command reference page?
Thanks!
Re: Remove blank lines when there are more than one of them
In my experience, Regex needs a lot of testing; a pattern that returns results in some engines doesn't always do so in Macro Scheduler.
I use Regex101 for all of my testing; that engine has multiline and global mode enabled by default.
I've had issues with forgetting to enable multiline (?m) in MS, which caused a mismatch of results when using ^ and $.
Greedy versus non-greedy quantifiers have also given me mixed results; sometimes adding a ? after a quantifier finds the result. It's generally a good idea after something like .* that is followed by \n (.*?\n rather than .*\n).
Regex can be tricky, so when I encounter something weird I usually assume I've made a mistake and try another method to achieve the same result.
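A small sketch of the greedy versus non-greedy difference mentioned above, using Python's `re` module as a stand-in (it is not the PCRE engine MS uses, so treat it as an illustration of the general behavior only). The sample text and variable names are made up for the example.

```python
import re

text = "first\nsecond\nthird\n"

# With "dot matches line breaks" on, greedy .* runs to the last \n it can reach.
greedy = re.search(r".*\n", text, flags=re.DOTALL).group()

# The lazy form .*? stops at the first \n.
lazy = re.search(r".*?\n", text, flags=re.DOTALL).group()

# With that option off, the dot cannot cross a line break,
# so even the greedy form stops at the end of the first line.
greedy_no_dotall = re.search(r".*\n", text).group()
```

The difference only shows up when the dot is allowed to match line breaks, which is one reason results can differ between tools with different default options.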
- Dorian (MJT support)
Re: Remove blank lines when there are more than one of them
Just in case you don't manage to master Regex, here's a way to do it using only Macro Scheduler.
This method reads an entire text file, uses Separate, and then uses Position to check each paragraph to see if it contains "POSITIVE to".
Of course you could use ReadLn and read it line-by-line, negating the need to use Separate, but this may be slower on very large files.
As long as all your text is in the variable TheText, it doesn't matter how it gets there. So if it was all in the clipboard, for instance, you could replace ReadFile or ReadLn with GetClipBoard>TheText
Code: Select all
ReadFile>d:\pollens.txt,TheText
Separate>TheText,CRLF,TheParagraphs
let>k=0
Repeat>k
Let>k=k+1
Pos>POSITIVE to,TheParagraphs_%k%,1,PosPOSTO,
If>posPOSTO>0
MessageModal>TheParagraphs_%k%
Endif
Until>k,TheParagraphs_count
Label>end
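For comparison, the same split-then-search idea can be sketched in Python. The function name `lines_containing` is hypothetical, and the sample is an abbreviated version of the text from the question.

```python
def lines_containing(text, needle="POSITIVE to"):
    """Return the lines of text that contain needle, in their original order."""
    # splitlines() handles CRLF, CR and LF line endings.
    return [line for line in text.splitlines() if needle in line]


# Abbreviated version of the sample text from the question.
sample = (
    "POLLENS\r\n"
    "POSITIVE to: Ragweed, Willow\r\n"
    "MOLDS\r\n"
    "POSITIVE to: Stemphylium, Aspergillus Fumigatus\r\n"
    "NEGATIVE to: Stemphylium, Alternaria\r\n"
    "===="
)
hits = lines_containing(sample)
```

As in the script above, the search is a plain substring test, so no regex options are involved.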
Re: Remove blank lines when there are more than one of them
From the Manual:
RegEx is compatible with the Perl 5.10 regular expression syntax using the PCRE library.
Re: Remove blank lines when there are more than one of them
Sometimes I get differences between MS and RegexBuddy, but usually they come down to differences in modifier settings. At the top left in RegexBuddy you have different choices for, e.g., case sensitivity, spacing, dot matches line breaks, etc.
In MS, if unsure, it can help to include the modifier in the search pattern so you know how it will behave.
In the given example, you can add the modifier in the beginning.
(?m-s)^POS.*$
m (m turned on) means match ^ and $ on every line
-s (s turned off) means the DOT does not match line break
Then ^ and $ will match on every line, and the match will never run past a line break.
Then as mentioned greedy/non-greedy can create problems if one is not careful.
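A sketch of what the two modifiers do to ^POS.*$, again using Python's `re` module as a stand-in for illustration (it is not PCRE, and the options are passed as flags here instead of the inline (?m-s) form). The text is an abbreviated version of the sample from the question, with \n line breaks.

```python
import re

# Abbreviated version of the sample text from the question.
text = (
    "POLLENS\n"
    "POSITIVE to: Ragweed, Willow\n"
    "MOLDS\n"
    "POSITIVE to: Stemphylium, Aspergillus Fumigatus\n"
    "NEGATIVE to: Stemphylium, Alternaria\n"
    "===="
)

# m on, s off: ^ and $ match at every line, and the dot stops at a line break.
per_line = re.findall(r"^POS.*$", text, flags=re.MULTILINE)

# m on, s on: the dot crosses line breaks, so the first match runs to the end.
with_dotall = re.findall(r"^POS.*$", text, flags=re.MULTILINE | re.DOTALL)
```

With s switched on, the single match starts at the first POSITIVE line and swallows everything after it, which looks like the "matches all of the text" symptom described in the question.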
Re: Remove blank lines when there are more than one of them
@ hagchr,
I agree. I find this to work well between RegexBuddy and Macro Scheduler.
[edit]- I just noticed you turn off 's'. I'll have to try that.
regex>(?Usmi)pattern,text,,match,nom,0
i= ignore case
m= ^ and $ match start and end of line
s= . matches newline as well
x= Allow spaces and comments
J= Duplicate group names allowed
U= Ungreedy quantifiers
?=set flags.
@ari,
Question mark is used in multiple ways. In this case it is to indicate you are setting flags.
I included some extras. But I find (?Usmi) works well.
"pattern" is whatever pattern you've found that works in RegexBuddy and so on. Just put (?Usmi) in front of it.
Also, check these out. I captured these and use them when I need to.
https://www.cheatography.com/davechild/ ... pressions/
Ok. Well, there's one. I can't find the other I have. But just search "regex cheat sheet". There's plenty out there.
I use MS capture and capture the cheats and then place them all in one image, similar to how they are online.
PepsiHog
Windows 7
PepsiHog. Yep! I drink LOTS of Pepsi (still..in 2024) AND enjoy programming. (That's my little piece of heaven!)
The immensity of the scope of possibilities within Macro Scheduler pushes the user beyond just macros!