A Regular Expression failing on the borders of words

Luciano de Souza
Hello all,

This program compiles, but it gives the wrong output.

program ertext;
{$mode objfpc}{$H+}

uses
  SysUtils, regexpr;

var
  r: TRegExpr;
  i: Integer;

begin
  r := TRegExpr.Create;
  try
    r.Expression := '^(x\s+)*(\([A-E]\))*(\s*.*\.)+(\s+\+.*\b)*(\s+@.*\b)*$';
    if r.Exec('x (A) Write a report. +ABC +DEF @John @Mary') then
    begin
      for i := 1 to r.SubExprMatchCount do
      begin
        WriteLn(r.Match[i]);
        ReadLn;
      end;
    end;
  finally
    r.Free;
  end;
end.

The input string is "x (A) Write a report. +ABC +DEF @John @Mary".

The expected output was:

r.match[1] -> " x" -> OK
r.match[2] -> "(A)" -> OK
r.match[3] -> "Write a report." -> OK
r.match[4] -> " +ABC +DEF" -> Error
r.match[5] -> " @John @Mary" -> Error

Instead of 4 and 5, I got only 4, with the following output:
" +ABC +DEF @John @Mary".

It's true that I do not have much experience with regular expressions,
but this one seems very logical to me. Would someone have any idea
what is wrong?

Best regards,

--
Luciano de Souza
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

Re: A Regular Expression failing on the borders of words

Mattias Gaertner
On Thu, 10 Apr 2014 02:43:41 -0300
luciano de souza <[hidden email]> wrote:

>[...]
> r.expression := '^(x\s+)*(\([A-E]\))*(\s*.*\.)+(\s+\+.*\b)*(\s+@.*\b)*$';
> if r.exec('x (A) Write a report. +ABC +DEF @John @Mary') then

The (\s+\+.*\b) matches '+ABC +DEF @John @Mary'.

Mattias
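
To see this in isolation, here is a minimal sketch that runs only the fourth group against the tail of the input (everything after "report."). The program name greedydemo is made up for illustration, and the sketch assumes TRegExpr's default greedy mode. With a greedy .*, the group should capture the whole tail, including the @ items, because \b is also satisfied at the end of 'Mary'.

```pascal
program greedydemo;
{$mode objfpc}{$H+}

uses
  regexpr;

var
  r: TRegExpr;
begin
  r := TRegExpr.Create;
  try
    // Only the fourth group of the original pattern, tested on its own.
    r.Expression := '(\s+\+.*\b)';
    // The tail of the original input, i.e. everything after 'report.'.
    if r.Exec(' +ABC +DEF @John @Mary') then
      // Brackets make leading and trailing whitespace visible.
      WriteLn('[', r.Match[1], ']');
  finally
    r.Free;
  end;
end.
```

Testing one group at a time like this is often a quick way to find which part of a long expression consumes more than intended.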

Re: A Regular Expression failing on the borders of words

Graeme Geldenhuys-6
In reply to this post by Luciano de Souza

 When in doubt, I always recommend "Explain Regex" - there are many
such online services. Here is one.

http://rick.measham.id.au/paste/explain.pl?regex=%5Cd%2B(.%5Cd%2B)%3F(%5BeE%5D%5Cd%2B)%3F

Simply paste your regex in the box and click the button. It will then
analyse and explain what your regex will do and match.

If still in doubt, I recommend you take a look at RegexMagic
[http://www.regexmagic.com/] or RegexBuddy
[http://www.regexbuddy.com/]. These products make it super easy to
create working regex statements. Even the 30 day trials will help.


Regards,
  - Graeme -

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/



Re: A Regular Expression failing on the borders of words

Luciano de Souza
In reply to this post by Mattias Gaertner
Well, I need to study more. I can't understand what is wrong. The
match starts with whitespace: \s+. Some string follows: .*. And it
ends at a word boundary: \b. So, (\s+.*\b)*, for me, should return:
+ABC +DEF. If "@" is not in that part of the pattern, why was the rest
of the string matched?

My knowledge of regular expressions is limited, so I really need to study more.

--
Luciano de Souza

Re: A Regular Expression failing on the borders of words

Mattias Gaertner
On Thu, 10 Apr 2014 10:05:44 -0300
luciano de souza <[hidden email]> wrote:

> Well, I need to study more. I can't understand what is wrong. The
> match starts with whitespace: \s+. Some string follows: .*. And it
> ends at a word boundary: \b. So, (\s+.*\b)*, for me, should return:
> +ABC +DEF. If "@" is not in that part of the pattern, why was the rest
> of the string matched?

The last (\s+@.*\b)* can be ignored: because of the ()*, it is allowed
to match zero times.
The .* is greedy and matches anything, including '@'.
The \b matches at any word boundary, for example the position after 'Mary'.
So the (\s+\+.*\b) matches '+ABC +DEF @John @Mary'.

> [...]
> >> r.expression := '^(x\s+)*(\([A-E]\))*(\s*.*\.)+(\s+\+.*\b)*(\s+@.*\b)*$';
> >> if r.exec('x (A) Write a report. +ABC +DEF @John @Mary') then

Mattias
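
One possible way to keep the fourth group from running into the @ items is to replace its .* with a negated character class. The sketch below is only an illustration under stated assumptions: the program name fixdemo is made up, the [^@]* change assumes that the "+" items never contain an "@" character, and the loop bound of 5 assumes the pattern has exactly five groups. Under ordinary backtracking rules, [^@]* stops before the first "@", and \b then moves the end of the group back to the end of 'DEF', leaving the @ items for the fifth group.

```pascal
program fixdemo;
{$mode objfpc}{$H+}

uses
  regexpr;

var
  r: TRegExpr;
  i: Integer;
begin
  r := TRegExpr.Create;
  try
    // Group 4 uses [^@]* instead of .* so it cannot consume the '@' items.
    r.Expression :=
      '^(x\s+)*(\([A-E]\))*(\s*.*\.)+(\s+\+[^@]*\b)*(\s+@.*\b)*$';
    if r.Exec('x (A) Write a report. +ABC +DEF @John @Mary') then
      // Loop over a fixed group count so that empty groups are printed too.
      for i := 1 to 5 do
        WriteLn(i, ': [', r.Match[i], ']');
  finally
    r.Free;
  end;
end.
```

Printing every group index with brackets around the text makes an empty group show up as [], which is easier to spot than a missing line.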