StrUtils.RomanToInt oddities

classic Classic list List threaded Threaded
48 messages Options
123
Reply | Threaded
Open this post in threaded view
|

StrUtils.RomanToInt oddities

Bart-48
Hi,

RomanToInt acceps rather ludicrous values:

RomanToInt('MDCLXVIVXLDM') = 2209
RomanToInt('IIIIM') = 1002  //calculated as 3 + (1000-1)

Both examples represent invalid roman numbers by any standard.
Also I do not think Roman numerals can be negative...

Feature or bug?

Bart
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Reinier Olislagers
On 18/09/2013 16:19, Bart wrote:

> RomanToInt acceps rather ludicrous values:
>
> RomanToInt('MDCLXVIVXLDM') = 2209
> RomanToInt('IIIIM') = 1002  //calculated as 3 + (1000-1)
>
> Both examples represent invalid roman numbers by any standard.
> Also I do not think Roman numerals can be negative...
>
> Feature or bug?
>

1. I'll respond with the expected answer: "What does Delphi do" (if it
has this function?
2. I wouldn't be bothered by negative numbers... even the Romans used to
have negative posessions (= debts) and they didn't have double entry
bookkeeping yet ;) Also, the documentation [1] mentions it as a feature.
3. The rest... well, depending on 1. I'd agree it's a bug.
[1] mentions "Invalid characters are dropped from S"... which I think is
a bit strange (S is a const) but assuming it means "Invalid characters
in S are ignored" it means somebody thought checking the input for valid
Roman numbers was a good idea. The question however becomes "what is the
algorithm for deciding invalid characters" which IMO will become a mess
very quickly. Much better to just consider the entire input as invalid.

[1] http://lazarus-ccr.sourceforge.net/docs/rtl/strutils/romantoint.html
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Bart-48
On 9/20/13, Reinier Olislagers <[hidden email]> wrote:

> 1. I'll respond with the expected answer: "What does Delphi do" (if it
> has this function?

D7 PE does not have it.

> 2. I wouldn't be bothered by negative numbers... even the Romans used to
> have negative posessions (= debts) and they didn't have double entry
> bookkeeping yet ;) Also, the documentation [1] mentions it as a feature.

If I mention in documentationn that result is unpredicted for any
input e.g. for function StrToInt(S), because ATMI implemented it like
Result := StringOfChar(Char(Random(256)),Random(256)), then this would
also not considered to be a bug, just it because does what the docs
say?

> 3. The rest... well, depending on 1. I'd agree it's a bug.
> [1] mentions "Invalid characters are dropped from S"... which I think is
> a bit strange (S is a const) but assuming it means "Invalid characters
> in S are ignored" it means somebody thought checking the input for valid
> Roman numbers was a good idea. The question however becomes "what is the
> algorithm for deciding invalid characters" which IMO will become a mess
> very quickly. Much better to just consider the entire input as invalid.

Any invalid char in S will result in Result being 0;
So doc is wrong about that.

Bart
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Bart-48
On 9/20/13, Bart <[hidden email]> wrote:

> Any invalid char in S will result in Result being 0;
> So doc is wrong about that.

Reported that in bugtracker: http://bugs.freepascal.org/view.php?id=25061

Bart
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Reinier Olislagers
In reply to this post by Bart-48
On 20/09/2013 18:33, Bart wrote:
> On 9/20/13, Reinier Olislagers <[hidden email]> wrote:
>> 2. I wouldn't be bothered by negative numbers... even the Romans used to
>> have negative posessions (= debts) and they didn't have double entry
>> bookkeeping yet ;) Also, the documentation [1] mentions it as a feature.
>
> If I mention in documentationn that result is unpredicted for any
> input [snip irrelevant part] then this would
> also not considered to be a bug, just it because does what the docs
> say?

Well, if you want a black and white answer: yes, because the
implementation follows the documentation/specifcations.

However, as I indicated in the rest of my post, I do think the
implementation(+documentation) needs to be changed for other cases. I
just don't see that many problems by allowing negative numbers.

>> 3. The rest... well, depending on 1. I'd agree it's a bug.
>> [1] mentions "Invalid characters are dropped from S"... which I think is
>> a bit strange (S is a const) but assuming it means "Invalid characters
>> in S are ignored" it means somebody thought checking the input for valid
>> Roman numbers was a good idea. The question however becomes "what is the
>> algorithm for deciding invalid characters" which IMO will become a mess
>> very quickly. Much better to just consider the entire input as invalid.
>
> Any invalid char in S will result in Result being 0;
> So doc is wrong about that.

Seems we agree here:
1. Changing docs to mention any invalid character will lead to result
being 0. No idea what the policy is about needing to raise exceptions in
case of errors - in this particular case, returning 0 should actually be
enough by itself as there is no Roman numeral that stands for the value 0...
2. The other cases you mentioned in your OP are bugs, not features.

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Bart-48
On 9/20/13, Reinier Olislagers <[hidden email]> wrote:

> 2. The other cases you mentioned in your OP are bugs, not features.

Unless the Greek do the same, someone with recent Delphi should test.
If that's the case we're doomed to follw that.

Bart
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Bart-48
And please, before anyone gets offended by my postings in this thread:

- I just stubled on this whils doing a kind WTF implementation of
something like this (and at that time I didn't even knew we had
StrUtils.RomanToInt).
- I don't think it is the end of the world just because the StrUtils
version behaves different than I expected it to.
- I do not want to pretend my interpretation of "valid Roman numeral"
is the absolute truth.

I was just curious as to the why?

I have my own interpretation (in code), so if fpc's stays as is, no
problem whatsoever.
It (my version) is LGPL, so anyone can use and or contribute.

Sometimes I just like having these kind of discussions...

Bart
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Bart-48
In reply to this post by Reinier Olislagers
On 9/20/13, Reinier Olislagers <[hidden email]> wrote:

> The question however becomes "what is the
> algorithm for deciding invalid characters" which IMO will become a mess
> very quickly. Much better to just consider the entire input as invalid.
>

Here's my implementation:

program test;

{$mode objfpc}
{$H+}

uses
  SysUtils, StrUtils;


function TryRomanToInt(S: String; out N: Integer): Boolean;
var
  i, Len: Integer;
  Terminated: Boolean;
begin
  Result := (False);
  S := UpperCase(S);  //don't use AnsiUpperCase please
  Len := Length(S);
  if (Len = 0) then Exit;
  i := 1;
  N := 0;
  Terminated := False;
  //leading M's
  while (i <= Len) and (S[i] = 'M') do
  begin
    //writeln('TryRomanToInt: Found 1000');
    Inc(i);
    N := N + 1000;
  end;
  //then CM or or CD or D or (C, CC, CCC, CCCC)
  if (i <= Len) and (S[i] = 'D') then
  begin
    //writeln('TryRomanToInt: Found 500');
    Inc(i);
    N := N + 500;
  end
  else if (i + 1 <= Len) and (S[i] = 'C') then
  begin
    if (S[i+1] = 'M') then
    begin
      //writeln('TryRomanToInt: Found 900');
      Inc(i,2);
      N := N + 900;
    end
    else if (S[i+1] = 'D') then
    begin
      //writeln('TryRomanToInt: Found 400');
      Inc(i,2);
      N := N + 400;
    end;
  end ;
  //next max 4 C's
  if (i <= Len) and (S[i] = 'C') then
  begin
    //find max 4 C's
    //writeln('TryRomanToInt: Found 100');
    Inc(i);
    N := N + 100;
    if (i <= Len) and (S[i] = 'C') then
    begin
      //writeln('TryRomanToInt: Found 100');
      Inc(i);
      N := N + 100;
    end;
    if (i <= Len) and (S[i] = 'C') then
    begin
      //writeln('TryRomanToInt: Found 100');
      Inc(i);
      N := N + 100;
    end;
    if (i <= Len) and (S[i] = 'C') then
    begin
      //writeln('TryRomanToInt: Found 100');
      Inc(i);
      N := N + 100;
    end;
  end;

  //then XC or XL
  if (i + 1 <= Len) and (S[i] = 'X') then
  begin
    if (S[i+1] = 'C') then
    begin
      //writeln('TryRomanToInt: Found 90');
      Inc(i,2);
      N := N + 90;
    end
    else if  (S[i+1] = 'L') then
    begin
      //writeln('TryRomanToInt: Found 40');
      Inc(i,2);
      N := N + 40;
    end;
  end;

  //then L
  if (i <= Len) and (S[i] = 'L') then
  begin
    //writeln('TryRomanToInt: Found 50');
    Inc(i);
    N := N + 50;
  end;

  //then (X, xx, xxx, xxxx)
  if (i <= Len) and (S[i] = 'X') then
  begin
    //find max 4 X's
    //writeln('TryRomanToInt: Found 10');
    Inc(i);
    N := N + 10;
    if (i <= Len) and (S[i] = 'X') then
    begin
      //writeln('TryRomanToInt: Found 10');
      Inc(i);
      N := N + 10;
    end;
    if (i <= Len) and (S[i] = 'X') then
    begin
      //writeln('TryRomanToInt: Found 10');
      Inc(i);
      N := N + 10;
    end;
    if (i <= Len) and (S[i] = 'X') then
    begin
      //writeln('TryRomanToInt: Found 10');
      Inc(i);
      N := N + 10;
    end;
  end;

  //then IX or IV
  if (i + 1 <= Len) and (S[i] = 'I') then
  begin
    if (S[i+1] = 'X') then
    begin
      Terminated := (True);
      //writeln('TryRomanToInt: Found 9');
      Inc(i,2);
      N := N + 9;
    end
    else if (S[i+1] = 'V') then
    begin
      Terminated := (True);
      //writeln('TryRomanToInt: Found 4');
      Inc(i,2);
      N := N + 4;
    end;
  end;

  //then V
  if (not Terminated) and (i <= Len) and (S[i] = 'V') then
  begin
    //writeln('TryRomanToInt: Found 5');
    Inc(i);
    N := N + 5;
  end;


  //then I
  if (not Terminated) and (i <= Len) and (S[i] = 'I') then
  begin
    Terminated := (True);
    //writeln('TryRomanToInt: Found 1');
    Inc(i);
    N := N + 1;
    //Find max 3 closing I's
    if (i <= Len) and (S[i] = 'I') then
    begin
      //writeln('TryRomanToInt: Found 1');
      Inc(i);
      N := N + 1;
    end;
    if (i <= Len) and (S[i] = 'I') then
    begin
      //writeln('TryRomanToInt: Found 1');
      Inc(i);
      N := N + 1;
    end;
    if (i <= Len) and (S[i] = 'I') then
    begin
      //writeln('TryRomanToInt: Found 1');
      Inc(i);
      N := N + 1;
    end;
  end;

  //writeln('TryRomanToInt: Len = ',Len,' i = ',i);
  Result := (i > Len);
  //if Result then writeln('TryRomanToInt: N = ',N);

end;

var
  S: String;
  N1, N2: Integer;
  B: Boolean;

begin
  repeat
    write('Enter Roman numeral: ');
    readln(S);
    B := TryRomanToInt(S, N1);
    if B then
      write('TryRomanToInt(''',S,''') -> ',N1)
    else
      write('TryRomanToInt(''',S,''') FAIL');
    writeln;
    N2 := StrUtils.RomanToInt(S);
    writeln('StrUtils.RomanToInt(''',S,''') = ',N2);
    if B and (N1 <> N2) then writeln('StrUtils.RomanToInt <> TryRomanToInt');
    writeln;
  until S = '';
end.

Bart
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

el_es
On 20/09/13 19:49, Bart wrote:
> On 9/20/13, Reinier Olislagers <[hidden email]> wrote:
>
>> The question however becomes "what is the
>> algorithm for deciding invalid characters" which IMO will become a mess
>> very quickly. Much better to just consider the entire input as invalid.
>>
>
> Here's my implementation:

Here is how I would write it ;)

function TryRomanToInt(AInput:String; out AResult: integer):boolean;
var i, Len, N, Np1 : integer;

  function TryRomanCharToInt(AIn: Char; out AOut: integer): boolean;
  begin
    case AIn of
      'M' : AOut := 1000;
      'D' : AOut := 500;
      'C' : AOut := 100;
      'L' : AOut := 50;
      'X' : AOut := 10;
      'V' : AOut := 5;
      'I' : AOut := 1;
    else
      AOut := 0;
    end;
    Result := (AOut > 0);
  end;

begin
  // if it looks like shameless c&p, it's because it is ;)
  Result := False;
  AInput := UpperCase(AInput);  //don't use AnsiUpperCase please
  Len := Length(AInput);
  if (Len = 0) then Exit;
  //
  i := 1;
  for i := 1 to Len-1 do
  begin
    if not TryRomanCharToInt(AInput[i], N) then
    begin
      AResult := -1; // invalidate everything
      exit;
    // writeln('Not a valid roman numeral at position ',i);
    end;

    if (i > Len -1) then
      begin
        if not TryRomanCharToInt(AInput[i+1], Np1) then
        begin
          AResult := -1; // invalidate everithing
          exit;
        // writeln('Not a valid roman numeral at position ',i+1);
        end;
         
        // according to wikipedia, wonder if these are hard or soft rules thou...
        if not (((N = 100) and (Np1 > 100)) or  // C can be placed before L or M
                ((N = 10) and (Np1 > 10) and (Np1 <= 100)) or // X can be placed before L or C
                ((N = 1) and (Np1 > 1) and (Np1 <= 10)))  // I can be placed before V and X
        then
          begin
            AResult := -1; // invalidate everithing: catches MDCLXVIVXLDM
            exit;
            // writeln('Not a valid roman numeral combination at position ',i, ' and ',i+1);
          end;
 

        if N >= Np1 then AResult := AResult + N
        else AResult := AResult - N;
       
      end;
    else // i = Len-1 = last char we just add (consider : MCM : add 1000, sub 100, add 1000 = 1900)
     begin
       AResult := AResult + N;
     end;
  end; // for
  // if the above, after all characters are through, has resulted in 0 or less,
  // we invalidate everything at the end (consider : CMLM, IIIM )
 
  Result := AResult > 0;
  if not Result then AResult := 0;
end;

(only mind-compiled ;) tests welcome ;) )

-L

>
> program test;
>
> {$mode objfpc}
> {$H+}
>
> uses
>   SysUtils, StrUtils;
>
>
> function TryRomanToInt(S: String; out N: Integer): Boolean;
> var
>   i, Len: Integer;
>   Terminated: Boolean;
> begin
>   Result := (False);
>   S := UpperCase(S);  //don't use AnsiUpperCase please
>   Len := Length(S);
>   if (Len = 0) then Exit;
>   i := 1;
>   N := 0;
>   Terminated := False;
>   //leading M's
>   while (i <= Len) and (S[i] = 'M') do
>   begin
>     //writeln('TryRomanToInt: Found 1000');
>     Inc(i);
>     N := N + 1000;
>   end;
>   //then CM or or CD or D or (C, CC, CCC, CCCC)
>   if (i <= Len) and (S[i] = 'D') then
>   begin
>     //writeln('TryRomanToInt: Found 500');
>     Inc(i);
>     N := N + 500;
>   end
>   else if (i + 1 <= Len) and (S[i] = 'C') then
>   begin
>     if (S[i+1] = 'M') then
>     begin
>       //writeln('TryRomanToInt: Found 900');
>       Inc(i,2);
>       N := N + 900;
>     end
>     else if (S[i+1] = 'D') then
>     begin
>       //writeln('TryRomanToInt: Found 400');
>       Inc(i,2);
>       N := N + 400;
>     end;
>   end ;
>   //next max 4 C's
>   if (i <= Len) and (S[i] = 'C') then
>   begin
>     //find max 4 C's
>     //writeln('TryRomanToInt: Found 100');
>     Inc(i);
>     N := N + 100;
>     if (i <= Len) and (S[i] = 'C') then
>     begin
>       //writeln('TryRomanToInt: Found 100');
>       Inc(i);
>       N := N + 100;
>     end;
>     if (i <= Len) and (S[i] = 'C') then
>     begin
>       //writeln('TryRomanToInt: Found 100');
>       Inc(i);
>       N := N + 100;
>     end;
>     if (i <= Len) and (S[i] = 'C') then
>     begin
>       //writeln('TryRomanToInt: Found 100');
>       Inc(i);
>       N := N + 100;
>     end;
>   end;
>
>   //then XC or XL
>   if (i + 1 <= Len) and (S[i] = 'X') then
>   begin
>     if (S[i+1] = 'C') then
>     begin
>       //writeln('TryRomanToInt: Found 90');
>       Inc(i,2);
>       N := N + 90;
>     end
>     else if  (S[i+1] = 'L') then
>     begin
>       //writeln('TryRomanToInt: Found 40');
>       Inc(i,2);
>       N := N + 40;
>     end;
>   end;
>
>   //then L
>   if (i <= Len) and (S[i] = 'L') then
>   begin
>     //writeln('TryRomanToInt: Found 50');
>     Inc(i);
>     N := N + 50;
>   end;
>
>   //then (X, xx, xxx, xxxx)
>   if (i <= Len) and (S[i] = 'X') then
>   begin
>     //find max 4 X's
>     //writeln('TryRomanToInt: Found 10');
>     Inc(i);
>     N := N + 10;
>     if (i <= Len) and (S[i] = 'X') then
>     begin
>       //writeln('TryRomanToInt: Found 10');
>       Inc(i);
>       N := N + 10;
>     end;
>     if (i <= Len) and (S[i] = 'X') then
>     begin
>       //writeln('TryRomanToInt: Found 10');
>       Inc(i);
>       N := N + 10;
>     end;
>     if (i <= Len) and (S[i] = 'X') then
>     begin
>       //writeln('TryRomanToInt: Found 10');
>       Inc(i);
>       N := N + 10;
>     end;
>   end;
>
>   //then IX or IV
>   if (i + 1 <= Len) and (S[i] = 'I') then
>   begin
>     if (S[i+1] = 'X') then
>     begin
>       Terminated := (True);
>       //writeln('TryRomanToInt: Found 9');
>       Inc(i,2);
>       N := N + 9;
>     end
>     else if (S[i+1] = 'V') then
>     begin
>       Terminated := (True);
>       //writeln('TryRomanToInt: Found 4');
>       Inc(i,2);
>       N := N + 4;
>     end;
>   end;
>
>   //then V
>   if (not Terminated) and (i <= Len) and (S[i] = 'V') then
>   begin
>     //writeln('TryRomanToInt: Found 5');
>     Inc(i);
>     N := N + 5;
>   end;
>
>
>   //then I
>   if (not Terminated) and (i <= Len) and (S[i] = 'I') then
>   begin
>     Terminated := (True);
>     //writeln('TryRomanToInt: Found 1');
>     Inc(i);
>     N := N + 1;
>     //Find max 3 closing I's
>     if (i <= Len) and (S[i] = 'I') then
>     begin
>       //writeln('TryRomanToInt: Found 1');
>       Inc(i);
>       N := N + 1;
>     end;
>     if (i <= Len) and (S[i] = 'I') then
>     begin
>       //writeln('TryRomanToInt: Found 1');
>       Inc(i);
>       N := N + 1;
>     end;
>     if (i <= Len) and (S[i] = 'I') then
>     begin
>       //writeln('TryRomanToInt: Found 1');
>       Inc(i);
>       N := N + 1;
>     end;
>   end;
>
>   //writeln('TryRomanToInt: Len = ',Len,' i = ',i);
>   Result := (i > Len);
>   //if Result then writeln('TryRomanToInt: N = ',N);
>
> end;
>
> var
>   S: String;
>   N1, N2: Integer;
>   B: Boolean;
>
> begin
>   repeat
>     write('Enter Roman numeral: ');
>     readln(S);
>     B := TryRomanToInt(S, N1);
>     if B then
>       write('TryRomanToInt(''',S,''') -> ',N1)
>     else
>       write('TryRomanToInt(''',S,''') FAIL');
>     writeln;
>     N2 := StrUtils.RomanToInt(S);
>     writeln('StrUtils.RomanToInt(''',S,''') = ',N2);
>     if B and (N1 <> N2) then writeln('StrUtils.RomanToInt <> TryRomanToInt');
>     writeln;
>   until S = '';
> end.
>
> Bart
> _______________________________________________
> fpc-pascal maillist  -  [hidden email]
> http://lists.freepascal.org/mailman/listinfo/fpc-pascal
>


_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Bart-48
On 9/23/13, Lukasz Sokol <[hidden email]> wrote:

> function TryRomanToInt(AInput:String; out AResult: integer):boolean;
> var i, Len, N, Np1 : integer;
[snip]
>
>
>         if N >= Np1 then AResult := AResult + N
>         else AResult := AResult - N;
>
>       end;
>     else // i = Len-1 = last char we just add (consider : MCM : add 1000,

Compiler doesn't like the else here. I removed the ; after the end 1 line above.

> sub 100, add 1000 = 1900)
>      begin
>        AResult := AResult + N;
>      end;
>   end; // for
>   // if the above, after all characters are through, has resulted in 0 or
> less,
>   // we invalidate everything at the end (consider : CMLM, IIIM )
>
>   Result := AResult > 0;
>   if not Result then AResult := 0;
> end;
>
> (only mind-compiled ;) tests welcome ;) )
>

It produces "odd" output though:

C:\Users\Bart\LazarusProjecten\ConsoleProjecten>test
Enter Roman numeral: LL  //should be illegal
TryRomanToInt('LL') -> 50
StrUtils.RomanToInt('LL') = 100

Enter Roman numeral: MM
TryRomanToInt('MM') -> 1050  //obviously should be 2000
StrUtils.RomanToInt('MM') = 2000

Enter Roman numeral: II
TryRomanToInt('II') -> 1051  // nice one!
StrUtils.RomanToInt('II') = 2

So, it's a little bit flawed.

Anyhow, let's not focus upon what the new code should be.
The question was: is current behaviour (accepting IIMIIC etc.) a bug or not.

Bart
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Alberto Narduzzi
In reply to this post by Bart-48
Premise: I didn't go through your entire implementation, thou' at first
sight looks much better than the current one.

I know roman numerals since I was 8 or 9 yo, that makes a quick and dirt
result of about 35 years.

Thing is:

1. in the roman numerals, not all the digits can be subtracted from the
others
2. no more than three same roman numerals can appear in a row.

so that XM is as invalid as CCCC.
(although this amenity can be seen on some - in my opinion poor -
wristwatches' face that uses roman numerals, for the 4h, showing IIII
instead of the correct IV, but this is another story)

in roman numerals we have only these digits:

I(1)   V(5)
X(10)  L(50)
C(100) D(500)
M(1000) (sorry, no digit for the expected progression, for 5000)

the only digits that can be subtracted are I, X and C (only those on the
left side of the "table" above), and they can only from the 2 digits
immediately following them in the system, so that:

I can be subtracted only from V and X
X can be subtracted only from L and C
C can be subtracted only from D and M

so that 45 cannot be written as VL (50 - 5, as one might suspect) but
must be written as XLV, or to better understand, as (XL)V = 40 + 5

The maximum roman numeral that can be represented is 4999, that is NOT
MMMIM, but instead: MMM(CM)(XC)(IX)
1000+1000+1000+(1000-900)+(100-10)+(10-1)

After this, roman numerals always begin with bigger numbers to sum up,
and proceed to smaller, so you cannot start adding up 10, then 500, then
again 1; like XDI (whether it be intended to be 491 or 511 - just no way)

I did an implementation some time ago, for pure fun, and provided I find
it out (or maybe I can just rewrite it, on the basis of this knowledge)
I will be - repository allowing ;-) - wery happy to contribute with my
2c for it.


Cheers, A.


On 20/09/13 21:49, Bart wrote:

> On 9/20/13, Reinier Olislagers<[hidden email]>  wrote:
>
>> The question however becomes "what is the
>> algorithm for deciding invalid characters" which IMO will become a mess
>> very quickly. Much better to just consider the entire input as invalid.
>>
>
> Here's my implementation:
>
> program test;
>
> {$mode objfpc}
> {$H+}
>
> uses
>    SysUtils, StrUtils;
>
>
> function TryRomanToInt(S: String; out N: Integer): Boolean;
> var
>    i, Len: Integer;
>    Terminated: Boolean;
> begin
>    Result := (False);
>    S := UpperCase(S);  //don't use AnsiUpperCase please
>    Len := Length(S);
>    if (Len = 0) then Exit;
>    i := 1;
>    N := 0;
>    Terminated := False;
>    //leading M's
>    while (i<= Len) and (S[i] = 'M') do
>    begin
>      //writeln('TryRomanToInt: Found 1000');
>      Inc(i);
>      N := N + 1000;
>    end;
>    //then CM or or CD or D or (C, CC, CCC, CCCC)
>    if (i<= Len) and (S[i] = 'D') then
>    begin
>      //writeln('TryRomanToInt: Found 500');
>      Inc(i);
>      N := N + 500;
>    end
>    else if (i + 1<= Len) and (S[i] = 'C') then
>    begin
>      if (S[i+1] = 'M') then
>      begin
>        //writeln('TryRomanToInt: Found 900');
>        Inc(i,2);
>        N := N + 900;
>      end
>      else if (S[i+1] = 'D') then
>      begin
>        //writeln('TryRomanToInt: Found 400');
>        Inc(i,2);
>        N := N + 400;
>      end;
>    end ;
>    //next max 4 C's
>    if (i<= Len) and (S[i] = 'C') then
>    begin
>      //find max 4 C's
>      //writeln('TryRomanToInt: Found 100');
>      Inc(i);
>      N := N + 100;
>      if (i<= Len) and (S[i] = 'C') then
>      begin
>        //writeln('TryRomanToInt: Found 100');
>        Inc(i);
>        N := N + 100;
>      end;
>      if (i<= Len) and (S[i] = 'C') then
>      begin
>        //writeln('TryRomanToInt: Found 100');
>        Inc(i);
>        N := N + 100;
>      end;
>      if (i<= Len) and (S[i] = 'C') then
>      begin
>        //writeln('TryRomanToInt: Found 100');
>        Inc(i);
>        N := N + 100;
>      end;
>    end;
>
>    //then XC or XL
>    if (i + 1<= Len) and (S[i] = 'X') then
>    begin
>      if (S[i+1] = 'C') then
>      begin
>        //writeln('TryRomanToInt: Found 90');
>        Inc(i,2);
>        N := N + 90;
>      end
>      else if  (S[i+1] = 'L') then
>      begin
>        //writeln('TryRomanToInt: Found 40');
>        Inc(i,2);
>        N := N + 40;
>      end;
>    end;
>
>    //then L
>    if (i<= Len) and (S[i] = 'L') then
>    begin
>      //writeln('TryRomanToInt: Found 50');
>      Inc(i);
>      N := N + 50;
>    end;
>
>    //then (X, xx, xxx, xxxx)
>    if (i<= Len) and (S[i] = 'X') then
>    begin
>      //find max 4 X's
>      //writeln('TryRomanToInt: Found 10');
>      Inc(i);
>      N := N + 10;
>      if (i<= Len) and (S[i] = 'X') then
>      begin
>        //writeln('TryRomanToInt: Found 10');
>        Inc(i);
>        N := N + 10;
>      end;
>      if (i<= Len) and (S[i] = 'X') then
>      begin
>        //writeln('TryRomanToInt: Found 10');
>        Inc(i);
>        N := N + 10;
>      end;
>      if (i<= Len) and (S[i] = 'X') then
>      begin
>        //writeln('TryRomanToInt: Found 10');
>        Inc(i);
>        N := N + 10;
>      end;
>    end;
>
>    //then IX or IV
>    if (i + 1<= Len) and (S[i] = 'I') then
>    begin
>      if (S[i+1] = 'X') then
>      begin
>        Terminated := (True);
>        //writeln('TryRomanToInt: Found 9');
>        Inc(i,2);
>        N := N + 9;
>      end
>      else if (S[i+1] = 'V') then
>      begin
>        Terminated := (True);
>        //writeln('TryRomanToInt: Found 4');
>        Inc(i,2);
>        N := N + 4;
>      end;
>    end;
>
>    //then V
>    if (not Terminated) and (i<= Len) and (S[i] = 'V') then
>    begin
>      //writeln('TryRomanToInt: Found 5');
>      Inc(i);
>      N := N + 5;
>    end;
>
>
>    //then I
>    if (not Terminated) and (i<= Len) and (S[i] = 'I') then
>    begin
>      Terminated := (True);
>      //writeln('TryRomanToInt: Found 1');
>      Inc(i);
>      N := N + 1;
>      //Find max 3 closing I's
>      if (i<= Len) and (S[i] = 'I') then
>      begin
>        //writeln('TryRomanToInt: Found 1');
>        Inc(i);
>        N := N + 1;
>      end;
>      if (i<= Len) and (S[i] = 'I') then
>      begin
>        //writeln('TryRomanToInt: Found 1');
>        Inc(i);
>        N := N + 1;
>      end;
>      if (i<= Len) and (S[i] = 'I') then
>      begin
>        //writeln('TryRomanToInt: Found 1');
>        Inc(i);
>        N := N + 1;
>      end;
>    end;
>
>    //writeln('TryRomanToInt: Len = ',Len,' i = ',i);
>    Result := (i>  Len);
>    //if Result then writeln('TryRomanToInt: N = ',N);
>
> end;
>
> var
>    S: String;
>    N1, N2: Integer;
>    B: Boolean;
>
> begin
>    repeat
>      write('Enter Roman numeral: ');
>      readln(S);
>      B := TryRomanToInt(S, N1);
>      if B then
>        write('TryRomanToInt(''',S,''') ->  ',N1)
>      else
>        write('TryRomanToInt(''',S,''') FAIL');
>      writeln;
>      N2 := StrUtils.RomanToInt(S);
>      writeln('StrUtils.RomanToInt(''',S,''') = ',N2);
>      if B and (N1<>  N2) then writeln('StrUtils.RomanToInt<>  TryRomanToInt');
>      writeln;
>    until S = '';
> end.
>
> Bart
> _______________________________________________
> fpc-pascal maillist  -  [hidden email]
> http://lists.freepascal.org/mailman/listinfo/fpc-pascal
>

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Alberto Narduzzi
Ooops,

I also forgot to mention that only I, X, C and M can appear up to three
times in a row. V, L and D does not; they can only once.


Cheers again, A.
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Sven Barth-2
On 23.09.2013 21:17, Alberto Narduzzi wrote:
> Ooops,
>
> I also forgot to mention that only I, X, C and M can appear up to three
> times in a row. V, L and D does not; they can only once.

Are you sure regarding M considering there is no symbol for 5000? Or
didn't Romans count to more than 5000 - 1?

Regards,
sven

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Bart-48
In reply to this post by Alberto Narduzzi
On 9/23/13, Alberto Narduzzi <[hidden email]> wrote:

> Thing is:
>
> 1. in the roman numerals, not all the digits can be subtracted from the
> others
> 2. no more than three same roman numerals can appear in a row.

My implementation is a little more relaxed as to rule 2.
It is more often "violated" than rule 1.
Typically IIII is not al that uncommon.


> I did an implementation some time ago, for pure fun, and provided I find
> it out (or maybe I can just rewrite it, on the basis of this knowledge)
> I will be - repository allowing ;-) - wery happy to contribute with my
> 2c for it.

By all means.
The princial question must be answered first though.

Bart
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Bart-48
In reply to this post by Sven Barth-2
On 9/23/13, Sven Barth <[hidden email]> wrote:

> Are you sure regarding M considering there is no symbol for 5000? Or
> didn't Romans count to more than 5000 - 1?
>

There are numerous extensions upon the scheme.
In later times adding horizontal lines above or under a Roman numeral
meant multiply by 1000.

I don't believe though Romans knew negative numbers.
But I'm certainly not an expert on the matter.

Bart
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Frederic Da Vitoria
In reply to this post by Bart-48
2013/9/23 Bart <[hidden email]>
The question was: is current behaviour (accepting IIMIIC etc.) a bug or not.

What about making an option of it?

Anyhow, if the function accepts invalid combinations, what should it return? For some, the answer would be obvious (IIII), but some combinations are indeed ambiguous (IIIIM) So that we maybe could accept unambiguous invalid combinations, but I don't see how to accept ambiguous ones.

--
Frederic Da Vitoria
(davitof)

Membre de l'April - « promouvoir et défendre le logiciel libre » - http://www.april.org

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Bart-48
On 9/23/13, Frederic Da Vitoria <[hidden email]> wrote:

> What about making an option of it?

It's up to the fpc devels.

> Anyhow, if the function accepts invalid combinations, what should it
> return? For some, the answer would be obvious (IIII), but some combinations
> are indeed ambiguous (IIIIM) So that we maybe could accept unambiguous
> invalid combinations, but I don't see how to accept ambiguous ones.

If it accepts, then it should apply the subtraction rule: only the
(one) numeral left to the other (bigger one) can be subtracted.
So IIIM would be III + IM = 3 + 1000-1 = 1002 (and not 1000-3 = 997)
There should be no ambiguity there, it's only a pain to watch it.

Bart
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Alberto Narduzzi
>> What about making an option of it?

> If it accepts, then it should apply the subtraction rule: only the
> (one) numeral left to the other (bigger one) can be subtracted.
> So IIIM would be III + IM = 3 + 1000-1 = 1002 (and not 1000-3 = 997)
> There should be no ambiguity there, it's only a pain to watch it.

as I said, roman numerals what they are.

If you want to invent something new, nowadays, extending the roman
numerals to your like, well (growl!) feel free to do it, but do not
pretend to call them so.

IIIM has no meaning in roman numerals. FULL STOP. There is no such logic
as you depict it. 1002 = MII, and 997 = CMXCVII.
All the rest is pure bull**** in terms of roman numerals by the very name.

I begin to have a feeling that here somebody is merely talking nonsense;
say, like pretending to use the digit 2 in the binary system...

This is no decimal notation logic; things are as they are, so do not
pretend to give roman numerals a value according to their position in
the number itself; it is pure NONSENSE here.

Or maybe you have no clue about, so do not litter Pascal with something
you don't know anything about, which will only make you just ridicule;
possibly (hope not, thou') by the very Embarcadero/Delphi guys... which
is not so good for Freepascal, really.

Ah, BTW, what Delphi does (as I read maybe on the very first reply to
the topic, sic), here, makes no sense too. Roman numerals are what they
are, irrelevantly of what somebody *may think they are*.


A.
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Alberto Narduzzi
In reply to this post by Frederic Da Vitoria
> What about making an option of it?
>
> Anyhow, if the function accepts invalid combinations, what should it
> return? For some, the answer would be obvious (IIII), but some
> combinations are indeed ambiguous (IIIIM) So that we maybe could accept
> unambiguous invalid combinations, but I don't see how to accept
> ambiguous ones.

they are not ambiguous, they are simply INVALID.
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: StrUtils.RomanToInt oddities

Alberto Narduzzi
In reply to this post by Bart-48
>> Are you sure regarding M considering there is no symbol for 5000? Or
>> didn't Romans count to more than 5000 - 1?

yes I am. as 5000 - 1 would need to be written MMMMCMXCIX, which has the
4-M-in-a-row, that is invalid, as a maximum of three is allowed.

Yes, probably ancient Romans had no need to count more that 3999
Sestertiae ;-)

> I don't believe though Romans knew negative numbers.
> But I'm certainly not an expert on the matter.

never though about, but if the only matter is DEBT, then just write
positive numbers under the DEBT column, and everybody shall be happy
too, provided they know the meaning of such a column ;-)

P.S.
Have no clue of roman arithmetics, thou', which looks a little more
weird to implement, or just to think about :-O


A.
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
123