Char, WideChar, String and WideString in FPC/Lazarus

Char, WideChar, String and WideString in FPC/Lazarus

Borut Maricic
I would like to ask some questions regarding the following:

>>
A NOTE has been added to this issue.
======================================================================
http://www.freepascal.org/mantis/view.php?id=7122
======================================================================
Date Submitted:             2006-07-06 15:45 CEST
Last Modified:              2006-07-06 15:52 CEST
======================================================================
Summary:                    Array with char index could not compile
Description:
An array that uses Char for its index could not compile when index is a
cyrillic character. Perhaps not working with all characters except
English.
See program source.<<

What is the exact definition of the type Char and String in FPC?

Please confirm or deny the following statement: "A variable of type String
may contain a UTF-8 encoded string, i.e. a programmer using UTF-8 in
strings should keep in mind the consequences (no 1:1 mapping between
graphemes and bytes, i.e. iterating over bytes is NOT the same as iterating
over the graphemes encoded in the string bytes)."

I am not new to Unicode etc. and would like to use this opportunity to
clarify these issues. Thanks.



_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal

Re: Char, WideChar, String and WideString in FPC/Lazarus

Jonas Maebe-2

On 6 jul 2006, at 16:29, Borut Maricic wrote:

> What is the exact definition of the type Char and String in FPC?

A char is a 1-byte value which can contain the values chr(0) through
chr(255). A (short)string is a length byte followed by up to 255 chars.
No assumptions are made about encodings.
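
To make the layout described above concrete, here is a minimal sketch. It assumes a default build in which Char is the 1-byte character type (no unicodestrings mode switch) and declares the string explicitly as ShortString; the program and variable names are only illustrative.

```pascal
program CharAndShortString;

{$mode objfpc}

var
  c: Char;         { assumed to be the 1-byte char type in this mode }
  s: ShortString;  { length byte followed by up to 255 chars }
begin
  c := Chr(255);   { the highest value a 1-byte char can hold }
  s := 'abc';

  WriteLn('SizeOf(Char)        = ', SizeOf(c));
  WriteLn('Ord(c)              = ', Ord(c));
  WriteLn('SizeOf(ShortString) = ', SizeOf(s));
  { s[0] is the length byte of a shortstring }
  WriteLn('Ord(s[0])           = ', Ord(s[0]));
  WriteLn('Length(s)           = ', Length(s));
end.
```

Nothing in these types records an encoding; the bytes mean whatever the program decides they mean.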

> Please confirm or deny my following statement: "A variable of type
> String may contain a UTF-8 encoded string, i.e. a programmer using
> UTF-8 in strings should keep in mind the consequences (no 1:1 mapping
> between graphemes and bytes, i.e. iterating over bytes is NOT the same
> as iterating over the graphemes encoded in the string bytes)."

Correct.
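
A small sketch of the consequence, under some assumptions: the string is an AnsiString holding well-formed UTF-8, the bytes are written as explicit escapes so that the source file encoding does not matter, and CountCodePoints is a hypothetical helper, not an RTL function. It counts code points, which is only an approximation of graphemes (a grapheme may consist of several code points, e.g. a base letter plus a combining mark).

```pascal
program Utf8Bytes;

{$mode objfpc}{$H+}

{ Hypothetical helper: counts UTF-8 code points by skipping
  continuation bytes (bit pattern 10xxxxxx).
  Assumes s holds well-formed UTF-8. }
function CountCodePoints(const s: AnsiString): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(s) do
    if (Ord(s[i]) and $C0) <> $80 then
      Inc(Result);
end;

var
  s: AnsiString;
begin
  { 'a', then U+00E9 encoded as the two bytes $C3 $A9, then 'b' }
  s := 'a'#$C3#$A9'b';

  { Length counts bytes, not characters }
  WriteLn('bytes:       ', Length(s));
  WriteLn('code points: ', CountCodePoints(s));
end.
```

Here the string occupies four bytes but encodes three code points, so indexing s[i] walks over bytes and can land in the middle of a multi-byte sequence.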


Jonas

