Set of encoding conversion routines

Felipe Monteiro de Carvalho
Hello,

Does anyone know if there is a set of encoding conversion routines? If
there isn't, how should we add one? New directory in fpc/packages?

I just needed UTF-8 to Latin 1 ISO but I can't find a cross-platform
solution. Remembering that UTF8ToAnsi doesn't solve the problem
because it converts to the system encoding and not Latin 1 ISO.

There is iconv, but does it already come installed on Windows? I
couldn't find any iconv.dll or anything like that.

thanks,
--
Felipe Monteiro de Carvalho
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal

Re: Set of encoding conversion routines

Marco van de Voort
In our previous episode, Felipe Monteiro de Carvalho said:
>
> Does anyone know if there is a set of encoding conversion routines? If
> there isn't, how should we add one? New directory in fpc/packages?

Cross platform, with a library?

The lowest level is the charset unit in the RTL; Unix-style iconv is
supported by the iconvenc package.
 
> I just needed UTF-8 to Latin 1 ISO but I can't find a cross-platform
> solution. Remembering that UTF8ToAnsi doesn't solve the problem
> because it converts to the system encoding and not Latin 1 ISO.
>
> There is iconv, but does it already come installed on Windows? I
> couldn't find any iconv.dll or anything like that.

No. Non *nix OSes have their own api. There is an abstraction,
lconvencoding, but that is in the LCL afaik.

Re: Set of encoding conversion routines

zaher dirkey
There is a unit in Lazarus:
lcl\LConvEncoding.pas
Is that what you mean?

--
Zaher Dirkey

Re: Set of encoding conversion routines

theo-6
In reply to this post by Felipe Monteiro de Carvalho

> I just needed UTF-8 to Latin 1 ISO but I can't find a cross-platform
> solution. Remembering that UTF8ToAnsi doesn't solve the problem
> because it converts to the system encoding and not Latin 1 ISO.
>
>  

A simple option might be:

wides:=UTF8Decode(utf8s);
widestringmanager.Wide2AnsiMoveProc:=@defaultWide2AnsiMove;
ansis:=wides;


Maybe set back widestringmanager.Wide2AnsiMoveProc to the original proc
after that.


Re: Set of encoding conversion routines

Marco van de Voort
In reply to this post by zaher dirkey
In our previous episode, Zaher Dirkey said:
> There is a unit in Lazarus:
> lcl\LConvEncoding.pas
> Is that what you mean?

yes.

Re: Set of encoding conversion routines

Felipe Monteiro de Carvalho
In reply to this post by Marco van de Voort
On Tue, Aug 11, 2009 at 12:43 PM, Marco van de Voort<[hidden email]> wrote:
> The lowest level is the charset unit in the RTL; Unix-style iconv is
> supported by the iconvenc package.

The charset unit doesn't look complete or usable.

> No. Non *nix OSes have their own api. There is an abstraction,
> lconvencoding, but that is in the LCL afaik.

In the latest lazarus svn lconvencoding has real encoding conversion
tables! I don't know if they all work 100%, but I managed to strip LCL
dependencies and leave only the encoding I want and the conversion
from UTF-8 to Latin 1 ISO worked perfectly.
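
For this one conversion a table is not strictly needed, because ISO 8859-1 covers exactly the code points U+0000 to U+00FF. The following is a minimal sketch of that idea, not the LConvEncoding code; the name Utf8ToLatin1 is made up for illustration. It assumes that characters outside Latin-1 and malformed sequences may simply become '?', and it does not reject overlong encodings.

```pascal
program Utf8ToLatin1Demo;

{$mode objfpc}{$H+}

// Hypothetical helper, not part of the RTL or of LConvEncoding.
// Decodes UTF-8 and keeps code points U+0000..U+00FF as single bytes;
// anything else, or a malformed sequence, is replaced by '?'.
function Utf8ToLatin1(const S: AnsiString): AnsiString;
var
  I, OutLen, Extra, K: Integer;
  B: Byte;
  CP: Cardinal;
  Ok: Boolean;
begin
  SetLength(Result, Length(S)); // the output is never longer than the input
  OutLen := 0;
  I := 1;
  while I <= Length(S) do
  begin
    B := Ord(S[I]);
    Inc(I);
    Ok := True;
    // The lead byte tells how many continuation bytes follow.
    if B < $80 then begin CP := B; Extra := 0; end
    else if (B and $E0) = $C0 then begin CP := B and $1F; Extra := 1; end
    else if (B and $F0) = $E0 then begin CP := B and $0F; Extra := 2; end
    else if (B and $F8) = $F0 then begin CP := B and $07; Extra := 3; end
    else begin CP := 0; Extra := 0; Ok := False; end; // stray or invalid byte

    for K := 1 to Extra do
    begin
      if (I <= Length(S)) and ((Ord(S[I]) and $C0) = $80) then
      begin
        CP := (CP shl 6) or (Ord(S[I]) and $3F);
        Inc(I);
      end
      else
      begin
        Ok := False; // truncated sequence: emit '?' and resync here
        Break;
      end;
    end;

    Inc(OutLen);
    if Ok and (CP <= $FF) then
      Result[OutLen] := AnsiChar(Byte(CP))
    else
      Result[OutLen] := '?';
  end;
  SetLength(Result, OutLen);
end;

var
  Latin1: AnsiString;
begin
  Latin1 := Utf8ToLatin1('caf'#$C3#$A9); // "café" as UTF-8 bytes
  WriteLn(Length(Latin1));               // prints 4
  WriteLn(Ord(Latin1[4]));               // prints 233 ($E9, Latin-1 "é")
end.
```

The input is passed as escaped bytes rather than as a literal with accented characters, so the result does not depend on the source file's encoding.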

--
Felipe Monteiro de Carvalho

Re: Set of encoding conversion routines

zaher dirkey
I have my own one, ported from iconv, in
http://www.sourceforge.net/projects/minilib

Check
minilib\Unicodes\source

I used it in a WinCE project because WinCE does not support my language's locale.
--
Zaher Dirkey

Re: Set of encoding conversion routines

Marco van de Voort
In reply to this post by Felipe Monteiro de Carvalho
In our previous episode, Felipe Monteiro de Carvalho said:
> On Tue, Aug 11, 2009 at 12:43 PM, Marco van de Voort<[hidden email]> wrote:
> > The lowest level is the charset unit in the RTL; Unix-style iconv is
> > supported by the iconvenc package.
>
> The charset unit doesn't look complete or usable.

Afaik it is. The unit is just the main conversion, and can be plugged with
units that are generated directly from unicode.org tables.

Our own cache of those tables is in ucmaps, and the tool to make units out
of them is utils/creumap.pp. Examples of such units are compiler/cp*.

The only problem is that it uses UTF16 internally iirc.
 
> > No. Non *nix OSes have their own api. There is an abstraction,
> > lconvencoding, but that is in the LCL afaik.
>
> In the latest lazarus svn lconvencoding has real encoding conversion
> tables! I don't know if they all work 100%, but I managed to strip LCL
> dependencies and leave only the encoding I want and the conversion
> from UTF-8 to Latin 1 ISO worked perfectly.

Maybe expanding charset with a utf16<=>utf8 conversion, and making a package
with units with some common encodings would do it also.
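
As a rough illustration of the UTF-16 <=> UTF-8 step, the system unit's UTF8Decode and UTF8Encode (the former already used earlier in this thread) do such a round trip through a WideString. This is only a sketch; whether these routines fit charset's internal representation is an assumption, not something checked against the unit.

```pascal
program Utf8Utf16RoundTrip;

{$mode objfpc}{$H+}

var
  Utf8Bytes: AnsiString;
  Wide: WideString;
begin
  Utf8Bytes := 'caf'#$C3#$A9;         // "café" as raw UTF-8 bytes
  Wide := UTF8Decode(Utf8Bytes);      // UTF-8 -> UTF-16
  WriteLn(Length(Wide));              // number of UTF-16 code units
  WriteLn(Ord(Wide[4]));              // code unit of the last character
  WriteLn(Length(UTF8Encode(Wide)));  // UTF-16 -> UTF-8, length in bytes
end.
```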