Hi,
I want to create a console app that uses box-drawing characters from Unicode. Before the CRT unit is used, everything is fine and my program draws the table beautifully. But once I add the CRT unit, those characters turn into garbage. Strangely, this only happens in the Windows terminal (Windows 10). I tried the exact same program on Mac and Linux, each with its own CRT unit, and it runs fine on both.

I need the CRT unit to make my console program more interactive (cursor positioning, keyboard handling, text coloring, etc.).

So, what's wrong with the CRT unit on Windows? Can anybody explain the strange behaviour and how to solve the problem?

Thank you.

Regards,
–Mr Bee

_______________________________________________
fpc-pascal maillist - [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
> I want to create console app that's using box drawing characters from unicode. Before CRT unit is used, it's all fine and my program could draw
> table beautifully. But once I put CRT unit, those characters became garbages. But strangely, it's only happen on Windows' terminal (win10).
> I tried the same exact program in Mac and Linux, using each CRT unit, and they all run fine. I need CRT unit to make my console program more
> interactive (i.e. cursor positioning, keyboard handling, text coloring, etc).
> So, what's wrong with CRT unit on Windows? Can anybody explain the strange behaviour and how to solve the problem?

From my own failed attempts at getting box characters to work with both crt and ptccrt, I can confirm that it's not working on Windows, but I am actually very surprised to hear it's working on Linux and Mac. Perhaps there is a possibility to fix the CRT unit on Windows.

In my opinion, the CRT unit SHOULD be drawing boxes with the ORIGINAL Extended ASCII codes, not Unicode, at least by default. The CRT unit is supposed to be compatible with the original Turbo Pascal CRT unit, and in Turbo Pascal if you writeln(Chr(201)) you get an upper left corner, not a funny-looking E. Perhaps there could be an option to use the Unicode characters, but displaying the correct ASCII characters should be the default. Unfortunately it's not doing either correctly... if it's going to use Unicode it should use the entire character set; if it's not going to do that, then it should be using extended ASCII as the original CRT unit did.

The strange thing is, my ancient Turbo Pascal program that draws ASCII boxes looks fine in the FPC text mode IDE (which itself uses box characters!), but when I run the program, even in the same console window that the FPC text mode IDE is currently running in, I get the Unicode characters.
If I'm in the textmode IDE and I enter ALT+201, I get the upper left corner symbol, not a Unicode symbol... but run the program, and I get the Unicode symbol.

If I don't use the CRT unit, I do indeed get the full Unicode character set, but then you can't use cursor positioning etc. I have tried to replace the ASCII characters in my program with the Unicode characters, but I can't, because they are so far out in the table that they are beyond the 255 character limit of the CRT unit. For example, the upper left corner character used to be #201; now it is #9556. I can't find any way to display #9556 with the CRT unit. I've tried Writeln(Chr(9556)), but chr() has a limit of 255, and I've tried just Writeln(#9556), and while that compiles and runs, it doesn't produce the correct character. I have a feeling (but have not tested it) that it keeps cycling around the first 256 characters if you use anything above 255... pretty sure a character is defined as a byte here.

I have attached screen shots and a sample program that demonstrates this frustrating situation.

I have simply abandoned quite a few really good console applications because of this. Box characters are great; I don't understand why it's become so difficult to incorporate them. But I'm really hoping a solution can be found, as I have reports I display in console windows, even in my graphics programs, that would be so much nicer if I could use box characters again, since I make use of the console window while the graphics window is also open.

James
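For reference, the two numbers in the message above describe the same glyph in two different tables: in the original IBM PC character set (code page 437) byte 201 is the double-line upper left corner, and in Unicode that glyph is U+2554, which is 9556 in decimal. The following is only a minimal sketch of that correspondence for the six double-line frame characters. The function name `Cp437BoxToUnicode` is made up for illustration and is not part of the CRT unit or any other FPC unit.

```pascal
program BoxMap;
{$mode objfpc}{$H+}

// Hypothetical helper, not part of any FPC unit: maps a code page 437
// double-line box byte to its Unicode code point. Returns 0 for any
// byte that is not one of the six frame characters listed here.
function Cp437BoxToUnicode(b: Byte): Word;
begin
  case b of
    201: Result := $2554; // upper left corner
    205: Result := $2550; // horizontal line
    187: Result := $2557; // upper right corner
    186: Result := $2551; // vertical line
    200: Result := $255A; // lower left corner
    188: Result := $255D; // lower right corner
  else
    Result := 0;
  end;
end;

begin
  // Hex 2554 is 9556 in decimal, the value quoted in the message above.
  WriteLn(Cp437BoxToUnicode(201));
end.
```

A table like this only translates the numbers. It does not by itself make the CRT unit print the Unicode glyph.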
On 13/10/17 14:39, James Richters wrote:
> I've tried Writeln(Chr(9556)) but chr() has a limit of 255, and I've
> tried just Writeln(#9556) and while that compiles and runs, it doesn't
> produce the correct character.. I have a feeling (but have not tested
> it) that it keeps cycling around the first 256 characters if you use
> anything above 255... pretty sure a character is defined as a byte here.

Thanks for that lengthy description of the problem, much better than the OP's just describing the output as rubbish, or something similar.

Since char is a single-byte type, a large value will just get truncated. If you turn on range checking, it should give a compile-time error. You could check that.

Maybe doing that would cast some light on what is going wrong in CRT as well.

P
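To make the single-byte point concrete, here is a minimal sketch, assuming a default FPC mode in which Char is a one-byte type. It only shows what keeping the low byte of 9556 would produce. Whether the compiler or the CRT unit actually reduces a #9556 literal this way is not confirmed anywhere in this thread. The helper name `LowByteOf` is hypothetical.

```pascal
program LowByteDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: keeps only the lowest 8 bits of a 16-bit code.
function LowByteOf(code: Word): Byte;
begin
  Result := code and $FF;
end;

begin
  WriteLn(SizeOf(Char));          // size of Char in bytes (1 in the default modes)
  WriteLn(LowByteOf(9556));       // 9556 = 37 * 256 + 84, so the low byte is 84
  WriteLn(Chr(LowByteOf(9556)));  // the single-byte character with code 84
end.
```

If plain truncation were what happens, a request for #9556 would come out as the letter with code 84 rather than as a box corner.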
2017-10-14 14:12 GMT+07:00 <[hidden email]>:
But I don't use chr() or #xxx to output the box chars. No casting, no truncation, no nothing, because I also suspect that could be problematic. I just write the Unicode chars directly into the string, you know... a normal string.

Regards,
–Mr Bee
In reply to this post by pascalX
It may be worth looking at SetTextCodePage in the System unit. On Windows targets I often find it necessary to include

SetTextCodePage(output,cp_utf8);

to avoid problems writing UTF-8 to the console.
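A minimal sketch of how that call might sit in a complete program, assuming the source file is saved as UTF-8 and compiled with FPC 3.0 or later. It deliberately does not use the CRT unit. Whether the same call still helps once CRT is in the uses clause is not settled in this thread, and the console font and active console code page may also matter.

```pascal
program Utf8Box;
{$mode objfpc}{$H+}
{$codepage utf8}  // assumption: this source file is saved as UTF-8

var
  Top, Middle, Bottom: UTF8String;
begin
  // The call suggested above: mark the Output text file as UTF-8.
  SetTextCodePage(Output, CP_UTF8);

  Top    := '╔═══╗';
  Middle := '║   ║';
  Bottom := '╚═══╝';

  WriteLn(Top);
  WriteLn(Middle);
  WriteLn(Bottom);
end.
```

The strings are declared as UTF8String so that their code page matches the one set on Output.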
In reply to this post by Zaaphod
for whatever reason, your code arrived here looking to be in UTF8 or ISO-8859-1 format... lazarus loaded it as ASCII but the box characters in the code were obviously not the ASCII ones you are speaking of (eg: ALT-201)... i tried to convert it to UTF8 in lazarus but the look was still the same... on my UTF8 xterm console, the characters look the same as in the code when the program is run...

FWIW: even your routine separator lines were converted...

can you place the original file online somewhere for download, please?

--
NOTE: No off-list assistance is given without prior approval.
*Please keep mailing list traffic on the list unless*
*a signed and pre-paid contract is in effect with us.*
In reply to this post by Zaaphod
I had a similar problem when trying to represent trees using extended ASCII characters. My solution was to adapt my character representations to UTF8. To see what I mean you can have a look at https://github.com/svpantazi/catalan-monoid-generator. The generate_catalan_monoid.pp file contains a PrintASCIITree function that does the obvious. The examples folder contains actual outputs (e.g., examples/n_5_output_example.txt) of ASCII trees such as this one (I pasted it here, but it may not survive the message):

0─┬─1─┬─12───123
  │   └─13───132
  ├─2─┬─21───213───2132
  │   └─23
  └─3───32───321
In reply to this post by wkitty42
> for what ever reason, your code arrived here looking to be in UTF8 or ISO-8859-1 format... lazarus loaded is as ASCII but the box characters in
> the code were obviously not the ASCII ones you are speaking of (eg: ALT-201)... i tried to convert it to UTF8 in lazarus but the look was still the
> same... on my UTF8 xterm console, the characters look the same as in the code when the program is run...
> FWIW: even your routine separator lines were converted... can you place the original file online somewhere for download, please?

I have put the program for download here:
http://www.productionautomation.net/FPC/ASCII/ascii%20box.pas

Note that the characters in the program are the box characters from here:
http://www.asciitable.com/
They are the box characters from the 'extended ASCII codes', ranging from #179 to #218. When displayed in almost any Windows editor they are still #179 to #218, but those codes are now mapped to different characters. I do not believe UTF-8 is compatible with codes in this range from the original extended ASCII codes. If you want the characters to display properly, use the FreePascal textmode IDE (type fp at a command prompt). I can only display the characters properly in an editor by using either the FPC textmode IDE editor or Notepad, and Notepad ONLY with an ASCII font I have. Lazarus uses Windows fonts, so it displays #201 as É.

I have made a simpler set of test programs, compiled some test results, and posted them to my website. They thoroughly illustrate this issue.

http://www.productionautomation.net/FPC/ASCII/Asciibox2.pas
This is a test program with just a bunch of writeln's making box characters. It is run two ways, first without UNIT CRT, with the results here:

http://www.productionautomation.net/FPC/ASCII/ASCII no CRT.jpg
In these tests, I opened the file (with Unit CRT commented out) and ran it from Lazarus, the FPC textmode IDE, and Turbo Pascal 7.0. I also showed how the file looks in Notepad++, in normal Notepad with the ASCII font, and in normal Notepad with the Consolas font. You can see from this test that the output is the same for Lazarus, the FPC textmode IDE, and Turbo Pascal 7.0.
http://www.productionautomation.net/FPC/ASCII/ASCII With CRT.jpg
In these tests I ran with the CRT unit. Again I compiled and ran it from Lazarus, the FPC textmode IDE and Turbo Pascal 7.0. From this we see that Lazarus and the FPC textmode IDE display the Unicode versions of the characters instead of extended ASCII... but Turbo Pascal 7.0 displays the same ASCII characters that it does without CRT.

http://www.productionautomation.net/FPC/ASCII/Asciibox3.pas
This is the program again, but using Unicode characters. You can see in the following tests that the characters now display correctly in the Lazarus editor, but not in the FPC IDE or Turbo Pascal 7.0. That is because they are way outside the range of the FPC IDE and Turbo Pascal 7.0, so each one actually shows up as 3 characters.

http://www.productionautomation.net/FPC/ASCII/Unicode NO CRT.jpg
This is a compilation of all the tests above. It shows that NOTHING displays this correctly!! Even though they look right in Lazarus, they are still wrong when the program runs. It's also wrong in the FPC IDE and the Turbo Pascal IDE.

http://www.productionautomation.net/FPC/ASCII/Unicode with CRT.jpg
In these tests I turned the CRT unit on, but the characters are still never right.

Strangely, the results are not exactly the same when the characters are specified with # codes, so I did another series of tests to demonstrate this:

http://www.productionautomation.net/FPC/ASCII/Asciibox4.pas
This test uses the Unicode # codes in the Writeln command.

http://www.productionautomation.net/FPC/ASCII/Unicode Codes NO CRT.jpg
Now you can see that without the CRT unit the codes are displayed properly with FPC. Of course they aren't right in Turbo Pascal 7.0, because they are not valid codes for TP7, but it's surprising that they actually work in FPC when a writeln with the characters typed in directly didn't work.
http://www.productionautomation.net/FPC/ASCII/Unicode Codes With CRT.jpg
Even stranger than the fact that the codes work without CRT is what happens WITH CRT: someone must have re-mapped them to show kind of a box using + for corners, etc... but why not map them to the correct box characters instead??

http://www.productionautomation.net/FPC/ASCII/Asciibox5.pas
Just for completeness, I also used the Extended ASCII codes in a similar way to the previous test.

http://www.productionautomation.net/FPC/ASCII/ASCII Codes No CRT.jpg
Results are as expected without CRT on both FPC and Turbo Pascal 7.0: the box characters are correctly represented.

http://www.productionautomation.net/FPC/ASCII/ASCII Codes With CRT.jpg
When CRT is used, the results are the same incorrect symbols as when CRT is used with a direct Writeln.

http://www.productionautomation.net/FPC/ASCII/PTCPAS ASCII box.JPG
I was wrong about PTCCRT: it does display Extended ASCII codes, but the reason it's useless is that it uses Borland BGI fonts, and they are not fixed width, so nothing lines up.

In my opinion, the CRT unit should be displaying the same original Extended ASCII characters that are displayed in the version without CRT... and the same as they are displayed in Turbo Pascal 7.0... but even if it was decided that the CRT unit should use Unicode, then it should do that correctly, which it does not. Whether CRT is using Extended ASCII or Unicode, as far as I can figure out, it's impossible to display box characters with the CRT unit at all.

James
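The "3 characters" seen in the Asciibox3 test are consistent with how UTF-8 stores these code points, assuming that file was saved as UTF-8: every code point from U+0800 to U+FFFF takes three bytes, and a viewer that treats each byte as one character shows three glyphs. A minimal sketch of that encoding step follows. The helper name `Utf8Hex3` is made up for illustration.

```pascal
program Utf8Bytes;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper: returns the three UTF-8 bytes of a code point
// in the range U+0800..U+FFFF as a hex string such as 'E2 95 94'.
function Utf8Hex3(cp: Word): string;
var
  b1, b2, b3: Byte;
begin
  b1 := $E0 or (cp shr 12);           // lead byte carries the top 4 bits
  b2 := $80 or ((cp shr 6) and $3F);  // continuation byte, middle 6 bits
  b3 := $80 or (cp and $3F);          // continuation byte, low 6 bits
  Result := IntToHex(b1, 2) + ' ' + IntToHex(b2, 2) + ' ' + IntToHex(b3, 2);
end;

begin
  WriteLn(Utf8Hex3($2554));  // prints E2 95 94 for the upper left corner
end.
```

Read through a single-byte code page, those three bytes come out as three unrelated characters, which is one plausible explanation for what the screenshots show.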
2017-10-16 3:14 GMT+07:00 James Richters <[hidden email]>:
Long story short: the CRT unit doesn't work at all on Windows, but it works with Unicode on Mac/Linux.

Regards,
–Mr Bee
> Long story short: CRT unit doesn't work at all on Windows, but it works with unicode on Mac/Linux.

I'm curious how the CRT unit handles the Extended ASCII codes on Mac and Linux... All versions should be the same for cross-platform compatibility.

I also noticed that email broke my links with spaces in them, so I made an index page with all the links and a zip file with all the test programs and screen shots.

Index file
http://www.productionautomation.net/FPC/ASCII/index.htm

Zip file
http://www.productionautomation.net/FPC/ASCII/BoxCharacters.zip

James
On 16/10/17 13:03, James Richters wrote:
> > Long story short: CRT unit doesn't work at all on Windows, but it
> > works with unicode on Mac/Linux.
>
> I'm curious how the CRT unit handles the Extended ASCII codes on Mac and
> Linux... All versions should be the same for cross platform compatibility.
>
> I also noticed that email broke my links with spaces in them, so I made
> an index link with all the links and a zip file with all test programs
> and screen shots.
>
> Index File
> http://www.productionautomation.net/FPC/ASCII/index.htm
>
> Zip file
> http://www.productionautomation.net/FPC/ASCII/BoxCharacters.zip
>
> James.

A URL can have a space in it, but the space must be encoded as "percent two zero" (%20). If you copy what is displayed in the address bar, you probably get a space character, not the encoded version.

Short answer: don't use spaces in file names ;)

That avoids having to put quotes around everything, or relying on GUI software that silently does this for you, or encodes them, or whatever. Since spaces are the primary delimiter in computer input, it makes sense to avoid using them in something you do not want delimited.

For CRT, it would be nice if this could be made to work consistently across platforms.
In reply to this post by Zaaphod
2017-10-16 19:03 GMT+07:00 James Richters <[hidden email]>:
I don't know the underlying library used by the CRT unit on Mac and Linux. Perhaps it's because the CRT unit on both Mac and Linux has supported Unicode since the beginning? CMIIW.

Regards,
–Mr Bee