XML - Indent, text content, special char

Gabor Boros-2
Hi All,

I have an existing XML file. After loading it (optionally modifying it) and
saving it again, some formatting that I must keep is lost. I need the same
indentation as before and the same text content, and I do not want every
special character to be replaced. With the code below I get the attached
OUTPUT.xml from the attached INPUT.xml. Any idea how to solve this problem?
(I use fixes_3_2.)

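// Needs units DOM, XMLRead and XMLWrite; X is a TXMLDocument.
// ReadXMLFile creates the document itself (out parameter),
// so no TXMLDocument.Create is needed beforehand.
ReadXMLFile(X,'INPUT.xml');
WriteXMLFile(X,'OUTPUT.xml');
X.Free;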

Gabor

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

Attachments: INPUT.xml (220 bytes), OUTPUT.xml (205 bytes)

Re: XML - Indent, text content, special char

Michael Van Canneyt


On Sat, 27 Apr 2019, Gabor Boros wrote:

> Hi All,
>
> I have an existing XML file. After loading it (optionally modifying it) and
> saving it again, some formatting that I must keep is lost. I need the same
> indentation as before and the same text content, and I do not want every
> special character to be replaced. With the code below I get the attached
> OUTPUT.xml from the attached INPUT.xml. Any idea how to solve this problem?
> (I use fixes_3_2.)
>
> X:=TXMLDocument.Create;
> ReadXMLFile(X,'INPUT.xml');
> WriteXMLFile(X,'OUTPUT.xml');

As far as I know you can't. I recently changed some things in xmlwriter so you can
influence the formatting to some degree, but no attempt is made to respect
the formatting of a previously read file. I believe the formatting info is
discarded on read (I would need to verify) so this would require lots of
rewriting.

Michael.

Re: XML - Indent, text content, special char

Gabor Boros-2
On 2019-04-27 at 13:57, Michael Van Canneyt wrote:
> As far as I know you can't. I recently changed some things in xmlwriter
> so you can influence the formatting to some degree, but no attempt is made
> to respect the formatting of a previously read file. I believe the
> formatting info is discarded on read (I would need to verify) so this
> would require lots of rewriting.

The "indent" and "text content" problems are solved on the reader side by
ReadXMLFilePreserveWhitespace:

http://wiki.freepascal.org/XML_Tutorial#Whitespace_characters
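For readers without the wiki page at hand, here is a minimal sketch of what such a reader-side helper might look like. It assumes the helper is built on TDOMParser with Options.PreserveWhitespace enabled; the procedure name is taken from the message above and is not a function exported by the XMLRead unit, and the file names are the ones used earlier in this thread.

```pascal
program PreserveWhitespaceRead;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead, XMLWrite;

// Helper named after the one mentioned in this thread (an assumption,
// not part of the XMLRead unit): loads a file and keeps whitespace.
procedure ReadXMLFilePreserveWhitespace(out ADoc: TXMLDocument;
  const AFileName: string);
var
  Parser: TDOMParser;
  Src: TXMLInputSource;
  Stream: TFileStream;
begin
  ADoc := nil;
  Stream := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyWrite);
  try
    Parser := TDOMParser.Create;
    try
      Src := TXMLInputSource.Create(Stream);
      try
        // Ask the parser not to discard whitespace in text nodes.
        Parser.Options.PreserveWhitespace := True;
        Parser.Parse(Src, ADoc);
      finally
        Src.Free;
      end;
    finally
      Parser.Free;
    end;
  finally
    Stream.Free;
  end;
end;

var
  X: TXMLDocument;
begin
  ReadXMLFilePreserveWhitespace(X, 'INPUT.xml');
  try
    WriteXMLFile(X, 'OUTPUT.xml');
  finally
    X.Free;
  end;
end.
```

This only changes what the reader keeps; how the writer formats the output is a separate question.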

Gabor

Re: XML - Indent, text content, special char

Michael Van Canneyt


On Sun, 28 Apr 2019, Gabor Boros wrote:

> On 2019-04-27 at 13:57, Michael Van Canneyt wrote:
>> As far as I know you can't. I recently changed some things in xmlwriter
>> so you can influence the formatting to some degree, but no attempt is made
>> to respect the formatting of a previously read file. I believe the
>> formatting info is discarded on read (I would need to verify) so this
>> would require lots of rewriting.
>
> The "indent" and "text content" problems are solved on the reader side by
> ReadXMLFilePreserveWhitespace:
>
> http://wiki.freepascal.org/XML_Tutorial#Whitespace_characters

Nice. I learned something again.

In that case, disabling indentation (set the indent size to 0) and line
breaks (set the line ending to an empty string) in the newly exposed
XMLWriter will probably do the trick.

Michael.

Re: XML - Indent, text content, special char

Gabor Boros-2
On 2019-04-28 at 9:35, Michael Van Canneyt wrote:
> the newly exposed XMLWriter


Will it be merged/backported into fixes_3_2?

Gabor

Re: XML - Indent, text content, special char

Michael Van Canneyt


On Sun, 28 Apr 2019, Gabor Boros wrote:

> On 2019-04-28 at 9:35, Michael Van Canneyt wrote:
>> the newly exposed XMLWriter
>
>
> Will it be merged/backported into fixes_3_2?

I just merged it.

Michael.

Re: XML - Indent, text content, special char

Santiago A.
In reply to this post by Gabor Boros-2
On 27/04/19 at 13:29, Gabor Boros wrote:
> Hi All,
>
> I have an existing XML file. After loading it (optionally modifying it) and
> saving it again, some formatting that I must keep is lost. I need the same
> indentation as before and the same text content, and I do not want every
> special character to be replaced.

If you need the same indent or special chars, XML is not the right
format for you.

Think about it.

--
Regards

Santiago A.


Re: XML - Indent, text content, special char

Gabor Boros-2
In reply to this post by Gabor Boros-2
On 2019-04-28 at 9:25, Gabor Boros wrote:
> The "indent" and "text content" problems are solved on the reader side by
> ReadXMLFilePreserveWhitespace:


That works with a sample application, but not with my real-life
application. :-(
(In my real application, only every second FindNode call finds a node.)

With the attached example and test XML file I get the following result:

*a*
**

Isn't this a bug? Losing the formatting does not bother me, but the text
between > and < is the data/text content of a node.

Gabor

Attachments: XML_Read.pas (251 bytes), TEST.xml (103 bytes)

Re: XML - Indent, text content, special char

Gabor Boros-2
In reply to this post by Santiago A.
On 2019-04-28 at 21:24, Santiago A. wrote:
> If you need the same indent or special chars, XML is not the right
> format for you.
>
> Think about it.


XML is not my choice. ;-)

Gabor

Re: XML - Indent, text content, special char

wkitty42
In reply to this post by Gabor Boros-2
On 4/29/19 1:27 PM, Gabor Boros wrote:
> Isn't this a bug? Losing the formatting does not bother me, but the text
> between > and < is the data/text content of a node.

are you saying that you are trying to use fixed-width fields that are
space-padded in XML files???


--
  NOTE: No off-list assistance is given without prior approval.
        *Please keep mailing list traffic on the list unless*
        *a signed and pre-paid contract is in effect with us.*

Re: XML - Indent, text content, special char

Bernd Oppolzer

On 30.04.2019 at 02:45, [hidden email] wrote:
> On 4/29/19 1:27 PM, Gabor Boros wrote:
>> Isn't this a bug? Losing the formatting does not bother me, but the text
>> between > and < is the data/text content of a node.
>
> are you saying that you are trying to use fixed-width fields that are
> space-padded in XML files???
>
>

There is nothing wrong with that. In fact, I am aware of a typesetting
system that uses XML as input, and if you enclose your input text in
certain tags (<pre> or <sourcecode>), you expect the blanks between those
tags to be preserved.

The W3C standards, IMO, say little about what a particular parser etc.
should do to XML content; there are not many restrictions
(<, of course, should be &lt;) ... attributes are different, of course.

Kind regards

Bernd



Re: XML - Indent, text content, special char

wkitty42
On 4/30/19 9:36 AM, Bernd Oppolzer wrote:
> On 30.04.2019 at 02:45, [hidden email] wrote:
>> On 4/29/19 1:27 PM, Gabor Boros wrote:
>>> Isn't this a bug? Losing the formatting does not bother me, but the text
>>> between > and < is the data/text content of a node.
>>
>> are you saying that you are trying to use fixed-width fields that are
>> space-padded in XML files???
>
> There is nothing wrong with that,

agreed...

> in fact I am aware of a typesetting system which uses XML as input,
> and if you enclose your input text between certain tags (<pre> or <sourcecode>)
> you expect that the blanks between those tags are preserved.

true but the given examples do not show that...

[code]
<?xml version="1.0" encoding="utf-8"?>
<doc>
   <text1>    a</text1>
   <text2>     </text2>
</doc>
[/code]
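To check what TextContent returns for a document shaped like the sample above, a small test program along these lines could help. It is a sketch under assumptions: the XML is embedded as a string constant instead of being read from TEST.xml, the names TextContentDemo and DumpTextContent are made up for illustration, and the asterisks mimic the markers in the result reported earlier in the thread. It parses the same text twice, once with PreserveWhitespace off and once with it on, so the two results can be compared directly.

```pascal
program TextContentDemo;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead;

const
  // Same structure as the sample document quoted in this thread.
  SampleXML =
    '<?xml version="1.0" encoding="utf-8"?>' + #10 +
    '<doc>' + #10 +
    '   <text1>    a</text1>' + #10 +
    '   <text2>     </text2>' + #10 +
    '</doc>';

// Hypothetical helper: parses SampleXML and prints each child element's
// TextContent between asterisks so leading blanks become visible.
procedure DumpTextContent(APreserve: Boolean);
var
  Parser: TDOMParser;
  Src: TXMLInputSource;
  Stream: TStringStream;
  Doc: TXMLDocument;
  Node: TDOMNode;
begin
  Doc := nil;
  Stream := TStringStream.Create(SampleXML);
  Parser := TDOMParser.Create;
  Src := TXMLInputSource.Create(Stream);
  try
    Parser.Options.PreserveWhitespace := APreserve;
    Parser.Parse(Src, Doc);
    WriteLn('PreserveWhitespace = ', APreserve);
    Node := Doc.DocumentElement.FirstChild;
    while Node <> nil do
    begin
      // Only report elements; text nodes between them are skipped.
      if Node.NodeType = ELEMENT_NODE then
        WriteLn(Node.NodeName, ': *', Node.TextContent, '*');
      Node := Node.NextSibling;
    end;
  finally
    Doc.Free;
    Src.Free;
    Parser.Free;
    Stream.Free;
  end;
end;

begin
  DumpTextContent(False);
  DumpTextContent(True);
end.
```

Walking the children with FirstChild/NextSibling and filtering on ELEMENT_NODE keeps the loop independent of whether the parser kept whitespace-only text nodes between the elements.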




Re: XML - Indent, text content, special char

Gabor Boros-2
In reply to this post by wkitty42
On 2019-04-30 at 2:45, [hidden email] wrote:
> are you saying that you are trying to use fixed-width fields that are
> space-padded in XML files???

No. The XML files already exist. The task is: load the contents of a file
into a TObject descendant, modify the data in the object(s), then build
the XML from it and save it back to file(s), without losing any
"TDOMNode.TextContent".
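The "build up and save" half of that task could be sketched roughly as follows. Everything about the data class is an assumption made for illustration (TRecordData and its two fields are hypothetical, as is the helper AddTextElement); only the DOM and XMLWrite calls come from the FPC units. Whether the writer's own indentation leaves such text nodes untouched is exactly the open question of this thread, so the sketch makes no claim about the resulting file layout.

```pascal
program BuildAndSave;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLWrite;

type
  // Hypothetical data holder standing in for the TObject descendant.
  TRecordData = class
    Text1: DOMString;
    Text2: DOMString;
  end;

// Hypothetical helper: appends <AName>AValue</AName> below AParent.
procedure AddTextElement(ADoc: TXMLDocument; AParent: TDOMNode;
  const AName, AValue: DOMString);
var
  Elem: TDOMElement;
begin
  Elem := ADoc.CreateElement(AName);
  // The value goes into an explicit text node, blanks included.
  Elem.AppendChild(ADoc.CreateTextNode(AValue));
  AParent.AppendChild(Elem);
end;

var
  Data: TRecordData;
  Doc: TXMLDocument;
  Root: TDOMElement;
begin
  Data := TRecordData.Create;
  Doc := TXMLDocument.Create;
  try
    // Values with leading or only blanks, as in the thread's sample.
    Data.Text1 := '    a';
    Data.Text2 := '     ';

    Root := Doc.CreateElement('doc');
    Doc.AppendChild(Root);
    AddTextElement(Doc, Root, 'text1', Data.Text1);
    AddTextElement(Doc, Root, 'text2', Data.Text2);

    WriteXMLFile(Doc, 'OUTPUT.xml');
  finally
    Doc.Free;
    Data.Free;
  end;
end.
```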

Gabor