getting XML element's value using dom.pp unit

Bee-6
Hi all...

I'm currently developing a Lazarus application which needs to read an XML file.
For such a task, FPC provides the DOM unit (dom.pp). But it seems that dom.pp
is unable to read an XML element's (text) value. It successfully reads the
element's name (through the NodeName property) and the attributes (name and
value), though. Getting the element's value (through the NodeValue property)
always returns an empty string.

What did I do wrong? Or am I missing something about the dom.pp usage? Here is
the code I used...

// needs DOM and XMLRead in the unit's uses clause
procedure TForm1.Button1Click(Sender: TObject);
var
   doc: TXMLDocument;
begin
   ReadXMLFile(doc, 'sample.xml');
   try
      // this returns correct first attribute's name
      Memo1.Lines.Add(doc.DocumentElement.FirstChild.Attributes.Item[0].NodeName);
      // this returns correct first attribute's value
      Memo1.Lines.Add(doc.DocumentElement.FirstChild.Attributes.Item[0].NodeValue);
      // this returns correct first element's name
      Memo1.Lines.Add(doc.DocumentElement.FirstChild.FirstChild.NodeName);
      // this returns nothing !!!
      Memo1.Lines.Add(doc.DocumentElement.FirstChild.FirstChild.NodeValue);
   finally
      doc.Free; // the caller owns the document created by ReadXMLFile
   end;
end;

I'd really appreciate any kind of help. Thank you in advance.

Regards,

-Bee-

has Bee.ography at
http://beeography.blogsome.com
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal

Re: getting XML element's value using dom.pp unit

Tony Pelton
i'm going to speak out of turn here ...

i've got a ton of XML experience ... on Java.

I haven't used the Pascal DOM, but i'm guessing i know what the issue
is, because it is a common point of confusion for people who don't
know XML DOM frameworks well.

i of course may end up being completely off base.

assuming you had a node in the document like :

...
<anode>my node value</anode>
...

i think you are going to find that "my node value" is not the
NodeValue() of <anode>, but is the NodeValue() of a _child_ of
<anode>.

so you have to step one deeper into the DOM.

i think ...
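
A minimal sketch of that idea with FPC's DOM and XMLRead units, assuming a hypothetical one-element document modelled on the <anode> example. It prints the NodeValue of the element itself and then of its first child, so the two can be compared directly:

```pascal
program TextChildSketch;
{$mode objfpc}{$H+}

uses
  Classes, DOM, XMLRead;

const
  // Hypothetical one-element document, based on the <anode> example.
  SampleXML = '<anode>my node value</anode>';

var
  Doc: TXMLDocument;
  Source: TStringStream;
  Elem: TDOMNode;
begin
  Source := TStringStream.Create(SampleXML);
  try
    ReadXMLFile(Doc, Source); // stream overload from the XMLRead unit
    try
      Elem := Doc.DocumentElement;
      WriteLn('element name:  ', Elem.NodeName);
      // An element node carries no value of its own.
      WriteLn('element value: [', Elem.NodeValue, ']');
      // The text sits one level deeper, in a child text node.
      if Elem.FirstChild <> nil then
      begin
        WriteLn('child name:    ', Elem.FirstChild.NodeName);
        WriteLn('child value:   [', Elem.FirstChild.NodeValue, ']');
      end;
    finally
      Doc.Free;
    end;
  finally
    Source.Free;
  end;
end.
```

The nil check on FirstChild matters for empty elements such as <anode/>, which have no text child at all.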

Tony

PS. also, including a sample of the XML document, along with the code
is usually helpful.

On 12/7/05, Bisma Jayadi <[hidden email]> wrote:
> Hi all...
>
>
> What did I do wrong? Or am I missing something about the dom.pp usage? Here are
> the codes I used...
>

Re: getting XML element's value using dom.pp unit

Sebastian Günther
In reply to this post by Bee-6
Bisma Jayadi wrote:

> Hi all...
>
> I'm currently developing a Lazarus application which needs to read an XML
> file. For such a task, FPC provides the DOM unit (dom.pp).
> But it seems that dom.pp is unable to read an XML element's (text) value.
> It successfully reads the element's name (through the NodeName property) and
> the attributes (name and value), though. Getting the element's value (through
> the NodeValue property) always returns an empty string.
>
> What did I do wrong? Or am I missing something about the dom.pp usage?
> Here is the code I used...
>
> procedure TForm1.Button1Click(Sender: TObject);
> var
>   doc: TXMLDocument;
> begin
>   ReadXMLFile(doc, 'sample.xml');
>
>   // this returns correct first attribute's name
>   Memo1.Lines.Add(doc.DocumentElement.FirstChild.Attributes.Item[0].NodeName);

btw, you can leave out the ".Item" part, as it's a default property. So
"Attributes[0]" would be sufficient.
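
As a small sketch of the two spellings side by side, assuming a hypothetical cut-down document (the stream-based ReadXMLFile overload is used only to keep the example self-contained). The last line shows lookup by attribute name through TDOMElement.GetAttribute, which avoids depending on attribute order:

```pascal
program AttributeSketch;
{$mode objfpc}{$H+}

uses
  Classes, DOM, XMLRead;

const
  // Hypothetical cut-down document for illustration only.
  SampleXML = '<BAIS_XML Version="1.0"><RESPOND ID="BAIS"/></BAIS_XML>';

var
  Doc: TXMLDocument;
  Source: TStringStream;
  Node: TDOMNode;
begin
  Source := TStringStream.Create(SampleXML);
  try
    ReadXMLFile(Doc, Source);
    try
      Node := Doc.DocumentElement.FirstChild; // the RESPOND element
      // Long form, with the Item property spelled out.
      WriteLn(Node.Attributes.Item[0].NodeName);
      // Short form, relying on Item being the default property.
      WriteLn(Node.Attributes[0].NodeName);
      // Lookup by name instead of by position.
      WriteLn(TDOMElement(Node).GetAttribute('ID'));
    finally
      Doc.Free;
    end;
  finally
    Source.Free;
  end;
end.
```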


>   // this returns correct first attribute's value
>   Memo1.Lines.Add(doc.DocumentElement.FirstChild.Attributes.Item[0].NodeValue);
>
>   // this returns correct first element's name
>   Memo1.Lines.Add(doc.DocumentElement.FirstChild.FirstChild.NodeName);
>   // this returns nothing !!!
>   Memo1.Lines.Add(doc.DocumentElement.FirstChild.FirstChild.NodeValue);

Perhaps it contains whitespace? It's possible that the XML parser
creates more than one text node.
Do you have an excerpt of your XML file? This would be very helpful.
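
One way to check this is to list every child of a node together with its node type, rather than assuming that FirstChild is the node you expect. The following is only a sketch on a hypothetical indented fragment; whether whitespace-only text nodes show up at all may depend on the FPC version and parser settings, so no particular output is claimed here:

```pascal
program ChildListSketch;
{$mode objfpc}{$H+}

uses
  Classes, DOM, XMLRead;

const
  // Hypothetical indented fragment; #10 is a line feed.
  SampleXML =
    '<RESPOND ID="BAIS">'#10 +
    '  <RESPOND_TIME>2005.01.20 12:26:58</RESPOND_TIME>'#10 +
    '</RESPOND>';

var
  Doc: TXMLDocument;
  Source: TStringStream;
  Child: TDOMNode;
begin
  Source := TStringStream.Create(SampleXML);
  try
    ReadXMLFile(Doc, Source);
    try
      // Walk all direct children of the root element.
      Child := Doc.DocumentElement.FirstChild;
      while Child <> nil do
      begin
        case Child.NodeType of
          ELEMENT_NODE:
            WriteLn('element: ', Child.NodeName);
          TEXT_NODE:
            WriteLn('text node, length ', Length(Child.NodeValue));
        else
          WriteLn('other node type: ', Child.NodeType);
        end;
        Child := Child.NextSibling;
      end;
    finally
      Doc.Free;
    end;
  finally
    Source.Free;
  end;
end.
```

Code that filters on NodeType = ELEMENT_NODE while walking the siblings should behave the same whether or not the parser keeps such whitespace nodes.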


- Sebastian

Re: getting XML element's value using dom.pp unit

Bee-6
> Perhaps it contains whitespace? It's possible that the XML parser
> creates more than one text node.

Hmmm... it sounds weird that whitespace would (automatically) create another
text node, because that would change the XML structure.

> Do you have an excerpt of your XML file? This would be very helpful.

Here's the complete XML file I was trying to read...

<?xml version="1.0" encoding="UTF-8"?>
<BAIS_XML Version='1.0'>
   <RESPOND ID="BAIS">
     <RESPOND_TIME>2005.01.20 12:26:58</RESPOND_TIME>
   </RESPOND>
   <REQUEST ID="SI1">
     <REQUEST_TIME>2005.01.20 12:26:58</REQUEST_TIME>
   </REQUEST>
   <CONTENT ID="L1">
     <USER ID="simba">
       <GROUP ID="1">Operator bank</GROUP>
       <NAME>Bisma Jayadi</NAME>
       <ALIAS>simba</ALIAS>
       <PASSPORT>FB0woDvKVE4AFQW29a9E</PASSPORT>
     </USER>
     <LOCATION ID="1">
       <CODE>UPPTI.1</CODE>
       <NAME>Komputer Bisma</NAME>
     </LOCATION>
   </CONTENT>
</BAIS_XML>

It was an XML-RPC reply actually, because I'm making an XML-RPC client. I'm able
to get all the attributes' names and values and the elements' names, but not the
elements' values. :(

I'm using Lazarus 0.9.10 (with FPC v.2.0.1) on WinXP SP2 and Linux FC4 (using
KDE) on the same machine, an Intel P4 2.2 GHz with 256 MB memory. My code behaves
the same in both environments.

I'm now inspecting the dom.pp and xmlread.pp sources to find out what's really
going on. But it will take some time to understand the code; meanwhile I need
help from this mailing list.

Thanks.

-Bee-

has Bee.ography at
http://beeography.blogsome.com

Re: getting XML element's value using dom.pp unit

Bee-6
In reply to this post by Tony Pelton
 > i think you are going to find that "my node value" is not the
 > NodeValue() of <anode>, but is the NodeValue() of a _child_ of
 > <anode>.

Thanks Tony... using your information, I've found what the problem is. The
Delphi XML (DOM) component and the FPC DOM unit treat an XML node in different
ways. I used to be a Delphi programmer and am trying to migrate to
Lazarus/FPC. :)

Here's the problem, using the XML I used.

<?xml version="1.0" encoding="UTF-8"?>
<BAIS_XML Version='1.0'>
   <RESPOND ID="BAIS">
     <RESPOND_TIME>2005.01.20 12:26:58</RESPOND_TIME>
   </RESPOND>
   <REQUEST ID="SI1">
     <REQUEST_TIME>2005.01.20 12:26:58</REQUEST_TIME>
   </REQUEST>
   <CONTENT ID="L1">
     <USER ID="simba">
       <GROUP ID="1">Operator bank</GROUP>
       <NAME>Bisma Jayadi</NAME>
       <ALIAS>simba</ALIAS>
       <PASSPORT>FB0woDvKVE4AFQW29a9E</PASSPORT>
     </USER>
     <LOCATION ID="1">
       <CODE>UPPTI.1</CODE>
       <NAME>Komputer Bisma</NAME>
     </LOCATION>
   </CONTENT>
</BAIS_XML>

To get "value" of "RESPOND_TIME" element, in Delphi I use this code...

DELPHI CODE SNIPPET #1:
n := doc.DocumentElement.ChildNodes[0].ChildNodes[0].NodeName;
v := doc.DocumentElement.ChildNodes[0].ChildNodes[0].NodeValue;

which works this way...

DocumentElement points to the "BAIS_XML" element (root), the first ChildNodes[0]
points to the "RESPOND" element, and the next ChildNodes[0] points to the
"RESPOND_TIME" element. So NodeName returns 'RESPOND_TIME' (into the "n" var) and
NodeValue returns '2005.01.20 12:26:58' (into the "v" var). The node's name and
value are stored at the same tree depth.

Following the way Delphi addresses the XML elements, I used a "similar" approach
when coding with FPC's dom.pp unit...

FPC CODE SNIPPET #1:
n := doc.DocumentElement.FirstChild.FirstChild.NodeName;
v := doc.DocumentElement.FirstChild.FirstChild.NodeValue;

where I _assumed_ that DocumentElement points to the "BAIS_XML" element (root),
the first FirstChild points to the "RESPOND" element, and the next FirstChild
points to the "RESPOND_TIME" element. So I expected (as I used to get from
Delphi) the "n" var to hold the 'RESPOND_TIME' string and the "v" var to hold
the '2005.01.20 12:26:58' string. But instead... I got an empty string in the
"v" var.

After I change the code into this...

FPC CODE SNIPPET #2:
n := doc.DocumentElement.FirstChild.FirstChild.NodeName;
v := doc.DocumentElement.FirstChild.FirstChild.FirstChild.NodeValue;

I got everything I wanted! :) The "n" var is filled with 'RESPOND_TIME' and the
"v" var is filled with '2005.01.20 12:26:58'. :)

When I try this code...

FPC CODE SNIPPET #3:
n := doc.DocumentElement.FirstChild.FirstChild.FirstChild.NodeName;
v := doc.DocumentElement.FirstChild.FirstChild.FirstChild.NodeValue;

I got the '#text' string in the "n" variable. This confirms Tony's information
that the node value I'm looking for is actually in a child node (with "#text" as
the node's name). :)
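
The extra FirstChild step from snippets #2 and #3 can be wrapped in a small helper. This is only a sketch: ElementText is a hypothetical name, not something dom.pp provides, and the document is a cut-down version of the one above. It collects the values of all direct text (and CDATA) children and returns an empty string when there are none:

```pascal
program ElementTextSketch;
{$mode objfpc}{$H+}

uses
  Classes, DOM, XMLRead;

const
  // Cut-down version of the document from this thread.
  SampleXML =
    '<RESPOND ID="BAIS"><RESPOND_TIME>2005.01.20 12:26:58</RESPOND_TIME></RESPOND>';

// Hypothetical helper: concatenates the direct text children of Elem.
function ElementText(Elem: TDOMNode): DOMString;
var
  Child: TDOMNode;
begin
  Result := '';
  if Elem = nil then
    Exit;
  Child := Elem.FirstChild;
  while Child <> nil do
  begin
    if (Child.NodeType = TEXT_NODE) or
       (Child.NodeType = CDATA_SECTION_NODE) then
      Result := Result + Child.NodeValue;
    Child := Child.NextSibling;
  end;
end;

var
  Doc: TXMLDocument;
  Source: TStringStream;
  Node: TDOMNode;
begin
  Source := TStringStream.Create(SampleXML);
  try
    ReadXMLFile(Doc, Source);
    try
      Node := Doc.DocumentElement.FirstChild; // the RESPOND_TIME element
      WriteLn(Node.NodeName, ' = ', ElementText(Node));
    finally
      Doc.Free;
    end;
  finally
    Source.Free;
  end;
end.
```

Newer FPC releases also appear to offer a TextContent property on TDOMNode that serves a similar purpose; that has not been checked against FPC 2.0.1, so treat it as an assumption.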

Interestingly... the _correct_ FPC code (#2) is also accepted and works well in
Delphi! And FPC code #3, run in Delphi, also fills the "n" var with '#text'. :)

As an ex-Delphi programmer... the FPC dom.pp unit's behavior is pretty weird in
my opinion. With this experience, I learned a new lesson about XML. Thank you,
Tony. :)

But then I question the _correct_ DOM framework itself. Why does an element have
to have a "#text" node when it is obviously invisible within the document? With
this framework, it is logically possible to have multiple values/children
within a single element. Say the first child is "#text" and the second is
"#image". But how would we write (and distinguish) them in the file when the
"#text" node (and the "#image") is invisible? To me, the "#text" node is useless
and confusing!
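
On how several children of one element are written in the file: in XML this is ordinary mixed content, where text and child elements alternate inside a single parent. The sketch below uses a hypothetical NOTE element for illustration and prints the name of each child; text runs should appear as "#text" nodes in between the named element:

```pascal
program MixedContentSketch;
{$mode objfpc}{$H+}

uses
  Classes, DOM, XMLRead;

const
  // Hypothetical mixed-content element: text, a child element, more text.
  SampleXML = '<NOTE>see <B>page 3</B> for details</NOTE>';

var
  Doc: TXMLDocument;
  Source: TStringStream;
  Child: TDOMNode;
begin
  Source := TStringStream.Create(SampleXML);
  try
    ReadXMLFile(Doc, Source);
    try
      Child := Doc.DocumentElement.FirstChild;
      while Child <> nil do
      begin
        if Child.NodeType = TEXT_NODE then
          // Text runs are separate nodes whose NodeValue holds the text.
          WriteLn(Child.NodeName, ': [', Child.NodeValue, ']')
        else
          WriteLn(Child.NodeName);
        Child := Child.NextSibling;
      end;
    finally
      Doc.Free;
    end;
  finally
    Source.Free;
  end;
end.
```

Because the text before and after the B element belongs to the same parent, a single NodeValue on the element could not represent it, which is one reason the DOM keeps text in child nodes of its own.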

Interestingly, the Delphi DOM component overcomes the confusion by allowing
access to a node's value at the same depth as its name. It simply converts the
XML document structure into the equivalent tree hierarchy based on the _visible_
text. Delphi makes the _invisible_ nodes invisible, but still accessible. No
matter which "framework" you're using, the _correct_ one or the visible document
tree hierarchy, Delphi DOM works as you expect. IMO, the Delphi way is a lot
simpler, clearer, and more straightforward. :)

-Bee-

has Bee.ography at
http://beeography.blogsome.com

PS: Sorry for the long email... I'm really excited about this topic because I
also have lots of XML experience in Delphi, errr... using the Delphi "way". :)

Re: getting XML element's value using dom.pp unit

Tony Pelton
On 12/8/05, Bisma Jayadi <[hidden email]> wrote:
>  > i think you are going to find that "my node value" is not the
>  > NodeValue() of <anode>, but is the NodeValue() of a _child_ of
>  > <anode>.
>
> Thanks Tony... using your information, I've found what the problem is. It's a
> different way of understanding XML node, between a Delphi XML (DOM) component
> and a FPC DOM unit. I used to be a Delphi programmer, trying to migrate to be a
> Lazarus/FPC programmer. :)
.
> I got '#text' string on "n" variable. This confirms Tony's information that
> node value that I'm looking for is actually a child node (with "#text" as the
> node's name). :)

i'm not sure ... but i think that having the "#text" child node is per
the spec :

http://www.w3.org/XML/

in any event, as i said, every Java DOM parser i've worked with does
it that way.

glad to have helped.

> -Bee-

Tony

Re: getting XML element's value using dom.pp unit

Felipe Monteiro de Carvalho
In reply to this post by Bee-6
Hi,

On 12/8/05, Bisma Jayadi <[hidden email]> wrote:
> Thanks Tony... using your information, I've found what the problem is. It's a
> different way of understanding XML node, between a Delphi XML (DOM) component
> and a FPC DOM unit. I used to be a Delphi programmer, trying to migrate to be a
> Lazarus/FPC programmer. :)

Can you improve the already existing article about XML on Lazarus documentation?

It is located here: http://wiki.lazarus.freepascal.org/index.php/Networking

In the existing article, I think I put some code which tries to get
the NodeValue, but makes the same mistake you were making initially.

This way others who try to use XML DOM on Free Pascal will benefit
from your experience =P

--
Felipe Monteiro de Carvalho

Re: getting XML element's value using dom.pp unit

Bee-6
> Can you improve the already existing article about XML on Lazarus documentation?

I'll give it a try. Does it need some kind of approval from the wiki admin to
submit an article?

> This way others who try to use XML DOM on Free Pascal will benefit
> from your experience =P

Yup... let's not have other Delphi developers go through the same experience I
did. It's really annoying when you don't have any clue how it could happen. :D

-Bee-

has Bee.ography at
http://beeography.blogsome.com



Re: getting XML element's value using dom.pp unit

Felipe Monteiro de Carvalho
On 12/10/05, Bisma Jayadi <[hidden email]> wrote:
> I'll give it a try. Does it need some kind of approval from the wiki admin to
> submit an article?

No. Just hit the "Edit" button and edit the space reserved for XML.

--
Felipe Monteiro de Carvalho