JAVA SCRIPT - Extracting Pertinent Information from an XML Tree - Supercoders | Web Development and Design | Tutorial for Java, PHP, HTML, Javascript

Thursday, January 3, 2019

Extracting Pertinent Information from an XML Tree


Problem

You want to access individual pieces of data from an XML document. 

Solution

Use the same DOM methods you use to query your web page elements to query the XML document. For example, the following will get all elements that have a resource tag name:

var resources = xmlHttpObj.responseXML.getElementsByTagName("resource");

Explanation

When you have a reference to an XML document, you can use the DOM methods to query any of the data in the document. It’s not as simple as accessing data from a JSON object, but it’s vastly superior to extracting data from a large piece of just plain text. 

To demonstrate working with an XML document, the following listing contains a Node.js (commonly referred to simply as Node) application that returns XML containing three resources. Each resource contains a title and a url. It’s not a complicated application or a complex XML result, but it’s sufficient to generate an XML document. Notice that a MIME type of application/xml is given in the header, and the Access-Control-Allow-Origin header value is set to accept queries from all domains (*). Because the Node application is running at a different port than the web page querying it, we have to set this value in order to allow cross-domain requests.

Node.js server application that returns an XML result


var http = require('http');
var XMLWriter = require('xml-writer');

// start server, listen for requests
var server = http.createServer().listen(8080);

server.on('request', function(req, res) {
  var xw = new XMLWriter();

  // start doc and root element
  xw.startDocument().startElement("resources");

  // resource
  xw.startElement("resource");
  xw.writeElement("title", "Ecma-262 Edition 6");
  xw.writeElement("url",
    "http://wiki.ecmascript.org/doku.php?id=harmony:specification_drafts");
  xw.endElement();

  // resource
  xw.startElement("resource");
  xw.writeElement("title", "ECMA-262 Edition 5.1");
  xw.writeElement("url",
    "http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf");
  xw.endElement();

  // resource
  xw.startElement("resource");
  xw.writeElement("title", "ECMA-402");
  xw.writeElement("url",
    "http://ecma-international.org/ecma-402/1.0/ECMA-402.pdf");
  xw.endElement();

  // end resources
  xw.endElement();

  // send the XML with its MIME type and allow cross-domain requests
  res.writeHead(200, {"Content-Type": "application/xml",
    "Access-Control-Allow-Origin": "*"});
  res.end(xw.toString(), "utf8");
});
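
To make the shape of the returned document easier to picture, here is a minimal sketch that builds the same kind of structure with plain string concatenation. It is not how xml-writer works internally, and the exact output of xml-writer may differ in details such as the XML declaration. The helper names escapeXml and buildResourcesXml are hypothetical.

```javascript
// Hypothetical helper: escape the characters that are special in XML text.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: build a <resources> document from an array of
// { title, url } objects, with no whitespace between elements.
function buildResourcesXml(items) {
  var xml = '<?xml version="1.0"?>\n<resources>';
  for (var i = 0; i < items.length; i++) {
    xml += "<resource>" +
      "<title>" + escapeXml(items[i].title) + "</title>" +
      "<url>" + escapeXml(items[i].url) + "</url>" +
      "</resource>";
  }
  return xml + "</resources>";
}

// One of the three resources from the server application.
var sample = buildResourcesXml([
  { title: "ECMA-402",
    url: "http://ecma-international.org/ecma-402/1.0/ECMA-402.pdf" }
]);
console.log(sample);
```

Each resource element holds exactly two children, title and url, which is the structure the client code later relies on.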


Most Ajax calls process plain text or JSON, but there’s still a need for processing XML. SVG is still XML, as are MathML, XHTML, and other markup languages.

In the solution, a new XMLHttpRequest object is created to handle the client-server communication. If you’ve not used Ajax previously, the XMLHttpRequest object’s methods are:

• open: Initializes a request. Parameters include the method (GET, POST, DELETE, or PUT), the request URL, whether the request is asynchronous, and a possible username and password. By default, all requests are sent asynchronously.
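• setRequestHeader: Sets an HTTP request header, such as Content-Type to give the MIME type of the request body.
• send: Sends the request.
• sendAsBinary: Sends binary data. This was a nonstandard, Firefox-only method that has since been removed; send() itself accepts binary data such as a Blob or ArrayBuffer.
• abort: Aborts an already sent request.
• getResponseHeader: Retrieves the value of the named response header, or null if the response hasn’t been returned yet or there is no such header.
• getAllResponseHeaders: Retrieves all of the response headers as a single string.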

The communication is opened using the object’s open() method, passing in the HTTP method (GET), the request URL (the Node application), as well as a value of true, signaling that the communication is asynchronous (the application doesn’t block waiting on the response). If the application is password protected, the fourth and fifth optional parameters are the username and password, respectively.

I know that the application I’m calling is returning an XML-formatted response, so it’s not necessary to override the MIME type. In the application, the XMLHttpRequest’s onreadystatechange event handler is assigned a callback function, getData(), and then the request is sent with send(). If the HTTP method had been POST, the prepared data would have been sent as a parameter of send().
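
The open, assign handler, send sequence can be wrapped in a small function. This is only a sketch: requestXml and its XhrCtor parameter are hypothetical names, and the constructor parameter exists purely so the function can be exercised outside a browser with a stand-in object.

```javascript
// Sketch only: `requestXml` and `XhrCtor` are hypothetical names.
// url       - the address to query
// onSuccess - called with the parsed XML document on a 200 response
// XhrCtor   - optional constructor to use instead of XMLHttpRequest
function requestXml(url, onSuccess, XhrCtor) {
  // Fall back to the browser's XMLHttpRequest when no constructor is passed.
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();

  // GET request, asynchronous (third argument is true)
  xhr.open("GET", url, true);

  // Only hand the document over once the request has fully completed.
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      onSuccess(xhr.responseXML);
    }
  };

  xhr.send();
  return xhr;
}
```

In a browser, a call such as requestXml("http://shelleystoybox.com:8080", function (doc) { ... }) would pass the XML document to the callback once the response arrives.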

In the callback function getData(), the XMLHttpRequest object’s readyState and status properties are checked. Only when the readyState is 4 and status is 200 is the result processed.

The readyState indicates what state the Ajax call is in, and the value of 200 is the HTTP OK response code. Because we know the result is XML, the application accesses the XML document via the XMLHttpRequest object’s responseXML property.
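
As a quick reference, the five readyState values run from 0 to 4, and 4 means the request is done. The sketch below names them and repeats the readyState/status test as a function; describeState and isReadyForProcessing are hypothetical helper names, not part of XMLHttpRequest.

```javascript
// Names of the readyState values, indexed by their numeric value (0 to 4).
var READY_STATES = ["UNSENT", "OPENED", "HEADERS_RECEIVED", "LOADING", "DONE"];

// Hypothetical helper: human-readable name for the current state.
function describeState(xhr) {
  return READY_STATES[xhr.readyState] || "UNKNOWN";
}

// Hypothetical helper: true only when the request has completed (4)
// and the server answered with HTTP 200 OK.
function isReadyForProcessing(xhr) {
  return xhr.readyState === 4 && xhr.status === 200;
}
```

Because the callback runs on every state change, the handler is invoked several times per request, and a check like this keeps the processing from running before the response has fully arrived.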

For other data types, the data is accessed via the response property, and responseType provides the data type (arraybuffer, blob, document, json, text). Not all browsers support all data types, but all modern browsers do support XML and at least arraybuffer, json, and text.

Application to process resources from returned XML 


<!DOCTYPE html>
<html>
<head>
  <title>Stories</title>
  <meta charset="utf-8" />
</head>
<body>
  <div id="result">
  </div>
<script type="text/javascript">
  // ajax object
  var xmlRequest;
  if (window.XMLHttpRequest) {
    xmlRequest = new XMLHttpRequest();
  }

  // build request
  var url = "http://shelleystoybox.com:8080";
  xmlRequest.open('GET', url, true);
  xmlRequest.onreadystatechange = getData;
  xmlRequest.send();

  function getData() {
    if (xmlRequest.readyState == 4 && xmlRequest.status == 200) {
      try {
        var result = document.getElementById("result");
        var str = "<p>";

        // can use DOM methods on XML document
        var resources =
          xmlRequest.responseXML.getElementsByTagName("resource");

        // process resources
        for (var i = 0; i < resources.length; i++) {
          var resource = resources[i];

          // get title and url, generate HTML
          var title = resource.childNodes[0].firstChild.nodeValue;
          var url = resource.childNodes[1].firstChild.nodeValue;
          str += "<a href='" + url + "'>" + title + "</a><br />";
        }

        // finish HTML and insert
        str += "</p>";
        result.innerHTML = str;
      } catch (e) {
        console.log(e.message);
      }
    }
  }
</script>
</body>
</html>

When processing the XML code, the application first queries for all resource elements, returned in a NodeList. The application cycles through the collection, accessing each resource element in order to access the title and url, both of which are child nodes. Each is accessed via the childNodes collection, and their data, contained in the nodeValue property of each element’s text node, is extracted.
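
One caveat with index-based access through childNodes: if the XML contains whitespace or line breaks between elements, those show up as text nodes and shift the indexes. The following is a minimal sketch of one way to guard against that by keeping only element nodes; elementChildren is a hypothetical helper name.

```javascript
// Numeric value of Node.ELEMENT_NODE in the DOM.
var ELEMENT_NODE = 1;

// Hypothetical helper: return only the element children of a node,
// skipping whitespace text nodes, comments, and so on.
function elementChildren(node) {
  var result = [];
  for (var i = 0; i < node.childNodes.length; i++) {
    var child = node.childNodes[i];
    if (child.nodeType === ELEMENT_NODE) {
      result.push(child);
    }
  }
  return result;
}
```

With this in place, elementChildren(resource)[0] would be the title element and elementChildren(resource)[1] the url element, whether or not the XML is indented.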

The resource data is used to build a string of linked resources, which is output to the page using innerHTML. Instead of using a succession of childNodes collections to walk the tree, I could have used the Selectors API to access all URLs and titles, and then traversed both collections at one time, pulling the paired values from each, in sequence:

var urls = xmlRequest.responseXML.querySelectorAll("resource url");
var titles = xmlRequest.responseXML.querySelectorAll("resource title");
for (var i = 0; i < urls.length; i++) {
  var url = urls[i].firstChild.nodeValue;
  var title = titles[i].firstChild.nodeValue;
  str += "<a href='" + url + "'>" + title + "</a><br />";
}

I could have also used getElementsByTagName against each returned resource element; any XML DOM method that works with the web page works with the returned XML. The try…catch error handling should catch any query that fails because the XML is incomplete.
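
Here is a rough sketch of that getElementsByTagName variant, pulling the data into plain objects before any HTML is generated. The names childText and extractResources are hypothetical, and the sketch assumes each resource holds at most one title and one url.

```javascript
// Hypothetical helper: text of the first descendant of `parent` with the
// given tag name, or null if there is no such element.
function childText(parent, tagName) {
  var matches = parent.getElementsByTagName(tagName);
  if (matches.length === 0) {
    return null;
  }
  return matches[0].textContent;
}

// Hypothetical helper: turn the XML document into an array of
// { title, url } objects, one per resource element.
function extractResources(xmlDoc) {
  var resources = xmlDoc.getElementsByTagName("resource");
  var list = [];
  for (var i = 0; i < resources.length; i++) {
    list.push({
      title: childText(resources[i], "title"),
      url: childText(resources[i], "url")
    });
  }
  return list;
}
```

Returning null for a missing element, rather than letting the lookup throw, makes an incomplete resource something the calling code can test for directly. In the application, extractResources(xmlRequest.responseXML) would replace the childNodes lookups inside getData().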
