XML stands for EXtensible Markup Language and is a simplified subset of Standard Generalized Markup Language (SGML). Its primary purpose is to facilitate the sharing of data across different information systems, particularly systems connected via the Internet.

RSS is a Web content syndication format. Its name is an acronym for Really Simple Syndication. In other words, RSS is a lightweight XML format designed for sharing headlines and other Web content. More details can be found in the RSS 2.0 specification.

Very often people want to read RSS files and display the content on their site using a custom layout. This article is a guide to the entire process of parsing RSS 2.0 files using PHP.

Requirements:

To test the code in this tutorial we need a web server (I am using Apache) configured with support for PHP. You can find lots of articles and tutorials on the web on how to install Apache and PHP.

Available methods for parsing an XML file:

Developers commonly use one of two methods to read XML files, no matter what the programming language might be: SAX (Simple API for XML) and DOM (Document Object Model). I will briefly describe each of these methods and then choose the one that suits us best.

SAX (Simple API for XML) is an event-based API. Every time a tag is opened or closed, or any time the parser finds some text, it makes a callback to a user-defined function for that event, passing the node or text information.

The advantage of a SAX parser is that it is really lightweight. The parser doesn't keep anything in memory for very long, so it can be used for extremely large files. The disadvantage is that writing the SAX event functions can take some time and coding experience.

The DOM (Document Object Model) defines a standard way of accessing and manipulating XML documents. The DOM presents an XML document as a tree structure (a node tree), with the elements, attributes, and text defined as nodes.

An API implementing the DOM standard reads the entire XML document into memory and provides a set of functions for manipulating the data. The drawback of this powerful method is that it is not recommended for large XML documents, which would take too much memory to build the model of the document.

Because people are usually dealing with normal-size files, and not everybody has the time or skills to write an entire set of SAX handlers, we'll use the DOM method.

So let's get started.

As an RSS example we'll use the following file. A part of this file is given below.

    SoftArea51 - latest XML & CSS Utilities software for Windows
    Try and buy latest XML & CSS Utilities software for Windows
    en-us
    SoftArea51 - latest XML & CSS Utilities software for Windows
    Try and buy latest XML & CSS Utilities software for Windows
    Feed Mix
    Feed Mix is a feature-rich RSS editor with the unique ability to create a new RSS feed from several others that already exist...
    RSS Submit
    RSS Submit is the most powerful RSS feed promotion tool available...
    PAD-Script
    Avoid having to update all your PAD files whenever the PAD format changes...
    PAD Data Extractor Tool
    Data Doctor XML PAD information extractor software tools extract important data from online website XML file...

To get the useful data from the RSS file we need to loop through the item nodes and extract the information we need.

Below you can find the script for parsing the above RSS feed:

    <?php
    // Load the RSS file into a DOM tree. 'feed.xml' is a placeholder path; use your own feed.
    $doc = new DOMDocument();
    $doc->load('feed.xml');
    $arrFeeds = array();
    // Each <item> becomes one entry in $arrFeeds.
    foreach ($doc->getElementsByTagName('item') as $node) {
        array_push($arrFeeds, array(
            'title' => $node->getElementsByTagName('title')->item(0)->nodeValue,
            'description' => $node->getElementsByTagName('description')->item(0)->nodeValue,
            'link' => $node->getElementsByTagName('link')->item(0)->nodeValue,
            // Assumes the date is stored in the RSS 2.0 pubDate element.
            'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue
        ));
    }
    ?>

The script starts by creating a new DOMDocument object and loading the RSS file into that object using the load method. After that, the script uses the getElementsByTagName method to get a list of all of the elements with the given name (in our case 'item').

Within the loop over the item nodes, the script uses the getElementsByTagName method again to get the nodeValue for the title, description, link and date tags. The nodeValue is the text within the node. An array is used to store each set of values, and each such array is one entry in the big array that holds our structured RSS data.

As you can see, the job was easy enough. All the data is now held by the $arrFeeds array; it is well structured and you can display it using the desired layout.
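To make the DOM approach described in this article easy to try without a live feed, here is a minimal, self-contained sketch. It parses an inline sample and renders the items as an HTML list. The sample feed (including its example.com links and the date), and the helper names childText, parseRssItems and renderItems, are assumptions made for illustration and are not part of the original script. The scalar type hints assume PHP 7 or later.

```php
<?php
// Hypothetical sample feed; links and dates are invented for illustration.
$sampleRss = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Sample channel</description>
    <item>
      <title>Feed Mix</title>
      <description>An RSS editor &amp; mixer</description>
      <link>http://example.com/feed-mix</link>
      <pubDate>Mon, 01 Jan 2007 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>RSS Submit</title>
      <description>A feed promotion tool</description>
      <link>http://example.com/rss-submit</link>
    </item>
  </channel>
</rss>
XML;

// Return the text of the first descendant element named $tag, or '' if absent.
function childText(DOMElement $node, string $tag): string
{
    $list = $node->getElementsByTagName($tag);
    return $list->length > 0 ? trim($list->item(0)->nodeValue) : '';
}

// Build one associative array per <item>, as the article's script does.
function parseRssItems(string $xml): array
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return [];
    }
    $items = [];
    foreach ($doc->getElementsByTagName('item') as $node) {
        $items[] = [
            'title'       => childText($node, 'title'),
            'description' => childText($node, 'description'),
            'link'        => childText($node, 'link'),
            'date'        => childText($node, 'pubDate'),
        ];
    }
    return $items;
}

// Render the items as a simple HTML list, escaping all feed text.
function renderItems(array $items): string
{
    $html = "<ul>\n";
    foreach ($items as $item) {
        $html .= sprintf(
            "  <li><a href=\"%s\">%s</a>: %s</li>\n",
            htmlspecialchars($item['link']),
            htmlspecialchars($item['title']),
            htmlspecialchars($item['description'])
        );
    }
    return $html . "</ul>\n";
}

$arrFeeds = parseRssItems($sampleRss);
echo renderItems($arrFeeds);
```

Checking the node list length before calling item(0) means an item without a date (like the second one here) yields an empty string instead of an error. Escaping with htmlspecialchars is a sensible default whenever text from a feed you do not control is placed into a page.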
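For comparison, the event-based SAX style described earlier can be sketched with PHP's XML Parser extension (xml_parser_create and its handler functions). This is only an illustration of the callback model, not the method the article uses. The function name saxItemTitles and the tiny inline feed are hypothetical, and the sketch assumes PHP 7 or later with the xml extension enabled.

```php
<?php
// Collect the <title> of every <item> using event callbacks instead of a tree.
function saxItemTitles(string $xml): array
{
    $titles  = [];
    $inItem  = false;  // true while the parser is inside an <item>
    $inTitle = false;  // true while inside that item's <title>
    $buffer  = '';

    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Called for every opening tag.
        function ($p, $name, $attrs) use (&$inItem, &$inTitle, &$buffer) {
            if ($name === 'item') {
                $inItem = true;
            } elseif ($inItem && $name === 'title') {
                $inTitle = true;
                $buffer = '';
            }
        },
        // Called for every closing tag.
        function ($p, $name) use (&$titles, &$inItem, &$inTitle, &$buffer) {
            if ($name === 'title' && $inTitle) {
                $titles[] = trim($buffer);
                $inTitle = false;
            } elseif ($name === 'item') {
                $inItem = false;
            }
        }
    );

    // Text can arrive in several chunks, so append rather than assign.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$inTitle, &$buffer) {
            if ($inTitle) {
                $buffer .= $data;
            }
        }
    );

    xml_parse($parser, $xml, true);
    return $titles;
}

// Hypothetical two-item feed; the channel title must not be collected.
$xml = '<rss version="2.0"><channel><title>Example feed</title>'
     . '<item><title>Feed Mix</title></item>'
     . '<item><title>RSS Submit</title></item>'
     . '</channel></rss>';

$titles = saxItemTitles($xml);
print_r($titles);
```

Even for this small task the handlers have to track state by hand (the $inItem and $inTitle flags), which is the extra effort the article refers to when it chooses DOM for normal-size files.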