How about the parser??

Yes, thank you to all here.

Actually, I got the source (data) from one site, www.moreover.com (http://www.moreover.com).

They give me an XML-formatted file which provides a lot of daily updated news links.

Now I need some program, which they call a parser, to get the source and print it out as a normal HTML file.

MFK,

You can find a parser almost anywhere. Do you use Perl? Then use XML::Parser (search.cpan.org) to parse the XML. Did moreover.com (for syndicated news content, also consider mediaxpress.com or isyndicate.com) give you an XML schema or DTD to tell you what the data will look like? Probably, and if so, you can feed that into XML::Parser and use XML::Grove or something to create a forest of data.
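
XML::Parser is built on the expat library and works by calling your handlers as elements open, close, and deliver text. As a rough sketch of that event-driven style (in Python here, whose `xml.parsers.expat` module wraps the same library), the code below collects one record per article. The tiny `SAMPLE` feed, its element names, and the `collect_articles` helper are all made up for illustration.

```python
import xml.parsers.expat

# Hypothetical one-article feed; element names are assumed for illustration.
SAMPLE = """<?xml version="1.0"?>
<news>
  <article id="a1">
    <url>http://example.com/story</url>
    <headline_text>Example headline</headline_text>
  </article>
</news>"""


def collect_articles(xml_text):
    """Return one dict per <article>, built from parser events."""
    articles = []
    current = None  # dict for the article being read, if any
    field = None    # name of the child element being read, if any

    def start(name, attrs):
        nonlocal current, field
        if name == "article":
            current = {"id": attrs.get("id")}
        elif current is not None:
            field = name
            current[field] = ""

    def chars(data):
        # Text may arrive in several chunks, so append rather than assign.
        if current is not None and field is not None:
            current[field] += data

    def end(name):
        nonlocal current, field
        if name == "article":
            articles.append(current)
            current = None
        elif name == field:
            field = None

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return articles
```

Calling `collect_articles(SAMPLE)` gives a list with a single dict holding the `id`, `url`, and `headline_text` of the article. Nested markup inside a field is not handled in this sketch.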

enjoy,



------------------
<anu />

Sorry for the late reply.

Yes, they do give an XML file, like the one below:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE moreovernews SYSTEM "http://p.moreover.com/xml_dtds/moreovernews.dtd">
<moreovernews>
<article id="_7821835">
<url>http://c.moreover.com/click/here.pl?x7821834</url>
<headline_text>Block Management Changes in Wake of Founder's Retirement</headline_text>
<source>Electronic Accountant</source>
<media_type>text</media_type>
<cluster>Accounting news</cluster>
<tagline> </tagline>
<document_url>http://www.electronicaccountant.com/html/newswire/newswire.htm</document_url>
<harvest_time>Jun 24 2000 6:33AM</harvest_time>
<access_registration> </access_registration>
<access_status> </access_status>
</article>
<article id="_7821837">
<url>http://c.moreover.com/click/here.pl?x7821836</url>
<headline_text>IRS Tax Shelter Regs Could Harm Small Practitioners</headline_text>
<source>Electronic Accountant</source>
<media_type>text</media_type>
<cluster>Accounting news</cluster>
<tagline> </tagline>
<document_url>http://www.electronicaccountant.com/html/newswire/newswire.htm</document_url>
<harvest_time>Jun 24 2000 6:33AM</harvest_time>
<access_registration> </access_registration>
<access_status> </access_status>
</article>
</moreovernews>

But now I need to use PHP to parse that XML file, and I don't know how. Can you give me some ideas?

Look at the post below, 'XML parser for devshednews.rdf'. It's in PHP; look at how it works and you should soon be on your way. The only thing is that I am not so hot with regex, so it could probably be optimised. Any help appreciated.
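
For the feed posted above, the job comes down to reading each `<article>` and writing out a link. Here is a minimal sketch of that, in Python rather than PHP, using the standard library's `xml.etree.ElementTree` instead of regular expressions. The shortened `FEED` constant and the `feed_to_html` name are made up for illustration, and the choice of an HTML list as output is an assumption about what a "normal HTML file" should look like.

```python
import html
import xml.etree.ElementTree as ET

# Shortened copy of the feed posted above (one article, fewer fields).
FEED = """<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE moreovernews SYSTEM "http://p.moreover.com/xml_dtds/moreovernews.dtd">
<moreovernews>
<article id="_7821835">
<url>http://c.moreover.com/click/here.pl?x7821834</url>
<headline_text>Block Management Changes in Wake of Founder's Retirement</headline_text>
<source>Electronic Accountant</source>
<harvest_time>Jun 24 2000 6:33AM</harvest_time>
</article>
</moreovernews>"""


def feed_to_html(feed_text):
    """Build an HTML list with one linked headline per <article>."""
    root = ET.fromstring(feed_text)
    items = []
    for article in root.findall("article"):
        # Escape everything taken from the feed before placing it in HTML.
        url = html.escape(article.findtext("url", default=""), quote=True)
        headline = html.escape(article.findtext("headline_text", default=""), quote=False)
        source = html.escape(article.findtext("source", default=""), quote=False)
        items.append(f'<li><a href="{url}">{headline}</a> ({source})</li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    print(feed_to_html(FEED))
```

In PHP, the expat-based functions (`xml_parser_create()`, `xml_set_element_handler()`, `xml_set_character_data_handler()`, `xml_parse()`) work in an event-driven style rather than building a tree, so the loop above would become a set of handlers that remember which element they are inside.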

------------------
Simon Wheeler
FirePages -DHTML/PHP/MySQL