Website mashup with Apache Camel

July 22, 2011 Posted by jbonofre

Mashup ?

You are browsing on some websites and you see an interesting information, that you want to poll to be used into your system.

Unfortunately, you don’t know the website provider, and you don’t know if a “plug” is provided, for instance a WebService.

So you have to find a way to get the information.

Using Camel

You can create a Camel route looking like:

  <route>
    <from uri="timer:fire?period=2000"/>
    <setHeaders>
        <constant>POST</constant>
    </setHeaders>
    <to uri="http:blog.nanthrax.net?param=value"/>
    <unmarshal>
        <tidyMarkup/>
    </unmarshal>
    <setBody><xpath>//span[@class='date']</xpath></setBody>
    <to uri="log:blog"/>
  </route>

Here, every 2 seconds, we access to blog.nanthrax.net to get the HTML source. We can eventually provide some parameters (with POST method here).
On this HTML, we use tidy markup (from camel-tagsoup component) to cleanup the HTML code and format it in XML.
After that, we extract from the source, only the content of element.
Finally, we send the filtered content into the log.

About jbonofre

ASF Member, PMC for Apache Karaf, PMC for Apache ServiceMix, PMC for Apache ACE, PMC for Apache Syncope, Committer for Apache ActiveMQ, Committer for Apache Archiva, Committer for Apache Camel, Contributor for Apache Falcon Twitter: jbonofre IRC: jbonofre on #servicemix,#karaf,#camel,#cxf on Freenode

Leave a Reply