Portable social network profile parser (Beta 0.1.1)

This parser combines the ufXtract microformats parser with a spider that follows rel="me" links. It returns two main collections of data: all the rel="me" links, and any hCard-XFN patterns it finds. Each collection item is given an additional sourceurl attribute recording the page it was found on. You can restrict the spider to a single domain or let it spider across the web. Currently there are limits on the number of pages that will be parsed.
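
To illustrate the kind of link discovery described above, here is a minimal Python sketch. It is not the parser's own code. The names RelMeCollector, rel_me_links and same_domain_only are hypothetical, and it assumes the "single domain" restriction means an exact host match, which may differ from what the real spider does.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class RelMeCollector(HTMLParser):
    """Collects href values of <a> and <link> elements whose rel contains "me"."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel is a space-separated token list, e.g. "friend met" or "me".
        rel_tokens = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "me" in rel_tokens and href:
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def rel_me_links(html, base_url, same_domain_only=False):
    """Return rel="me" links in html; optionally keep only links on base_url's host."""
    collector = RelMeCollector(base_url)
    collector.feed(html)
    if not same_domain_only:
        return collector.links
    # Assumption: "single domain" is treated as an exact host match.
    host = urlparse(base_url).netloc
    return [u for u in collector.links if urlparse(u).netloc == host]
```

The sketch only extracts links from one page that has already been fetched. A spider would call it repeatedly on each newly discovered URL until it reaches its page limit.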

This is a piece of research work still under development. If you have any comments or want to point out issues, please email info.backnetwork.com

Updated

29-Nov-07
Added support for the concept of representative-hcard. I have extended the idea to cover the parsing of multiple pages: you will often find multiple representative hCards in the output, but there will only ever be one per URL. Also added support for pages encoded as ISO-8859-1.

 

Example XML output
<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<ufxtract>
  <me sourceurl="http://www.glennjones.net/">
    <text>Glenn Jones</text>
    <link>http://www.glennjones.net/about</link>
  </me>
  <me sourceurl="http://www.glennjones.net/about/">
    <text>Twitter</text>
    <link>http://twitter.com/glennjones</link>
  </me>
  <vcard sourceurl="http://www.glennjones.net/about/" representativehcard="true">
    <fn>Glenn Jones</fn>
    <n>
      <given-name>Glenn</given-name>
      <family-name>Jones</family-name>
    </n>
    <url>http://www.glennjones.net/</url>
    <org>Madgex</org>
    <role>Creative Director</role>
    <xfn>
      <text>Glenn Jones</text>
      <link>http://www.glennjones.net/about/</link>
      <rel>me</rel>
    </xfn>
  </vcard>
  <vcard sourceurl="http://www.glennjones.net/about/">
    <fn>Jeremy Keith</fn>
    <n>
      <given-name>Jeremy</given-name>
      <family-name>Keith</family-name>
    </n>
    <xfn>
      <text>Jeremy Keith</text>
      <link>http://adactio.com/journal/</link>
      <rel>friend met</rel>
    </xfn>
  </vcard>
  <report>
    <url status="200" millisec="109">http://www.glennjones.net/</url>
    <url status="200" millisec="179">http://www.glennjones.net/about/</url>
    <found>4</found>
  </report>
</ufxtract>
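
As a rough sketch of how a client might consume output shaped like the example above, the following Python uses the standard library's xml.etree.ElementTree. The function name summarise and the dictionary keys are hypothetical. The element and attribute names (me, vcard, fn, xfn, rel, sourceurl, representativehcard) are taken from the example, and the sketch assumes the response is well-formed XML.

```python
import xml.etree.ElementTree as ET


def summarise(xml_text):
    """Split a ufxtract response into rel="me" links and hCard summaries."""
    root = ET.fromstring(xml_text)

    # Each <me> element carries the page it was found on in sourceurl.
    me_links = [
        {
            "source": me.get("sourceurl"),
            "text": me.findtext("text"),
            "link": me.findtext("link"),
        }
        for me in root.findall("me")
    ]

    cards = []
    for vcard in root.findall("vcard"):
        cards.append({
            "source": vcard.get("sourceurl"),
            "fn": vcard.findtext("fn"),
            # The attribute is only present on representative hCards.
            "representative": vcard.get("representativehcard") == "true",
            # XFN values are space-separated, e.g. "friend met".
            "rel": (vcard.findtext("xfn/rel") or "").split(),
        })
    return me_links, cards
```

Filtering the returned cards on the "representative" key gives at most one hCard per source URL, which matches the behaviour described in the update note.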

 

Example XML error
<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<ufxtract>
  <errors>
    <error>
      <msg>The remote name could not be resolved: 'htp'</msg>
      <url>http://htp://www.glennjones.net/</url>
    </error>
  </errors>
</ufxtract>
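
A client would probably want to check for an errors block before reading any results. The sketch below shows one way to do that in Python. The function name collect_errors is hypothetical, and it assumes errors always appear as error elements under errors, as in the example above.

```python
import xml.etree.ElementTree as ET


def collect_errors(xml_text):
    """Return a list of (url, message) pairs; empty when no errors are reported."""
    root = ET.fromstring(xml_text)
    return [
        (error.findtext("url"), error.findtext("msg"))
        for error in root.findall("errors/error")
    ]
```

An empty list means the response contained no errors element, so the me and vcard collections can be read as usual.
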
About
This site showcases some of the experimental work being carried out for backnetwork. By sharing this early work we hope in some way to add to the important technical and architectural discussions about portable social networks.