RDF Parser / Serializer plans

Gregory Williams greg at evilfunhouse.com
Thu Dec 17 18:00:02 CET 2009


On Dec 17, 2009, at 7:09 AM, Toby Inkster wrote:

> These will be called:
> 
> 	RDF::Trine::Parser::RDFa
> 	RDF::Trine::Parser::XRD
> 
> I'll release them together in a CPAN distribution called
> "RDF-Trine-Parser-Extras".

Great. I've been meaning to hook up your RDFa parser to RDF::Query so that doing things like "FROM <foo.html>" would work as expected. This would be a big step in that direction.

> Once version 0.01 of this is out, I plan on porting a large collection
> of parsers I wrote as part of the Swignition project to RDF::Trine.
> These will be added to the RDF-Trine-Parser-Extras package one by one as
> they are ready. These include:
> 
> 	* TriX, including support for XSLT stylesheets
> 	* GRDDL (XSLT and RDF-EASE transformations)
> 	* jsonGRDDL
> 	* eRDF
> 	* OpenURL COinS
> 	* Microformats: adr, figure, geo, hAtom, hAudio, hCalendar,
> 		hCard, hMeasure, hRecipe, hResume, hReview,
> 		rel-enclosure, rel-license, rel-tag, species, XFN,
> 		xoxo
> 
> There will also be an RDF::Trine::Parser::HTML module which parses an
> (X)HTML file and runs a number of relevant parsers on it (RDFa,
> Microformats, eRDF, GRDDL, etc).

Wow. Lots of goodies coming our way!

> And RDF::Trine::Parser::Guess will try to guess the right parser to use
> automagically.

That would be great. Another thing on my todo list is to get a guess parser working in concert with RDF::Query, but not one based solely on "automagic". Ideally, the guess parser should go by MIME type first, and then fall back on some combination of filename or content sniffing.
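To make that ordering concrete, here is a rough sketch of the fallback chain. It is written in Python purely for illustration and is not RDF::Trine code: the lookup tables, the parser names, and the guess_parser and sniff functions are all hypothetical, and the sniffing rules are deliberately naive.

```python
import os

# Hypothetical lookup tables. The parser names are illustrative labels,
# not real RDF::Trine parser identifiers.
MIME_TO_PARSER = {
    "application/rdf+xml": "rdfxml",
    "text/turtle": "turtle",
    "application/xhtml+xml": "rdfa",
    "text/html": "rdfa",
}
EXT_TO_PARSER = {
    ".rdf": "rdfxml",
    ".ttl": "turtle",
    ".nt": "ntriples",
    ".html": "rdfa",
    ".xhtml": "rdfa",
}


def sniff(content):
    """Very naive content sniffing: look only at the start of the document."""
    head = content[:1024].lstrip()
    if "<rdf:RDF" in head:
        return "rdfxml"
    if head.startswith("@prefix") or head.startswith("@base"):
        return "turtle"
    if "<html" in head.lower():
        return "rdfa"
    return None


def guess_parser(mime_type=None, filename=None, content=None):
    """Return a parser name, trying MIME type, then filename, then content."""
    if mime_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        key = mime_type.split(";", 1)[0].strip().lower()
        if key in MIME_TO_PARSER:
            return MIME_TO_PARSER[key]
    if filename:
        ext = os.path.splitext(filename)[1].lower()
        if ext in EXT_TO_PARSER:
            return EXT_TO_PARSER[ext]
    if content:
        return sniff(content)
    return None
```

The point of the ordering is that an unhelpful MIME type (say, application/octet-stream) does not end the search; the filename and then the content still get a chance.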

> In terms of serializers I plan on doing a pretty similar thing. I've got
> serializers for:
> 
> 	* Pretty-printed RDF/XML.
> 	* TriX
> 	* Pretty-printed Turtle and N3

Sounds good, though take a look at the RDF/XML serializer in git for a step towards pretty-printing. It doesn't do rdf-type based elements ("<foaf:Person />"), but it does group predicate-object values into a single subject element.

Also, I've recently added some new methods to some of the serializers, and I'd like to standardize the API before we have too many more serializers to maintain. I added serialize_iterator_to_* methods to the RDF/XML serializer to allow serializing the results of a pattern match (instead of only allowing serialization of whole models). Obviously this doesn't work easily for all serializers (the canonical N-Triples serializer would probably have to create a temporary model to do this), but I think it's a nice thing to provide. Let's try to sort out a standard set of methods we'd like all serializers to have and write it up in the R::T::Serializer POD.
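As a starting point for that discussion, here is one possible shape for the common method set. It is a Python sketch for illustration only, not the existing RDF::Trine API: the class names are hypothetical, a "model" is just a list of already-encoded (subject, predicate, object) tuples, and the method names only mirror the serialize_iterator_to_* naming mentioned above.

```python
import io


class Serializer:
    """Hypothetical base class: subclasses implement one method only."""

    def serialize_iterator_to_file(self, fh, statements):
        raise NotImplementedError

    def serialize_iterator_to_string(self, statements):
        buf = io.StringIO()
        self.serialize_iterator_to_file(buf, statements)
        return buf.getvalue()

    # The model methods are defined in terms of the iterator methods,
    # so every serializer gets them for free.
    def serialize_model_to_file(self, fh, model):
        self.serialize_iterator_to_file(fh, iter(model))

    def serialize_model_to_string(self, model):
        return self.serialize_iterator_to_string(iter(model))


class NTriples(Serializer):
    """Streams statements in whatever order the iterator yields them."""

    def serialize_iterator_to_file(self, fh, statements):
        for s, p, o in statements:
            fh.write(f"{s} {p} {o} .\n")


class CanonicalNTriples(NTriples):
    """Needs to see every statement before writing anything.

    The sorted set below stands in for the temporary model that a
    canonical serializer would have to build from an iterator.
    """

    def serialize_iterator_to_file(self, fh, statements):
        buffered = sorted(set(statements))
        super().serialize_iterator_to_file(fh, buffered)
```

With this layout, serializing the results of a pattern match is just a matter of passing a filtered iterator, for example NTriples().serialize_iterator_to_string(t for t in model if t[0] == subject), while serializers that cannot stream pay the buffering cost internally.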


> Again I'll do these in a distribution called something like
> "RDF-Trine-Serializer-Extras".

Thrilled you're working on this, but not sure about the name... Extras sounds funny to my ears. I understand the desire to have them in a group instead of having 10 new packages, but it seems somewhat un-CPAN-like. Anyone have thoughts on better names? Or maybe I could convince you to have separate packages and we could introduce a CPAN Bundle for extra parsers/serializers? Or just bundle it all into RDF::Trine (if they don't introduce lots of new dependencies)? Dunno...

> As this stomps all over Greg's RDF::Trine namespace, I thought it might
> be wise to check he's OK with this. Greg?

Yeah, I'm OK with this so long as we continue to have some level of coordination on package names.

> I do also have plans to add some RDF signature validation stuff, plus
> perhaps an easy SPARQL endpoint implementation.

There's an RDF::Endpoint in my perlrdf git repo:

http://github.com/kasei/perlrdf/tree/master/RDF-Endpoint/

It's got both a mod_perl handler and a CGI variant (though the CGI variant might have experienced some bitrot). I've been meaning to package that up and release it, in addition to the RDF::LinkedData package I mentioned a couple of weeks ago. It might be worth looking at before you re-implement something.

Who's controlling the perlrdf.org webpage at this point? It would be great to put up a wiki so that we could keep a projects and todo list somewhere.

.greg


