RDF Parser / Serializer plans

Toby Inkster mail at tobyinkster.co.uk
Thu Dec 17 13:09:05 CET 2009


This is a quick note to let people know about my plans for releasing
parser and serializer packages.

I have a couple of CPAN packages that parse various file formats into
RDF::Trine models - namely RDF::RDFa::Parser and XRD::Parser.
RDF::RDFa::Parser started off as completely independent of RDF::Trine,
and as a result it has a different API to RDF::Trine's own parser
modules. XRD::Parser simply followed the same API as RDF::RDFa::Parser.
I plan on writing wrapper modules to provide an RDF::Trine-compatible
interface for them. These will be called:

	RDF::Trine::Parser::RDFa
	RDF::Trine::Parser::XRD

I'll release them together in a CPAN distribution called
"RDF-Trine-Parser-Extras".
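
As a rough illustration of the wrapper idea only (Python rather than Perl,
and not the real API of any of the modules named above): every class and
method name below is a hypothetical stand-in. The inner class takes the
document in its constructor and is read back after a consume() call; the
wrapper hides that behind a single parse-into-model call.

```python
class StandaloneParser:
    """Hypothetical parser with its own API: the content goes to the
    constructor, and triples are read back after calling consume()."""

    def __init__(self, content, base_uri):
        self.content = content
        self.base_uri = base_uri
        self._triples = []

    def consume(self):
        # Toy "parsing": one whitespace-separated triple per line;
        # lines that do not have exactly three fields are skipped.
        for line in self.content.splitlines():
            parts = line.split()
            if len(parts) == 3:
                self._triples.append(tuple(parts))

    def triples(self):
        return list(self._triples)


class WrappedParser:
    """Hypothetical wrapper exposing one parse-into-model call."""

    def parse_into_model(self, base_uri, content, model):
        inner = StandaloneParser(content, base_uri)
        inner.consume()
        for triple in inner.triples():
            model.add(triple)  # the "model" here is just a Python set
        return model
```

Callers of the wrapper never see the constructor-plus-consume() shape of
the inner parser, which is the point of the exercise.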

Once version 0.01 of this is out, I plan on porting a large collection
of parsers I wrote as part of the Swignition project to RDF::Trine.
These will be added to the RDF-Trine-Parser-Extras package one by one as
they are ready. These include:

	* TriX, including support for XSLT stylesheets
	* GRDDL (XSLT and RDF-EASE transformations)
	* jsonGRDDL
	* eRDF
	* OpenURL COinS
	* Microformats: adr, figure, geo, hAtom, hAudio, hCalendar,
		hCard, hMeasure, hRecipe, hResume, hReview,
		rel-enclosure, rel-license, rel-tag, species, XFN,
		xoxo

There will also be an RDF::Trine::Parser::HTML module which parses an
(X)HTML file and runs a number of relevant parsers on it (RDFa,
Microformats, eRDF, GRDDL, etc).

And RDF::Trine::Parser::Guess will try to guess the right parser to use
automagically.
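
The note does not say how the guessing would work. One plausible approach,
sketched here in Python under that assumption, is to look at the media
type first and fall back to the file extension. The lookup tables and the
guess_parser function are hypothetical; only the parser names come from
the plans above.

```python
import os

# Hypothetical lookup tables, made up for illustration.
BY_MEDIA_TYPE = {
    "text/html": "RDF::Trine::Parser::HTML",
    "application/xhtml+xml": "RDF::Trine::Parser::HTML",
    "application/xrd+xml": "RDF::Trine::Parser::XRD",
}
BY_EXTENSION = {
    ".html": "RDF::Trine::Parser::HTML",
    ".xhtml": "RDF::Trine::Parser::HTML",
    ".xrd": "RDF::Trine::Parser::XRD",
}


def guess_parser(media_type=None, filename=None):
    """Return the name of a parser to try, or None if nothing matches."""
    if media_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        base = media_type.split(";", 1)[0].strip().lower()
        if base in BY_MEDIA_TYPE:
            return BY_MEDIA_TYPE[base]
    if filename:
        # Fall back to the (case-insensitive) file extension.
        ext = os.path.splitext(filename)[1].lower()
        return BY_EXTENSION.get(ext)
    return None
```

A real implementation would probably also need to sniff the content
itself, since servers often send generic or wrong media types.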

Lastly, as a separate distribution, RDF::Trine::Sponger (very OpenLink
Virtuoso name - anybody got a better suggestion?) will automate the job
of fetching a URL, handing it off to the parser, and crawling
rdfs:seeAlso links to a specified depth (0 by default).
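
The fetch, parse and crawl loop described above can be sketched roughly as
follows. This is a Python illustration, not the planned Perl module: the
sponge function and its fetch_and_parse argument are hypothetical, and
triples are plain tuples of strings.

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def sponge(url, fetch_and_parse, depth=0):
    """Collect triples from url, following rdfs:seeAlso up to `depth` hops.

    fetch_and_parse is a caller-supplied (hypothetical) function that takes
    a URL and returns an iterable of (subject, predicate, object) tuples.
    """
    triples = set()
    seen = set()          # URLs already fetched, to avoid loops
    frontier = [url]
    for _ in range(depth + 1):
        next_frontier = []
        for u in frontier:
            if u in seen:
                continue
            seen.add(u)
            for s, p, o in fetch_and_parse(u):
                triples.add((s, p, o))
                if p == RDFS_SEE_ALSO:
                    next_frontier.append(o)
        frontier = next_frontier
    return triples
```

With the default depth of 0 only the starting URL is fetched; each extra
level follows the seeAlso links found at the previous level.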

In terms of serializers I plan on doing a pretty similar thing. I've got
serializers for:

	* Pretty-printed RDF/XML
	* TriX
	* Pretty-printed Turtle and N3

Again I'll do these in a distribution called something like
"RDF-Trine-Serializer-Extras".

I also have - I don't know the best name for them - perhaps "writers" or
"exporters". The idea is that these take an RDF graph and export it in
a format, but will ignore triples they deem irrelevant, and are
certainly not round-trip safe. These include:

	vCard
	jCard (possibly coercible into "Portable Contacts")
	iCalendar
	KML
	M3U
	Atom

These will likely take a while longer as they'll need fairly substantial
rewrites. They'll eventually end up in a CPAN distribution called
something like "RDF-Trine-Exporter".
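
To make the "ignore irrelevant triples" idea concrete, here is a minimal
Python sketch of a lossy vCard export. The predicate-to-property mapping
and the export_vcard function are assumptions made for illustration, not
the design of the planned exporters; value escaping and checks for
required vCard properties are left out.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical mapping from predicates to vCard properties. Anything not
# listed here is silently dropped, which is what makes the export lossy.
VCARD_FIELDS = {
    FOAF + "name": "FN",
    FOAF + "nick": "NICKNAME",
    FOAF + "homepage": "URL",
}


def export_vcard(triples, subject):
    """Render the triples about `subject` that map onto vCard properties."""
    lines = ["BEGIN:VCARD", "VERSION:4.0"]
    for s, p, o in triples:
        if s == subject and p in VCARD_FIELDS:
            # No escaping of the value is attempted in this sketch.
            lines.append(f"{VCARD_FIELDS[p]}:{o}")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"
```

Because unmapped triples are dropped, parsing the output back would not
reproduce the original graph, so the export is not round-trip safe.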

As this stomps all over Greg's RDF::Trine namespace, I thought it might
be wise to check he's OK with this. Greg?

I do also have plans to add some RDF signature validation stuff, plus
perhaps an easy SPARQL endpoint implementation.

-- 
Toby A Inkster
<mailto:mail at tobyinkster.co.uk>
<http://tobyinkster.co.uk>

