CPAN as linked data

Gregory Williams greg at evilfunhouse.com
Wed Jul 21 22:03:19 CEST 2010


On Jul 20, 2010, at 3:45 PM, Toby Inkster wrote:

> On Tue, 20 Jul 2010 12:12:04 -0400
> Gregory Williams <greg at evilfunhouse.com> wrote:
> 
>> This is very strange. It's somewhat hard to debug because some of the
>> explorative queries I'm trying are timing out (I'd love to be able to
>> poke around how you have this set up so I can try to fix the
>> performance issues)
> 
> I've got a dump here - it's 14MB compressed; not sure how big
> uncompressed:
> 
> http://ontologi.es/cpan-data/dumps/cpan-data-latest.nt.bz2
> 
> It's in a PostgreSQL database with just whatever default indices
> RDF::Trine::Store::DBI::init sets up.

I suspect the problem has something to do with the PostgreSQL backend (since I basically don't test that anymore). I'm trying to track down the issue, but ran into another problem. How did you generate that N-Triples file? There's invalid data in there due to version numbers:

<http://purl.org/NET/cpan-uri/dist/Acme-Magic-Pony/v_0-03> <http://purl.org/NET/cpan-uri/terms#requires> <http://purl.org/NET/cpan-uri/module/CPAN/v_>= 1-93> <http://purl.org/NET/cpan-uri/graph/backpan?file=authors/id/J/JL/JLAVALLEE/Acme-Magic-Pony-0.03.meta> .

Somewhere along the line, the ">= " needed to be escaped but wasn't. Do you have a sense of whether that's my problem or yours?
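
To make concrete what I mean by escaping, here is a minimal sketch in Python (purely for illustration; it is not the code that produced the dump). The helper name module_version_uri is hypothetical, and the URI layout ("::" mapped to "/", "." in the version mapped to "-", a "v_" prefix) is an assumption inferred from the line above. The only point is that the version constraint gets percent-encoded before it goes into the IRI:

```python
from urllib.parse import quote

# Prefix copied from the example line above.
BASE = "http://purl.org/NET/cpan-uri/module/"

def module_version_uri(module: str, version: str) -> str:
    """Hypothetical helper: build a module/version URI with an escaped version segment."""
    # Assumed layout, inferred from the example: "::" becomes "/" in the module path.
    path = module.replace("::", "/")
    # Assumed layout: "." in the version becomes "-", with a "v_" prefix.
    raw_segment = "v_" + version.replace(".", "-")
    # safe="" percent-encodes every reserved character, so ">", "=" and
    # the space cannot end up unescaped inside the angle brackets.
    return BASE + path + "/" + quote(raw_segment, safe="")

print(module_version_uri("CPAN", ">= 1.93"))
# prints http://purl.org/NET/cpan-uri/module/CPAN/v_%3E%3D%201-93
```

Whether the constraint should be encoded into the URI at all, rather than stripped down to the bare version number, is a separate question; the sketch only covers the escaping.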

.g


