[MKSearch-dev] Triple output to check

Phil Shaw phil at mkdoc.com
Fri Jan 7 10:50:26 GMT 2005


Chris,

I've finished what I have called the MetaTripleWriterPlugin for 
JSpider, which writes an N-Triples style file for every (X)HTML 
document that is retrieved. It's not quite standard N-Triples because 
it does not escape non-ASCII characters -- its primary purpose is to 
check that the metadata is parsed correctly, URIs are expanded and 
bNodes are okay.
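
For reference, the missing escaping step could look roughly like the 
sketch below. It is in Python for brevity and is not the plugin's code; 
the function name escape_ntriples_literal is made up for illustration. 
It assumes the escaping rules of the 2004 N-Triples definition: 
printable ASCII passes through, a few characters get short escapes, and 
everything else becomes \uXXXX or \UXXXXXXXX.

```python
def escape_ntriples_literal(text: str) -> str:
    """Escape a string for use between the quotes of an N-Triples literal.

    Hypothetical helper, not part of MetaTripleWriterPlugin.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")          # backslash
        elif ch == '"':
            out.append('\\"')           # double quote
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)              # printable ASCII is written as-is
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)  # control characters and the BMP
        else:
            out.append("\\U%08X" % cp)  # characters beyond the BMP
    return "".join(out)


if __name__ == "__main__":
    print(escape_ntriples_literal("Copyright © 2004 MKDoc Ltd."))
```

With this applied, the copyright sign in the dc:rights literals would be 
written as \u00A9 rather than as a raw character.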

I'm pretty happy it's doing the right thing in all other respects, 
but could you take a look at the attached files to double-check? 
They contain test cases for all of the DC elements and terms schemas.
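
As a rough aid for that check, a small script along these lines could 
confirm the bNode structure mechanically. It is only a sketch: the 
function name check_bnodes and the regular expression are assumptions, 
and it expects one triple per line in the style of the attached files, 
with each bNode carrying exactly one rdf:type and one rdf:value.

```python
import re
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_TYPE = "<" + RDF + "type>"
RDF_VALUE = "<" + RDF + "value>"

# subject, predicate, object, then the terminating " ."
TRIPLE = re.compile(r"^\s*(\S+)\s+(\S+)\s+(.*\S)\s+\.\s*$")


def check_bnodes(lines):
    """Return a list of problems found with blank nodes (empty if none)."""
    counts = defaultdict(lambda: defaultdict(int))
    referenced = set()
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comment lines
        match = TRIPLE.match(line)
        if match is None:
            continue  # lines that do not look like triples are ignored
        subj, pred, obj = match.groups()
        if subj.startswith("_:"):
            counts[subj][pred] += 1
        if obj.startswith("_:"):
            referenced.add(obj)

    problems = []
    for node in sorted(referenced | set(counts)):
        if node not in referenced:
            problems.append(f"{node}: never used as an object")
        for pred in (RDF_TYPE, RDF_VALUE):
            found = counts[node][pred] if node in counts else 0
            if found != 1:
                problems.append(f"{node}: expected one {pred}, found {found}")
    return problems
```

Passing the lines of one attachment to check_bnodes should return an 
empty list when every bNode is complete. It only checks structure, not 
whether the values are right for their encoding scheme.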

They are generated from these documents:

http://test.mksearch.mkdoc.org/meta/dc-All.html
http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html
http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html

Thanks.

Phil



-------------- next part --------------
# Metadata triples for http://test.mksearch.mkdoc.org/meta/dc-All.html
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> "generator" "HTML Tidy, see www.w3.org" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/contributor> "Shaw, Philip" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/coverage> "World" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/creator> "Shaw, Philip" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/date> "2004-12-22" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/description> "A test case document for the Dublin Core metadata element Description" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/format> "text/html" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/identifier> "ID:394" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/identifier> <http://test.mksearch.mkdoc.org/meta/dc-IdentifierURI.html> .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/language> "en-GB" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/publisher> "MKDoc Ltd." .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/relation> "XHTML meta element test directory" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/relation> <http://test.mksearch.mkdoc.org/meta/index.html> .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/rights> "Copyright © 2004 MKDoc Ltd." .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/rights> "Copyright © 2004 MKDoc Ltd." .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/rights> <http://test.mksearch.mkdoc.org/meta/dc-RightsURI.shtml#copyright> .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/source> "Dublin Core Metadata Element Set, Version 1.1: Reference Description" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/source> <http://dublincore.org/documents/2003/06/02/dces/> .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/subject> "Test document" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/title> "Single meta element for DC.Title" .
 <http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/type> "Text" .

-------------- next part --------------
# Metadata triples for http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> "generator" "HTML Tidy, see www.w3.org" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc1 .
 _:mkdoc1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Box> .
 _:mkdoc1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "northlimit=-21.3; southlimit=-21.4; westlimit=139.8; eastlimit=139.9; uplimit=400; downlimit=-100; name=Duchess copper mine" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc2 .
 _:mkdoc2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Period> .
 _:mkdoc2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "start=2004-10-11;end=2006-01-11;scheme=W3CDTF" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc3 .
 _:mkdoc3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Point> .
 _:mkdoc3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "east=148.26218; north=-36.45746; elevation=2228; name=Mt. Kosciusko" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc4 .
 _:mkdoc4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/ISO3166> .
 _:mkdoc4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "826" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc5 .
 _:mkdoc5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/TGN> .
 _:mkdoc5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "World" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc6 .
 _:mkdoc6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/W3CDTF> .
 _:mkdoc6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "2004-12-23" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/date> _:mkdoc7 .
 _:mkdoc7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Period> .
 _:mkdoc7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "start=2004-10-11;end=2006-01-11;scheme=W3CDTF" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/date> _:mkdoc8 .
 _:mkdoc8 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/W3CDTF> .
 _:mkdoc8 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "2004-12-23" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/format> _:mkdoc9 .
 _:mkdoc9 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/IMT> .
 _:mkdoc9 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "text/html" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/identifier> _:mkdoc10 .
 _:mkdoc10 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
 _:mkdoc10 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-IdentifierURI.html> .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/language> _:mkdoc11 .
 _:mkdoc11 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/ISO639-2> .
 _:mkdoc11 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "eng" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/language> _:mkdoc12 .
 _:mkdoc12 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/RFC1766> .
 _:mkdoc12 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "en-GB" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/relation> _:mkdoc13 .
 _:mkdoc13 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
 _:mkdoc13 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/scheme/index.html> .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/rights> _:mkdoc14 .
 _:mkdoc14 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
 _:mkdoc14 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/dc-RightsURI.shtml#copyright> .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/source> _:mkdoc15 .
 _:mkdoc15 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
 _:mkdoc15 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://dublincore.org/documents/2000/07/11/dcmes-qualifiers/> .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc16 .
 _:mkdoc16 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/DDC> .
 _:mkdoc16 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "006" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc17 .
 _:mkdoc17 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/LCC> .
 _:mkdoc17 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "KBR44.6.C66" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc18 .
 _:mkdoc18 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/LCSH> .
 _:mkdoc18 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Genealogy" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc19 .
 _:mkdoc19 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/UDC> .
 _:mkdoc19 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "647" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/type> _:mkdoc20 .
 _:mkdoc20 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/DCMIType> .
 _:mkdoc20 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Text" .

-------------- next part --------------
# Metadata triples for http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> "generator" "HTML Tidy, see www.w3.org" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc1 .
 _:mkdoc1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Box> .
 _:mkdoc1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "northlimit=-21.3; southlimit=-21.4; westlimit=139.8; eastlimit=139.9; uplimit=400; downlimit=-100; name=Duchess copper mine" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc2 .
 _:mkdoc2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Point> .
 _:mkdoc2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "east=148.26218; north=-36.45746; elevation=2228; name=Mt. Kosciusko" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc3 .
 _:mkdoc3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Period> .
 _:mkdoc3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "start=2004-10-11;end=2006-01-11;scheme=W3CDTF" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc4 .
 _:mkdoc4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/ISO3166> .
 _:mkdoc4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "826" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc5 .
 _:mkdoc5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/TGN> .
 _:mkdoc5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "World" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc6 .
 _:mkdoc6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/W3CDTF> .
 _:mkdoc6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "2004-12-23" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/date> _:mkdoc7 .
 _:mkdoc7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Period> .
 _:mkdoc7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "start=2004-10-11;end=2006-01-11;scheme=W3CDTF" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/date> _:mkdoc8 .
 _:mkdoc8 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/W3CDTF> .
 _:mkdoc8 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "2004-12-23" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/format> _:mkdoc9 .
 _:mkdoc9 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/IMT> .
 _:mkdoc9 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Text" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/identifier> _:mkdoc10 .
 _:mkdoc10 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
 _:mkdoc10 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-IdentifierURI.html> .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/language> _:mkdoc11 .
 _:mkdoc11 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/ISO639-2> .
 _:mkdoc11 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "eng" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/language> _:mkdoc12 .
 _:mkdoc12 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/RFC1766> .
 _:mkdoc12 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "en-GB" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/relation> _:mkdoc13 .
 _:mkdoc13 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
 _:mkdoc13 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/scheme/index.html> .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/rights> _:mkdoc14 .
 _:mkdoc14 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
 _:mkdoc14 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/dc-RightsURI.shtml#copyright> .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/source> _:mkdoc15 .
 _:mkdoc15 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
 _:mkdoc15 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://dublincore.org/documents/2000/07/11/dcmes-qualifiers/> .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc16 .
 _:mkdoc16 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/DDC> .
 _:mkdoc16 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "006" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc17 .
 _:mkdoc17 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/LCC> .
 _:mkdoc17 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "KBR44.6.C66" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc18 .
 _:mkdoc18 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/LCSH> .
 _:mkdoc18 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Genealogy" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc19 .
 _:mkdoc19 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/MESH> .
 _:mkdoc19 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Angiostatins" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc20 .
 _:mkdoc20 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/UDC> .
 _:mkdoc20 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "647" .
 <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/type> _:mkdoc21 .
 _:mkdoc21 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/DCMIType> .
 _:mkdoc21 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Text" .
