[MKSearch-dev] Triple output to check
Phil Shaw
phil at mkdoc.com
Fri Jan 7 10:50:26 GMT 2005
Chris,
I've finished what I have called the MetaTripleWriterPlugin for
JSpider, which writes an N-Triples-style file for every (X)HTML
document that is retrieved. It's not quite standard N-Triples because
it does not escape non-ASCII characters -- it's primarily there to check
that the metadata is parsed correctly, URIs are expanded and bNodes are okay.
I'm pretty happy it's doing the right thing in all other respects,
but I wonder if you could take a look at these files to double check.
They contain test cases from all the DC elements and terms schemas.
They are generated from these documents:
http://test.mksearch.mkdoc.org/meta/dc-All.html
http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html
http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html
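For reference, the escaping step that the output currently skips could look roughly like the sketch below. It is written in Python purely for illustration and is not the plugin's own code; the function name escape_ntriples_literal is hypothetical. It assumes the string escapes from the N-Triples grammar in the RDF Test Cases recommendation: backslash, double quote, tab, newline and carriage return get short escapes, and anything else outside printable ASCII becomes \uXXXX or \UXXXXXXXX.

```python
def escape_ntriples_literal(text: str) -> str:
    """Escape a literal's lexical form for N-Triples output (illustrative sketch)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")           # backslash
        elif ch == '"':
            out.append('\\"')            # double quote
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)               # printable ASCII passes through
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)   # BMP character, four hex digits
        else:
            out.append("\\U%08X" % cp)   # beyond the BMP, eight hex digits
    return "".join(out)


if __name__ == "__main__":
    print(escape_ntriples_literal("Copyright \u00a9 2004 MKDoc Ltd."))
```

Applied to the dc:rights value in the first attachment, this would replace the copyright sign with \u00A9 and leave the rest of the string unchanged.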
Thanks.
Phil
-------------- next part --------------
# Metadata triples for http://test.mksearch.mkdoc.org/meta/dc-All.html
<http://test.mksearch.mkdoc.org/meta/dc-All.html> "generator" "HTML Tidy, see www.w3.org" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/contributor> "Shaw, Philip" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/coverage> "World" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/creator> "Shaw, Philip" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/date> "2004-12-22" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/description> "A test case document for the Dublin Core metadata element Description" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/format> "text/html" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/identifier> "ID:394" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/identifier> <http://test.mksearch.mkdoc.org/meta/dc-IdentifierURI.html> .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/language> "en-GB" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/publisher> "MKDoc Ltd." .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/relation> "XHTML meta element test directory" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/relation> <http://test.mksearch.mkdoc.org/meta/index.html> .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/rights> "Copyright © 2004 MKDoc Ltd." .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/rights> "Copyright © 2004 MKDoc Ltd." .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/rights> <http://test.mksearch.mkdoc.org/meta/dc-RightsURI.shtml#copyright> .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/source> "Dublin Core Metadata Element Set, Version 1.1: Reference Description" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/source> <http://dublincore.org/documents/2003/06/02/dces/> .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/subject> "Test document" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/title> "Single meta element for DC.Title" .
<http://test.mksearch.mkdoc.org/meta/dc-All.html> <http://purl.org/dc/elements/1.1/type> "Text" .
-------------- next part --------------
# Metadata triples for http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> "generator" "HTML Tidy, see www.w3.org" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc1 .
_:mkdoc1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Box> .
_:mkdoc1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "northlimit=-21.3; southlimit=-21.4; westlimit=139.8; eastlimit=139.9; uplimit=400; downlimit=-100; name=Duchess copper mine" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc2 .
_:mkdoc2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Period> .
_:mkdoc2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "start=2004-10-11;end=2006-01-11;scheme=W3CDTF" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc3 .
_:mkdoc3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Point> .
_:mkdoc3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "east=148.26218; north=-36.45746; elevation=2228; name=Mt. Kosciusko" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc4 .
_:mkdoc4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/ISO3166> .
_:mkdoc4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "826" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc5 .
_:mkdoc5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/TGN> .
_:mkdoc5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "World" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc6 .
_:mkdoc6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/W3CDTF> .
_:mkdoc6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "2004-12-23" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/date> _:mkdoc7 .
_:mkdoc7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Period> .
_:mkdoc7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "start=2004-10-11;end=2006-01-11;scheme=W3CDTF" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/date> _:mkdoc8 .
_:mkdoc8 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/W3CDTF> .
_:mkdoc8 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "2004-12-23" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/format> _:mkdoc9 .
_:mkdoc9 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/IMT> .
_:mkdoc9 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "text/html" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/identifier> _:mkdoc10 .
_:mkdoc10 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
_:mkdoc10 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/scheme/dc-new-IdentifierURI.html> .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/language> _:mkdoc11 .
_:mkdoc11 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/ISO639-2> .
_:mkdoc11 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "eng" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/language> _:mkdoc12 .
_:mkdoc12 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/RFC1766> .
_:mkdoc12 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "en-GB" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/relation> _:mkdoc13 .
_:mkdoc13 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
_:mkdoc13 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/scheme/index.html> .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/rights> _:mkdoc14 .
_:mkdoc14 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
_:mkdoc14 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/dc-RightsURI.shtml#copyright> .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/source> _:mkdoc15 .
_:mkdoc15 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
_:mkdoc15 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://dublincore.org/documents/2000/07/11/dcmes-qualifiers/> .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc16 .
_:mkdoc16 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/DDC> .
_:mkdoc16 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "006" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc17 .
_:mkdoc17 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/LCC> .
_:mkdoc17 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "KBR44.6.C66" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc18 .
_:mkdoc18 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/LCSH> .
_:mkdoc18 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Genealogy" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc19 .
_:mkdoc19 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/UDC> .
_:mkdoc19 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "647" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-new-All.html> <http://purl.org/dc/elements/1.1/type> _:mkdoc20 .
_:mkdoc20 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/DCMIType> .
_:mkdoc20 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Text" .
-------------- next part --------------
# Metadata triples for http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> "generator" "HTML Tidy, see www.w3.org" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc1 .
_:mkdoc1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Box> .
_:mkdoc1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "northlimit=-21.3; southlimit=-21.4; westlimit=139.8; eastlimit=139.9; uplimit=400; downlimit=-100; name=Duchess copper mine" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc2 .
_:mkdoc2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Point> .
_:mkdoc2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "east=148.26218; north=-36.45746; elevation=2228; name=Mt. Kosciusko" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc3 .
_:mkdoc3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Period> .
_:mkdoc3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "start=2004-10-11;end=2006-01-11;scheme=W3CDTF" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc4 .
_:mkdoc4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/ISO3166> .
_:mkdoc4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "826" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc5 .
_:mkdoc5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/TGN> .
_:mkdoc5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "World" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/coverage> _:mkdoc6 .
_:mkdoc6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/W3CDTF> .
_:mkdoc6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "2004-12-23" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/date> _:mkdoc7 .
_:mkdoc7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/Period> .
_:mkdoc7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "start=2004-10-11;end=2006-01-11;scheme=W3CDTF" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/date> _:mkdoc8 .
_:mkdoc8 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/W3CDTF> .
_:mkdoc8 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "2004-12-23" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/format> _:mkdoc9 .
_:mkdoc9 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/IMT> .
_:mkdoc9 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Text" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/identifier> _:mkdoc10 .
_:mkdoc10 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
_:mkdoc10 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/scheme/dc-old-IdentifierURI.html> .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/language> _:mkdoc11 .
_:mkdoc11 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/ISO639-2> .
_:mkdoc11 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "eng" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/language> _:mkdoc12 .
_:mkdoc12 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/RFC1766> .
_:mkdoc12 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "en-GB" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/relation> _:mkdoc13 .
_:mkdoc13 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
_:mkdoc13 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/scheme/index.html> .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/rights> _:mkdoc14 .
_:mkdoc14 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
_:mkdoc14 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://test.mksearch.mkdoc.org/meta/dc-RightsURI.shtml#copyright> .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/source> _:mkdoc15 .
_:mkdoc15 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/URI> .
_:mkdoc15 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://dublincore.org/documents/2000/07/11/dcmes-qualifiers/> .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc16 .
_:mkdoc16 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/DDC> .
_:mkdoc16 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "006" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc17 .
_:mkdoc17 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/LCC> .
_:mkdoc17 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "KBR44.6.C66" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc18 .
_:mkdoc18 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/LCSH> .
_:mkdoc18 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Genealogy" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc19 .
_:mkdoc19 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/MESH> .
_:mkdoc19 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Angiostatins" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/subject> _:mkdoc20 .
_:mkdoc20 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/UDC> .
_:mkdoc20 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "647" .
<http://test.mksearch.mkdoc.org/meta/scheme/dc-old-All.html> <http://purl.org/dc/elements/1.1/type> _:mkdoc21 .
_:mkdoc21 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/terms/DCMIType> .
_:mkdoc21 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Text" .
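The bNode pattern in the two scheme attachments (each scheme-qualified value hangs off a blank node that carries both an rdf:type and an rdf:value) can also be checked mechanically. The following is a minimal Python sketch under stated assumptions, not part of the plugin: the names parse_triples and incomplete_bnodes are hypothetical, and the line pattern is deliberately lenient so that it accepts the non-standard literal predicate "generator" and unescaped non-ASCII characters seen in this output.

```python
import re

# One term: <uri>, _:bnode, or "literal" (lenient: a literal is allowed in any position).
TERM = r'(<[^>]*>|_:[A-Za-z][A-Za-z0-9]*|"(?:[^"\\]|\\.)*")'
LINE = re.compile(r'^\s*' + TERM + r'\s+' + TERM + r'\s+' + TERM + r'\s*\.\s*$')

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def parse_triples(text):
    """Return (subject, predicate, object) tuples, skipping blank and comment lines."""
    triples = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError("unparseable line: %r" % line)
        triples.append(match.groups())
    return triples


def incomplete_bnodes(triples):
    """Return bNode labels used as objects that lack rdf:type or rdf:value."""
    referenced = {o for _, _, o in triples if o.startswith("_:")}
    props = {}
    for s, p, _ in triples:
        if s.startswith("_:"):
            props.setdefault(s, set()).add(p)
    required = {"<%stype>" % RDF, "<%svalue>" % RDF}
    return sorted(b for b in referenced if not required <= props.get(b, set()))
```

Calling incomplete_bnodes(parse_triples(text)) on the contents of one attachment returns the labels of any bNodes missing one of the two properties; an empty list means every referenced bNode is complete.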