[MKSearch-dev] Week 24 & 25 round up
Phil Shaw
phil at mkdoc.com
Fri Apr 1 11:09:37 BST 2005
The system now has fully tested UK e-GMS and Dublin Core metadata
indexing. It can write the metadata as an N-Triples file for each
document or store RDF statements in a file-based Sesame repository.
The whole code base now compiles with GCJ and Sun javac and the
indexer runs with Windows java and GNU/Linux gij.
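
As a rough illustration of the N-Triples output mentioned above, the
sketch below writes one Dublin Core title statement as a single
N-Triples line. It is not the MKSearch writer code: the class and
method names, the example.org subject URI and the choice of the Dublin
Core 1.1 elements namespace are all assumptions made for the example.

```java
final class NTriplesSketch {

    /** Escapes a literal value so the output stays within printable ASCII. */
    public static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (cp >= 0x20 && cp <= 0x7E) {
                        out.append((char) cp);                      // printable ASCII
                    } else if (cp <= 0xFFFF) {
                        out.append(String.format("\\u%04X", cp));   // BMP escape
                    } else {
                        out.append(String.format("\\U%08X", cp));   // beyond the BMP
                    }
            }
        }
        return out.toString();
    }

    /** Formats one statement with a URI subject, URI predicate and plain literal object. */
    public static String literalTriple(String subjectUri, String predicateUri, String literal) {
        return "<" + subjectUri + "> <" + predicateUri + "> \"" + escape(literal) + "\" .";
    }

    public static void main(String[] args) {
        // Hypothetical document URI; the predicate is Dublin Core "title".
        System.out.println(literalTriple(
                "http://www.example.org/index.html",
                "http://purl.org/dc/elements/1.1/title",
                "Example \"home\" page"));
    }
}
```

Each call produces one complete line, so a per-document file would
simply be these lines written one after another.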
Tuesday 22
~~~~~~~~~~
Updated the Sesame library source to the latest CVS version, in which
the development team had cleaned up all wildcard import statements,
to work around a GCJ bug. They had overlooked one file, so I reported
this back to them along with some other comments.
Made a final draft presentation for the BECTA meeting, copy at:
<URL:https://svn.mkdoc.com/mksearch/doc/presentations/BECTA%202005-03-23.ppt>
Wednesday 23
~~~~~~~~~~~~
The presentation at BECTA in Birmingham went well and there seemed to be
quite strong interest in what MKSearch has to offer. People readily
grasped the consequences of exclusively schema-based indexing, and the
site-based configuration options for JSpider were attractive.
Thursday 24
~~~~~~~~~~~
Imported the latest Sesame source from CVS with final corrections to
the wildcard imports and added the Apache SOAP and servlet upload
libraries needed to compile it. Adjusted compilation scripts as necessary and
added a new jar-library.sh script to compile and archive all source
libraries at once. Updated the documentation notes on the Web site to
reflect the more limited source code amendments for Sesame.
Finally, cleared out the JavaDoc directory to re-generate a set for
the Subversion Web site that has the right content types all round.
MKSearch <URL:https://svn.mkdoc.com/mksearch/doc/javadoc/index.html>
GNU JAXP
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/jaxp/index.html>
JSpider
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/jspider/index.html>
JTidy
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/jtidy/index.html>
GNU Servlet API
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/servlet/index.html>
Sesame
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/sesame/index.html>
Tuesday 29
~~~~~~~~~~
Refactored the MKSearch plugin hierarchy and added a new
XhtmlStoreWriterPlugin implementation, with coverage tests and mock
objects. Amended AbstractRdfContentHandler to step through custom
schemas without prefixes as a final fallback if no dot separator is
found. Added custom schema map constructors to MetaStoreWriter and
LinkStoreWriter and updated unit tests for coverage as necessary.
GCJ-compiled and Sun javac-compiled code now running under Sun java.
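
The fallback described above might look something like the sketch
below. This is only a guess at the shape of the logic, not the actual
AbstractRdfContentHandler code: the class name, the two maps and the
example.org custom schema are hypothetical. A meta name with a dot
separator is resolved through its prefix; a name without one is tried
against each custom schema in turn.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class SchemaLookupSketch {

    // Prefix (the part before the dot) mapped to a schema namespace.
    private final Map<String, String> prefixes = new LinkedHashMap<>();

    // Custom schemas used without a prefix: namespace mapped to its known names.
    private final Map<String, Set<String>> unprefixed = new LinkedHashMap<>();

    public void addPrefix(String prefix, String namespace) {
        prefixes.put(prefix, namespace);
    }

    public void addUnprefixedSchema(String namespace, String... names) {
        unprefixed.put(namespace, new LinkedHashSet<>(Arrays.asList(names)));
    }

    /** Returns the predicate URI for a meta name, or null if no schema matches. */
    public String resolve(String metaName) {
        int dot = metaName.indexOf('.');
        if (dot > 0) {
            // Prefixed form such as "DC.title": look up the prefix.
            String namespace = prefixes.get(metaName.substring(0, dot));
            return namespace == null ? null : namespace + metaName.substring(dot + 1);
        }
        // Final fallback: no dot separator, so step through the custom schemas.
        for (Map.Entry<String, Set<String>> schema : unprefixed.entrySet()) {
            if (schema.getValue().contains(metaName)) {
                return schema.getKey() + metaName;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SchemaLookupSketch lookup = new SchemaLookupSketch();
        lookup.addPrefix("DC", "http://purl.org/dc/elements/1.1/");
        lookup.addUnprefixedSchema("http://www.example.org/schema/", "keywords");
        System.out.println(lookup.resolve("DC.title"));
        System.out.println(lookup.resolve("keywords"));
    }
}
```

Using insertion-ordered maps keeps the fallback deterministic, so the
first custom schema registered wins when two define the same name.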
Wednesday 30
~~~~~~~~~~~~
Found an inconsistency in the way GNU fastjar creates lowercase
meta-inf paths in JAR files, which prevents the service providers
property configuration in JAXP from loading any SAX driver. Amended
the JAR command line to suppress the default behaviour, which allowed
the MKSearch indexer to run under GNU/Linux. Made a number of
indexing runs against the static test site to confirm the fix and
wrote up some notes on how to build and run MKSearch with GCJ:
<URL:http://www.mksearch.mkdoc.org/howto/build-mksearch-with-gcj/>
<URL:http://www.mksearch.mkdoc.org/howto/run-the-mksearch-indexer/>
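
For anyone wanting to check an archive for the path problem described
above, a small sketch along these lines lists entries whose META-INF
directory is not in upper case. JAXP looks up its SAX parser factory
through the resource META-INF/services/javax.xml.parsers.SAXParserFactory,
so a lowercase directory can hide that file on a case-sensitive lookup.
The class and method names are made up for the example and this is not
part of the MKSearch build scripts.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class MetaInfCaseCheck {

    private static final String EXPECTED = "META-INF/";

    /** True if the entry sits under a META-INF directory that is not upper case. */
    public static boolean isMiscased(String entryName) {
        return entryName.regionMatches(true, 0, EXPECTED, 0, EXPECTED.length())
                && !entryName.startsWith(EXPECTED);
    }

    /** Lists every miscased META-INF entry in the given JAR file. */
    public static List<String> miscasedEntries(File jar) throws IOException {
        List<String> found = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jar)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (isMiscased(name)) {
                    found.add(name);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java MetaInfCaseCheck <file.jar>");
            return;
        }
        for (String name : miscasedEntries(new File(args[0]))) {
            System.out.println(name);
        }
    }
}
```

An empty result means every META-INF entry in the archive uses the
expected upper case spelling.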
Thursday 31
~~~~~~~~~~~
Created a first draft outline of a store manager interface, or
"Checker" component, and a ProspectiveSubjectManagerPlugin to purge
the triple store during indexing.