[MKSearch-dev] Week 24 & 25 round up

Phil Shaw phil at mkdoc.com
Fri Apr 1 11:09:37 BST 2005


The system now has fully tested UK e-GMS and Dublin Core metadata 
indexing. It can write the metadata as an N-Triples file for each 
document or store RDF statements in a file-based Sesame repository. 
The whole code base now compiles with GCJ and Sun javac and the 
indexer runs with Windows java and GNU/Linux gij.
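
For readers unfamiliar with the N-Triples output mentioned above, here
is a minimal sketch of what one statement per metadata element could
look like. It is an illustration only, not the MKSearch writer code:
the class name NTriplesSketch and its methods are hypothetical, and
the only fixed values are the Dublin Core element namespace and the
N-Triples escaping rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; these are not MKSearch classes.
class NTriplesSketch {

    // Dublin Core element set namespace.
    static final String DC = "http://purl.org/dc/elements/1.1/";

    // Escape a literal value so that the output line stays plain ASCII.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (cp > 0xFFFF) {
                        out.append(String.format("\\U%08X", cp));
                    } else if (cp < 0x20 || cp > 0x7E) {
                        out.append(String.format("\\u%04X", cp));
                    } else {
                        out.append((char) cp);
                    }
            }
        }
        return out.toString();
    }

    // One statement: <subject> <predicate> "literal" .
    static String triple(String subject, String predicate, String literal) {
        return "<" + subject + "> <" + predicate + "> \""
                + escape(literal) + "\" .";
    }

    // One line per Dublin Core element found in a document.
    static String toNTriples(String docUri, Map<String, String> elements) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : elements.entrySet()) {
            out.append(triple(docUri, DC + e.getKey(), e.getValue()));
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> dc = new LinkedHashMap<>();
        dc.put("title", "Example page");
        dc.put("creator", "A. N. Author");
        System.out.print(toNTriples("http://example.org/page.html", dc));
    }
}
```

The document URI and element values in main are made up for the
example.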

Tuesday 22
~~~~~~~~~~
Updated the Sesame library source to the latest CVS version, in which 
the development team had cleaned up all wildcard import statements 
to work around a GCJ bug. They had overlooked one file, so I reported 
this back to them along with some other comments.

Made a final draft presentation for the BECTA meeting, copy at:

<URL:https://svn.mkdoc.com/mksearch/doc/presentations/BECTA%202005-03-23.ppt>

Wednesday 23
~~~~~~~~~~~~
Presentation at BECTA in Birmingham went well; there seemed to be 
quite strong interest in what MKSearch has to offer. People readily 
grasped the consequences of exclusively schema-based indexing, and 
the site-based configuration options for JSpider were attractive.

Thursday 24
~~~~~~~~~~~
Imported the latest Sesame source from CVS, with final corrections to 
the wildcard imports, and added the Apache SOAP and servlet upload 
libraries needed to compile it. Adjusted compilation scripts as 
necessary and added a new jar-library.sh script to compile and 
archive all source libraries at once. Updated the documentation notes 
on the Web site to reflect the more limited source code amendments 
for Sesame.

Finally, cleared out the JavaDoc directory to re-generate a set for 
the Subversion Web site that has the right content types all round. 

MKSearch 
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/index.html>
GNU JAXP 
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/jaxp/index.html>
JSpider 
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/jspider/index.html>
JTidy 
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/jtidy/index.html>
GNU Servlet API 
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/servlet/index.html>
Sesame 
<URL:https://svn.mkdoc.com/mksearch/doc/javadoc/sesame/index.html>

Tuesday 29
~~~~~~~~~~
Refactored the MKSearch plugin hierarchy and added a new 
XhtmlStoreWriterPlugin implementation, with coverage tests and mock 
objects. Amended AbstractRdfContentHandler to step through custom 
schemas without prefixes as a final fallback if no dot separator is 
found. Added custom schema map constructors to MetaStoreWriter and 
LinkStoreWriter and updated unit tests for coverage as necessary. 
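
The prefix fallback described above can be sketched roughly as
follows. This is one assumed reading of the behaviour, not the
AbstractRdfContentHandler code itself: the class SchemaFallbackSketch,
its methods and the idea of registering each unprefixed schema with a
set of known property names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not the MKSearch handler classes.
class SchemaFallbackSketch {

    // Prefix -> namespace, for names such as "DC.title".
    private final Map<String, String> prefixed =
            new LinkedHashMap<String, String>();

    // Namespace -> known property names, for custom schemas whose
    // meta names carry no prefix at all.
    private final Map<String, Set<String>> unprefixed =
            new LinkedHashMap<String, Set<String>>();

    void addPrefixed(String prefix, String namespace) {
        prefixed.put(prefix, namespace);
    }

    void addUnprefixed(String namespace, Set<String> names) {
        unprefixed.put(namespace, names);
    }

    // Returns a property URI for the meta name, or null if no
    // registered schema recognises it.
    String resolve(String metaName) {
        int dot = metaName.indexOf('.');
        if (dot > 0) {
            // Usual case: prefix and local name split on the dot.
            String ns = prefixed.get(metaName.substring(0, dot));
            return ns == null ? null : ns + metaName.substring(dot + 1);
        }
        // Final fallback: no dot separator, so step through the
        // unprefixed custom schemas in registration order.
        for (Map.Entry<String, Set<String>> e : unprefixed.entrySet()) {
            if (e.getValue().contains(metaName)) {
                return e.getKey() + metaName;
            }
        }
        return null;
    }
}
```

With "DC" mapped to the Dublin Core namespace, resolve("DC.title")
takes the prefix branch, while a bare name such as "reviewDate" is
only matched if one of the unprefixed schemas lists it.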

GCJ-compiled and Sun javac-compiled code now running under Sun java.

Wednesday 30
~~~~~~~~~~~~
Found an inconsistency in the way GNU fastjar builds JAR files: it 
creates lowercase meta-inf paths, which prevents the service 
providers property configuration in JAXP from loading any SAX driver. 
Amended the JAR command line to suppress the default behaviour, which 
allowed the MKSearch indexer to run under GNU/Linux. Made a number of 
indexing runs against the static test site to confirm this, and wrote 
up some notes on how to build and run MKSearch with GCJ:

<URL:http://www.mksearch.mkdoc.org/howto/build-mksearch-with-gcj/>
<URL:http://www.mksearch.mkdoc.org/howto/run-the-mksearch-indexer/>
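
Entry names inside a JAR are case-sensitive, so a service provider
file stored under meta-inf/ is not found by a lookup for
META-INF/services/. A quick way to check an archive for this is
sketched below. The class JarCaseCheck is a hypothetical helper, not
part of the MKSearch build scripts, and it only reports the problem;
it does not show the fastjar option that was changed.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper; not part of the MKSearch build scripts.
class JarCaseCheck {

    private static final String EXPECTED = "META-INF/";

    // Lists entries that sit under a META-INF directory whose name
    // is not in the expected upper case.
    static List<String> miscasedEntries(File jar) throws IOException {
        List<String> bad = new ArrayList<String>();
        ZipFile zip = new ZipFile(jar);
        try {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                boolean metaInf = name.regionMatches(
                        true, 0, EXPECTED, 0, EXPECTED.length());
                if (metaInf && !name.startsWith(EXPECTED)) {
                    bad.add(name);
                }
            }
        } finally {
            zip.close();
        }
        return bad;
    }

    // Usage: java JarCaseCheck first.jar second.jar ...
    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            for (String name : miscasedEntries(new File(arg))) {
                System.out.println(arg + ": " + name);
            }
        }
    }
}
```

An archive that prints nothing has no miscased META-INF entries.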

Thursday 31
~~~~~~~~~~~
Created a first draft outline of a store manager interface, or 
"Checker" component, and a ProspectiveSubjectManagerPlugin to purge 
the triple store during indexing.


