[MKSearch-dev] Week 21 round up
Phil Shaw
phil at mkdoc.com
Wed Mar 2 19:22:45 GMT 2005
I finished work on the components for a general purpose XHTML indexer
and started bringing the crawler up to date. Spent lots more time
picking through government metadata specifications and I'm pleased to
say I'm over the worst of it. Still some gremlins to sort out.
Best regards,
Phil
Monday
~~~~~~
Created a test suite for the new UKeGMS class, which required some
additions to the Schema interface and updates to the
DublinCoreElements and DublinCoreTerms schemas. Testing identified
some minor oversights in the Dublin Core schemas.
Tuesday
~~~~~~~
Added tests for the new Schema getPrefix methods. Re-factored many
SAX and JSpider classes to share more common code and create a new
general purpose XhtmlTripleWriterPlugin. Added test cases for
XhtmlMetadataFilter and an addSchema method to the RDFHandler types.
Also renamed various SAX and JSpider classes to simplify them.
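
For readers unfamiliar with the prefix idea, here is a rough sketch of
how a schema prefix can map an XHTML meta name such as "DC.title" to a
full property URI. It is an illustration only: the Schema, SimpleSchema
and SchemaRegistry types, the getNamespace method and the resolve
method are assumptions made for this example and are not the MKSearch
classes; only the getPrefix and addSchema names come from the notes
above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical schema description; not the MKSearch Schema interface. */
interface Schema {
    String getPrefix();     // prefix used in meta names, e.g. "DC"
    String getNamespace();  // namespace URI the prefix stands for (assumed method)
}

/** Minimal immutable Schema implementation for the sketch. */
final class SimpleSchema implements Schema {
    private final String prefix;
    private final String namespace;

    SimpleSchema(String prefix, String namespace) {
        this.prefix = prefix;
        this.namespace = namespace;
    }

    public String getPrefix() { return prefix; }
    public String getNamespace() { return namespace; }
}

/** Hypothetical holder for registered schemas, keyed by lower-cased prefix. */
final class SchemaRegistry {
    private final Map<String, Schema> byPrefix = new LinkedHashMap<>();

    void addSchema(Schema schema) {
        byPrefix.put(schema.getPrefix().toLowerCase(Locale.ROOT), schema);
    }

    /**
     * Resolves a meta name like "DC.title" to namespace + "title".
     * Returns null when there is no prefix or the prefix is not registered.
     */
    String resolve(String metaName) {
        int dot = metaName.indexOf('.');
        if (dot <= 0) {
            return null;
        }
        String prefix = metaName.substring(0, dot).toLowerCase(Locale.ROOT);
        Schema schema = byPrefix.get(prefix);
        return schema == null ? null : schema.getNamespace() + metaName.substring(dot + 1);
    }
}

final class SchemaPrefixDemo {
    public static void main(String[] args) {
        SchemaRegistry registry = new SchemaRegistry();
        // The Dublin Core element set namespace is the published DCMI URI.
        registry.addSchema(new SimpleSchema("DC", "http://purl.org/dc/elements/1.1/"));
        System.out.println(registry.resolve("DC.title"));
        // No e-GMS schema registered here, so this lookup yields null.
        System.out.println(registry.resolve("eGMS.audience"));
    }
}
```

Keying the registry on a lower-cased prefix is a design choice for the
sketch, so that "DC.title" and "dc.title" resolve alike; whether the
real classes are case-sensitive is not stated here.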
Wednesday
~~~~~~~~~
Added the Hansel test coverage tool and supporting BCEL package to
the optional library directory so that it is properly part of the
project build system. Updated the JSpider "triple" configuration to
use the new general purpose XhtmlTripleWriterPlugin with a custom
UKeGMS Schema and tested it, which required a few minor tweaks.
Added a target to the Ant build script to create a GNU Servlet API
JAR and added it to the library JAR target dependencies. Finally,
created a set of 70 e-GMS test documents, ran the crawler over them
and extracted no metadata! Something to look at next week...
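
When a crawl comes back empty like this, one quick sanity check is to
confirm that the test documents expose name/content pairs to a plain
SAX parser at all, independently of the crawler. The sketch below
assumes nothing about the MKSearch or JSpider code: MetaCollector and
its collect method are hypothetical names, and the sample meta names
are illustrative only.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: collects <meta name="..." content="..."> pairs. */
final class MetaCollector extends DefaultHandler {
    final Map<String, String> metadata = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The default factory is not namespace-aware, so localName may be
        // empty; qName carries the element name as written in the document.
        if ("meta".equals(qName)) {
            String name = atts.getValue("name");
            String content = atts.getValue("content");
            if (name != null && content != null) {
                metadata.put(name, content);
            }
        }
    }

    /** Parses an XHTML string and returns the meta pairs in document order. */
    static Map<String, String> collect(String xhtml) {
        MetaCollector handler = new MetaCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return handler.metadata;
    }
}

final class MetaCollectorDemo {
    public static void main(String[] args) {
        // Sample document without a DOCTYPE, so no DTD is fetched.
        String doc = "<html><head>"
                + "<meta name=\"DC.title\" content=\"Test page\"/>"
                + "<meta name=\"eGMS.accessibility\" content=\"AA\"/>"
                + "</head><body/></html>";
        System.out.println(MetaCollector.collect(doc));
    }
}
```

If a check of this kind finds the expected pairs, the problem is more
likely in the plugin configuration or schema matching than in the test
documents themselves.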