[MKSearch-dev] Week 57-61 round up
Phil Shaw
phil at mkdoc.com
Tue Dec 13 16:58:40 GMT 2005
It's been a while since I did a summary of progress, so this is a
long post.
Soon after the beta 1 release I started contacting government Web
managers about their Information Asset Register records with a view
to creating an MKSearch alternative to the Inforoute service. It was
surprising how difficult it was to gather the relevant URLs, which I
have added to the test site as a convenient start point for indexing.
http://test.mksearch.mkdoc.org/iar/index.html
I have created an XMLReader for IAR records that emits SAX events as
if the data were marked up as XHTML. That means it can be dropped in
place to work with the existing content handlers for DC and e-GMS
metadata.
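As a rough illustration of the idea, a plain text record can be reported to a SAX ContentHandler as if it were an XHTML head full of meta elements. This is only a sketch under assumptions: the "Name: value" line format, the IarRecordSketch class and its method names are hypothetical and are not taken from the MKSearch source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: reports "Name: value" lines as XHTML meta elements. */
class IarRecordSketch {

    static final String XHTML = "http://www.w3.org/1999/xhtml";

    /** Reads the record line by line and issues SAX events to the handler. */
    static void parse(BufferedReader in, ContentHandler handler)
            throws IOException, SAXException {
        AttributesImpl none = new AttributesImpl();
        handler.startDocument();
        handler.startElement(XHTML, "html", "html", none);
        handler.startElement(XHTML, "head", "head", none);
        String line;
        while ((line = in.readLine()) != null) {
            int colon = line.indexOf(':');
            if (colon < 1) {
                continue; // assumed format: skip lines without a field name
            }
            AttributesImpl atts = new AttributesImpl();
            atts.addAttribute("", "name", "name", "CDATA",
                    line.substring(0, colon).trim());
            atts.addAttribute("", "content", "content", "CDATA",
                    line.substring(colon + 1).trim());
            handler.startElement(XHTML, "meta", "meta", atts);
            handler.endElement(XHTML, "meta", "meta");
        }
        handler.endElement(XHTML, "head", "head");
        handler.endElement(XHTML, "html", "html");
        handler.endDocument();
    }

    /** Demonstration handler: collects "name=content" for each meta element. */
    static List<String> collect(String record) throws IOException, SAXException {
        final List<String> seen = new ArrayList<String>();
        parse(new BufferedReader(new StringReader(record)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) {
                if ("meta".equals(local)) {
                    seen.add(atts.getValue("name") + "=" + atts.getValue("content"));
                }
            }
        });
        return seen;
    }
}
```

Because the handler only ever sees element events, a content handler written for XHTML pages should not need to know that the source was a text file.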
I created a basic file system spider to test the IAR records. This
runs like lightning, but is only designed to process plain text IAR
records so far.
I also added a store browse interface to the query front end
(detailed in a previous post).
Best regards,
Phil
14 to 18 November
~~~~~~~~~~~~~~~~~
Created a new StoreBrowseTag to provide directory-style browsing of
store content by schema properties. Added a new
getString(propertyName, row) method to the QueryResult interface and
updated all concrete types as necessary: StoreQueryResult,
NullQueryResult and MockQueryResult.
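The shape of that change might look something like the sketch below. The parameter types (a String property name and an int row index), the getRowCount method, the empty-string fallback and all of the "Sketch" type names are assumptions made for illustration; the real QueryResult interface may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, reduced view of a query result with the new accessor. */
interface QueryResultSketch {
    int getRowCount();

    /** Returns the value of the named property in the given row. */
    String getString(String propertyName, int row);
}

/** Null-object variant: no rows, and every lookup yields an empty string. */
class NullResultSketch implements QueryResultSketch {
    public int getRowCount() {
        return 0;
    }

    public String getString(String propertyName, int row) {
        return "";
    }
}

/** Mock variant for unit tests, backed by an in-memory list of rows. */
class MockResultSketch implements QueryResultSketch {
    private final List<Map<String, String>> rows =
            new ArrayList<Map<String, String>>();

    /** Appends a copy of the given property map as a new row. */
    void addRow(Map<String, String> row) {
        rows.add(new HashMap<String, String>(row));
    }

    public int getRowCount() {
        return rows.size();
    }

    public String getString(String propertyName, int row) {
        if (row < 0 || row >= rows.size()) {
            return ""; // assumed behaviour for an out-of-range row
        }
        String value = rows.get(row).get(propertyName);
        return value == null ? "" : value;
    }
}
```

A browse tag could then iterate from 0 to getRowCount() and ask for each schema property by name, without caring which concrete result type it was given.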
Added a StoreBrowseTag to the MKSearch application Tag Library
Descriptor and demonstration home page. Added a set of CSS rules to
the layout style sheet to float browse indexes vertically down the
page, depending on how many columns there are: .Whole, .Half, .Third,
.Quarter, .Fifth.
Refactored HttpQuery and QueryResultTag to introduce new
AbstractStoreQueryServlet and AbstractQueryTag superclasses. Added
the new AbstractStoreQueryServlet to the coverage test suite for
HttpQuery. Added AbstractQueryTag to the coverage test suite for
QueryResultTag. Required some additions to the QueryResultTagTest to
complete. Also modified QueryBuilderTag to inherit.
Checked in some draft SeRQL queries used in the development of
StoreBrowseTag and posted some screen shots.
5 to 9 December
~~~~~~~~~~~~~~~
Completed the first working draft of the UKIarReader class and
created a TextFileFilter for processing files locally.
Checked in a new IAR index for the test site as a start point for
indexing text records. Also checked in local copies of the IAR
records used for testing.
Created a working draft FileSpider implementation that recurses
through directory structures parsing files. Required a new
FileApplicationContext to pass to the relevant StoreManager type.
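The recursion itself can be sketched in a few lines. This is not the FileSpider source: the FileSpiderSketch class, its crawl and visited methods, and the choice to simply record each file rather than parse and store it are assumptions for illustration.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a spider that walks a directory tree depth first. */
class FileSpiderSketch {

    private final FileFilter filter;
    private final List<File> visited = new ArrayList<File>();

    FileSpiderSketch(FileFilter filter) {
        this.filter = filter;
    }

    /** Recurses into directories and records every accepted plain file. */
    void crawl(File start) {
        if (start.isDirectory()) {
            File[] children = start.listFiles(filter);
            if (children == null) {
                return; // directory could not be read
            }
            Arrays.sort(children); // stable order makes runs repeatable
            for (File child : children) {
                crawl(child);
            }
        } else if (start.isFile()) {
            // A real spider would parse the file and pass it to the store here.
            visited.add(start);
        }
    }

    List<File> visited() {
        return visited;
    }
}
```

Note that the filter has to accept directories as well as files, otherwise the walk never descends below the start point.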
Made some minor modifications to insert a space in place of a new
line in field content. Also added a conditional check that both field
name and content are present before issuing the SAX events for the
data.
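A minimal sketch of those two checks, assuming they can be expressed as small static helpers. The FieldSketch class and its method names are hypothetical, and treating whitespace-only values as missing is an assumption rather than something taken from the reader class.

```java
/** Hypothetical helpers for cleaning a field before SAX events are issued. */
class FieldSketch {

    /** Replaces each line break (CRLF, CR or LF) with a single space. */
    static String normalise(String content) {
        return content.replaceAll("\\r\\n|\\r|\\n", " ");
    }

    /** True only when both the field name and its content are present. */
    static boolean isComplete(String name, String content) {
        return name != null && content != null
                && name.trim().length() > 0
                && content.trim().length() > 0;
    }
}
```

A reader would call isComplete first and only issue the events for a field when it returns true, so half-empty fields never reach the content handlers.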
Amended the AbstractFileStoreManager to recognise file protocol URLs
when issuing storage file references for the FileSpider.
Added a new setParameter method to the ApplicationContext interface.
Added a parameter map and an implementation of the new method to
AbstractApplicationContext. Adapted the PluginApplicationContext and
ServletApplicationContext classes to look-up the parameter map before
checking the primary data source for their respective types.
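The look-up order described here could be sketched as follows. The class names, the getParameter and lookupPrimary methods, the use of java.util.Properties as a stand-in primary source and the choice of IllegalArgumentException for null arguments are all assumptions for illustration, not the MKSearch API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical base class: mapped parameters take precedence. */
abstract class AbstractContextSketch {

    private final Map<String, String> parameters = new HashMap<String, String>();

    /** Stores an override that is consulted before the primary source. */
    public void setParameter(String name, String value) {
        if (name == null || value == null) {
            throw new IllegalArgumentException("name and value are required");
        }
        parameters.put(name, value);
    }

    /** Checks the parameter map first, then falls back to the subclass. */
    public String getParameter(String name) {
        if (name == null) {
            throw new IllegalArgumentException("name is required");
        }
        String mapped = parameters.get(name);
        return mapped != null ? mapped : lookupPrimary(name);
    }

    /** Subclasses read from their own source, e.g. servlet init parameters. */
    protected abstract String lookupPrimary(String name);
}

/** Stand-in concrete type whose primary source is a Properties object. */
class PropertiesContextSketch extends AbstractContextSketch {

    private final Properties source;

    PropertiesContextSketch(Properties source) {
        this.source = source;
    }

    protected String lookupPrimary(String name) {
        return source.getProperty(name);
    }
}
```

In this sketch the fallback lives in the abstract class rather than in each subclass; either arrangement gives the same precedence, with a value set at run time winning over the configured one.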
Amended the TextFileFilter to ignore Subversion working copy
directories (.svn) in file paths.
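A filter of this kind can be written against the standard java.io.FileFilter interface. The sketch below is hypothetical: the TextFileFilterSketch name and the assumption that text records are recognised by a .txt suffix are illustrative only.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.Locale;

/** Hypothetical sketch: accepts text files, skips anything under .svn. */
class TextFileFilterSketch implements FileFilter {

    public boolean accept(File file) {
        // Reject the path if any of its segments is a .svn directory.
        for (File part = file; part != null; part = part.getParentFile()) {
            if (".svn".equals(part.getName())) {
                return false;
            }
        }
        // Directories pass so that a recursive walk can descend into them.
        if (file.isDirectory()) {
            return true;
        }
        // Assumed convention: text records carry a .txt suffix.
        return file.getName().toLowerCase(Locale.ROOT).endsWith(".txt");
    }
}
```

Checking every path segment, rather than only the file name, also excludes the administrative files that Subversion keeps inside each .svn directory.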
Created a batch script to run the new draft file spider application.
13 December
~~~~~~~~~~~
Added null argument checks to AbstractApplicationContext and extended
JUnit tests to cover the latest mapped parameter methods. Added tests
to PluginApplicationContextTest and ServletApplicationContextTest to
cover the mapped parameter handling too. Removed
AbstractApplicationContext tests from the PluginApplicationContext
coverage test suite.
Switched the Subversion base for the indexing test site to the new
trunk version of the MKSearch source.
--
MKSearch (beta)
http://www.mksearch.mkdoc.org/
Free, open source metadata search engine with RDF storage and query.