[MKSearch-dev] Week 48 and 49 round up

Phil Shaw phil at mkdoc.com
Fri Sep 16 13:56:42 BST 2005


With a bit of help from the Sesame developers, I now have the query 
system working. In principle, it will combine any number of Dublin 
Core predicates selected by the query builder, though I haven't 
tested it to destruction. It now makes sensible distinctions between 
URI and other queries. It will also match documents whether the 
primary data had an explicit encoding scheme or not; for example, it 
makes a union of this:

{targetUri} dc:identifier {dcIdentifier}

... and this:

{targetUri} dc:identifier {} rdf:value {dcIdentifier}

Anyway, that's the last major task before packaging this up for a 
beta release. I want to look again at JSpider configuration, and I 
would like to get the whole thing running on FC4, so it may be a few 
weeks yet.

Warning, this is a long report...

Best regards,

Phil



Tuesday 6 September
~~~~~~~~~~~~~~~~~~~
Had to remove a whole section of the GNU JAXP release because it 
contained numerous coding errors, which have since been removed from 
the merged Classpath version. Completed and stabilised JAXP 
compilation on Cygwin, Windows and FC3.

Removed build directory from Ant build classpath. 

Wednesday 7 September
~~~~~~~~~~~~~~~~~~~~~
Added execute permissions to all bash scripts in Subversion. Worked 
around a further GCJ compilation error in Sesame under FC3.

Thursday 8 September 
~~~~~~~~~~~~~~~~~~~~ 
Added a statement to the AbstractRepositoryManager class to create 
any parent directories for a repository storage file. Adjusted the 
test case as necessary. Fixes the scenario where users have unpacked 
a source distribution or checked out the Subversion source, which 
does not include an output directory.  
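
As a rough illustration of the parent directory fix, the sketch below 
shows one way to create missing directories before a storage file is 
opened. The class and method names (StorageFiles, 
ensureParentDirectories) are hypothetical and this is not the actual 
AbstractRepositoryManager code.

```java
import java.io.File;
import java.io.IOException;

public class StorageFiles {

    /**
     * Creates any missing parent directories for the given storage file.
     * Hypothetical helper, not the MKSearch implementation.
     */
    public static void ensureParentDirectories(File storageFile) throws IOException {
        File parent = storageFile.getAbsoluteFile().getParentFile();
        // mkdirs() returns false when the directory already exists,
        // so check isDirectory() before treating that as a failure.
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("Could not create directory: " + parent);
        }
    }
}
```

Calling the helper twice is harmless, which suits a fresh checkout as 
well as an existing installation.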

Added a getAttributeValue method to AbstractContentHandler to trim 
leading and trailing white space from attribute values during 
indexing. Modified all the meta/link writers to acquire their values 
through the new method and added tests for these cases.  
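
A minimal sketch of the trimming idea, assuming a SAX Attributes 
object is available to the handler. The AttributeValues class and the 
static method signature are assumptions for illustration; the real 
getAttributeValue method in AbstractContentHandler may look different.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

public class AttributeValues {

    /**
     * Returns the named attribute value with leading and trailing
     * white space removed, or null if the attribute is absent.
     * Hypothetical stand-in for the getAttributeValue method.
     */
    public static String getAttributeValue(Attributes attributes, String qName) {
        String value = attributes.getValue(qName);
        return (value == null) ? null : value.trim();
    }

    public static void main(String[] args) {
        AttributesImpl attributes = new AttributesImpl();
        // addAttribute(uri, localName, qName, type, value)
        attributes.addAttribute("", "content", "content", "CDATA", "  Annual report \n");
        System.out.println("[" + getAttributeValue(attributes, "content") + "]");
    }
}
```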

Monday 12 September 
~~~~~~~~~~~~~~~~~~~
Adapted AbstractResultRenderer to implement ResultRenderer, to access 
the FULL constant. Introduced two-argument versions of 
renderQueryResult to the interface and abstract superclass that pass 
through calls with a default FULL mode argument.  
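
The pass-through arrangement might look something like the sketch 
below. Only the names ResultRenderer, AbstractResultRenderer, 
renderQueryResult and FULL come from the notes above; the parameter 
types, the value of FULL, the BRIEF mode and the PlainRenderer class 
are assumptions made up for the example.

```java
public class RendererSketch {

    /** Interface carrying the FULL constant; the value 0 is an assumption. */
    public interface ResultRenderer {
        int FULL = 0;
        int BRIEF = 1; // hypothetical second mode, for illustration only

        void renderQueryResult(String query, StringBuilder out);

        void renderQueryResult(String query, StringBuilder out, int mode);
    }

    /** Implementing the interface gives direct access to FULL. */
    public abstract static class AbstractResultRenderer implements ResultRenderer {
        public void renderQueryResult(String query, StringBuilder out) {
            // Pass the call through with the default mode.
            renderQueryResult(query, out, FULL);
        }
    }

    /** Toy renderer that records which mode it was called with. */
    public static class PlainRenderer extends AbstractResultRenderer {
        public void renderQueryResult(String query, StringBuilder out, int mode) {
            out.append(mode == FULL ? "full:" : "brief:").append(query);
        }
    }
}
```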

Introduced an appendTerm method to reinstate the (hex-encoded) 
quotes around query phrases when reconstructing the query URL. Fixes 
phrase search page navigation problems and the RSS channel URL.  
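
To show the idea, here is a small sketch of putting the hex-encoded 
quotes (%22) back around a phrase. It assumes that any term containing 
a space was originally a quoted phrase; the QueryUrlSketch class and 
the method signature are hypothetical, not the MKSearch code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryUrlSketch {

    /** Hex-encoded double quote as it appears in a URL. */
    private static final String QUOTE = "%22";

    /** Appends a URL-encoded term, wrapping phrases in encoded quotes. */
    public static void appendTerm(StringBuilder url, String term) {
        String encoded;
        try {
            encoded = URLEncoder.encode(term, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
        // Assumption: a term containing a space was a quoted phrase.
        boolean phrase = term.indexOf(' ') >= 0;
        if (phrase) {
            url.append(QUOTE);
        }
        url.append(encoded);
        if (phrase) {
            url.append(QUOTE);
        }
    }
}
```

Single words pass through unquoted, so only phrase searches gain the 
extra characters in page navigation links.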

Introduced a handleDataQuery method to deal with RSS and other "pure 
data" query types through the QueryFactory and ResultRenderer 
interfaces. Any new data output types can now be handled simply by 
adapting the QueryFactory.  

Changed the QueryBuilderTag attributes to separate queryUrl and 
builderUrl values. Updated relevant JSP pages and tag library 
definition. Temporarily removed the RDF data output option from the 
query form and added a results per page selector.  

Changed the Tomcat server configuration to listen on port 80.  

Tuesday 13 September
~~~~~~~~~~~~~~~~~~~~
Added extended query attribute to query result tags, tested and 
temporarily set to "false" pending further development. Added the new 
attribute to the tag library descriptor. Removed RSS result page from 
server configuration.  

Added an extended search parameter to AbstractQueryBuilder, with 
conditional handling in key methods. Also added namespace 
declarations to the addTitleExpression and addSummaryExpression 
methods to fix the case where the user's query does not include DC 
Element predicates. Changed the combination of conditional statements 
to AND, rather than OR.  

Added a new constructor to ServletQueryBuilder to request an extended 
query capability. Applied the extended query parameters to the 
HttpQuery class and QueryResultTag.  

Added a hasEncodingScheme method to the SchemaProperty interface in 
preparation for better handling of extended queries. Applied the new 
method through the AbstractSchemaProperty class and added 
constructors to the three core implementations to carry the flag 
through: DCElementProperty, DCTermProperty, UKeGMSProperty. Updated 
the DublinCoreElements and DublinCoreTerms classes to specify the 
properties that may take an encoding scheme, including conformsTo, 
created, format, hasPart, hasVersion, identifier, isVersionOf, 
language, modified, references and type.  
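
A sketch of how the flag could be carried through the class 
hierarchy. The type names follow the notes above, but the constructor 
signatures, the getName method and the wrapper class are assumptions 
for illustration only.

```java
public class EncodingSchemeSketch {

    public interface SchemaProperty {
        String getName();

        /** True if values may be wrapped in an encoding scheme node. */
        boolean hasEncodingScheme();
    }

    public abstract static class AbstractSchemaProperty implements SchemaProperty {
        private final String name;
        private final boolean encodingScheme;

        protected AbstractSchemaProperty(String name, boolean encodingScheme) {
            this.name = name;
            this.encodingScheme = encodingScheme;
        }

        public String getName() {
            return name;
        }

        public boolean hasEncodingScheme() {
            return encodingScheme;
        }
    }

    public static class DCElementProperty extends AbstractSchemaProperty {
        /** Assumed default: no encoding scheme unless stated. */
        public DCElementProperty(String name) {
            this(name, false);
        }

        public DCElementProperty(String name, boolean encodingScheme) {
            super(name, encodingScheme);
        }
    }
}
```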

Amended the RepositoryQuery class to take a file input for testing 
SeRQL queries more conveniently. Added more specific error reporting 
to help debug servlet based query echo. Removed leading spaces from 
the subject fields of the working MKSearch RDF/XML index.  

Wednesday 14 September 
~~~~~~~~~~~~~~~~~~~~~~ 
Made various draft SeRQL queries to test changes to the 
AbstractQueryBuilder.

Changed the implementation of AbstractQueryBuilder to use a map of 
SchemaProperty to List of query terms. This partly simplifies query 
construction and also allows union joins of select statements with 
extended bNode path expressions. Removed much of the former OR-query 
based extended query mechanisms and extended the addCondition method 
to handle the new construction scheme. Removed the original toString 
method in favour of a getSeRQLQuery(boolean) method.  

The new getSeRQLQuery method uses some existing methods and some new 
methods to iterate through the SchemaProperty map to create the 
query. The hasExtendableProperties method checks whether a union 
query is actually necessary when one has been requested; 
appendPrimaryExpressions adds basic path expressions; 
appendSecondaryExpressions adds extended bNode path expressions; 
appendConditions forms AND conditions between predicates, and OR 
conditions amongst multiple query terms on the same predicate. 
Various loops and counts add parentheses, commas and other syntax 
where necessary.  
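
The construction scheme can be sketched roughly as follows. This is a 
much simplified illustration, not the AbstractQueryBuilder code: it 
keys the map on plain predicate strings rather than SchemaProperty 
objects, assumes LIKE pattern matching for every term, sends every 
predicate through the bNode path in the second select, and omits 
namespace declarations and quote escaping. The generated text only 
approximates SeRQL and has not been run against Sesame.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SerqlSketch {

    /** One path expression per predicate, joined by commas. */
    public static String appendExpressions(Map<String, List<String>> terms, boolean viaBNode) {
        List<String> paths = new ArrayList<String>();
        int i = 0;
        for (String predicate : terms.keySet()) {
            String var = "v" + i++;
            paths.add(viaBNode
                    ? "{targetUri} " + predicate + " {} rdf:value {" + var + "}"
                    : "{targetUri} " + predicate + " {" + var + "}");
        }
        return String.join(", ", paths);
    }

    /** AND between predicates, OR amongst terms on the same predicate. */
    public static String appendConditions(Map<String, List<String>> terms) {
        List<String> groups = new ArrayList<String>();
        int i = 0;
        for (List<String> values : terms.values()) {
            String var = "v" + i++;
            List<String> alternatives = new ArrayList<String>();
            for (String value : values) {
                alternatives.add(var + " LIKE \"*" + value + "*\"");
            }
            groups.add("(" + String.join(" OR ", alternatives) + ")");
        }
        return String.join(" AND ", groups);
    }

    /** Builds a single select, or a union of two when extended is true. */
    public static String getSeRQLQuery(Map<String, List<String>> terms, boolean extended) {
        String where = " WHERE " + appendConditions(terms);
        String query = "SELECT targetUri FROM " + appendExpressions(terms, false) + where;
        if (extended) {
            query += " UNION SELECT targetUri FROM " + appendExpressions(terms, true) + where;
        }
        return query;
    }
}
```

An insertion-ordered map such as LinkedHashMap keeps the variable 
numbering the same in the path expressions and the conditions.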

Adapted ServletQueryBuilder to use the new getSeRQLQuery method. Set 
the extended properties of the JSP QueryResultTag and HttpQuery 
servlet to true to activate extended bNode union queries. Made 
various amendments and additions to the SeRQL test scripts to plan 
the new query scheme. 

Thursday 15 September
~~~~~~~~~~~~~~~~~~~~~
Added a hasUriEncodingScheme method to the SchemaProperty interface 
and implemented this in the AbstractSchemaProperty superclass. Added good 
citizen constructors to the Dublin Core and UK e-GMS SchemaProperty 
types, and static constructor methods in their respective Schema 
classes. Updated the Schema types to add this new flag where 
appropriate.  

Added an isUriQuery method to AbstractQueryBuilder so that WHERE 
conditions are applied directly to URIs rather than treating them as 
literals. Added new test cases to check URI handling. Tidied up some 
JavaDoc comments in various places.
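
By way of illustration, the distinction between URI and literal 
conditions might be drawn along these lines. The UriConditionSketch 
class, both method signatures, the absolute URI test and the LIKE 
pattern for literals are all assumptions; only the isUriQuery and 
hasUriEncodingScheme names come from the notes above.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriConditionSketch {

    /**
     * Treats a term as a URI query only when the property is flagged
     * as URI-valued and the term parses as an absolute URI.
     */
    public static boolean isUriQuery(boolean hasUriEncodingScheme, String term) {
        if (!hasUriEncodingScheme) {
            return false;
        }
        try {
            return new URI(term).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /** Compares URIs directly; matches other terms as literal patterns. */
    public static String condition(String var, String term, boolean uriQuery) {
        return uriQuery
                ? var + " = <" + term + ">"
                : var + " LIKE \"*" + term + "*\"";
    }
}
```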

Friday 16 September
~~~~~~~~~~~~~~~~~~~
New screen shots of the query system.

--
MKSearch (alpha)

http://www.mksearch.mkdoc.org/

Free, open source metadata search engine with RDF storage and query.

