[MKSearch-dev] Week 48 and 49 round up
Phil Shaw
phil at mkdoc.com
Fri Sep 16 13:56:42 BST 2005
With a bit of help from the Sesame developers, I now have the query
system working. In principle, it will combine any number of Dublin
Core predicates selected by the query builder, though I haven't
tested it to destruction. It now makes sensible distinctions between
URI and other queries. It will also match documents whether the
primary data had an explicit encoding scheme or not; for example, it
makes a union of this:
{targetUri} dc:identifier {dcIdentifier}
... and this:
{targetUri} dc:identifier {} rdf:value {dcIdentifier}
Anyway, that's the last major task before packaging this up for a
beta release. I want to look again at JSpider configuration, and I
would like to get the whole thing running on FC4, so it may be a few
weeks yet.
Warning, this is a long report...
Best regards,
Phil
Tuesday 6 September
~~~~~~~~~~~~~~~~~~~
Had to remove a whole section of the GNU JAXP release because it
contained numerous coding errors, which have since been removed from
the merged Classpath version. Completed and stabilised JAXP
compilation on Cygwin, Windows and FC3.
Removed build directory from Ant build classpath.
Wednesday 7 September
~~~~~~~~~~~~~~~~~~~~~~
Added execute permissions to all bash scripts in Subversion. Worked
around a further GCJ compilation error in Sesame under FC3.
Thursday 8 September
~~~~~~~~~~~~~~~~~~~~
Added a statement to the AbstractRepositoryManager class to create
any parent directories for a repository storage file. Adjusted the
test case as necessary. Fixes the scenario where users have unpacked
a source distribution or checked out the Subversion source, which
does not include an output directory.
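As a rough illustration of the parent directory fix above, here is a
minimal sketch using only java.io. The StorageFileSketch class and the
prepareStorageFile method are hypothetical names, not the actual
AbstractRepositoryManager code.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper: the class and method names are illustrative only.
class StorageFileSketch {

    /** Creates any missing parent directories, then the storage file itself. */
    static File prepareStorageFile(File storageFile) {
        File parent = storageFile.getAbsoluteFile().getParentFile();
        // mkdirs() returns false when the directory already exists,
        // so test isDirectory() first to avoid a spurious failure.
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new UncheckedIOException(
                    new IOException("Could not create directory: " + parent));
        }
        try {
            storageFile.createNewFile(); // leaves an existing file untouched
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return storageFile;
    }

    public static void main(String[] args) {
        // A fresh directory under the system temp dir stands in for a
        // newly unpacked source tree with no output directory.
        File base = new File(System.getProperty("java.io.tmpdir"),
                "mksearch-sketch-" + System.nanoTime());
        File target = new File(base, "output/repository/index.rdf");
        prepareStorageFile(target);
        System.out.println("created: " + target.isFile());
    }
}
```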
Added a getAttributeValue method to AbstractContentHandler to trim
leading and trailing white space from attribute values during
indexing. Modified all the meta/link writers to acquire their values
through the new method and added tests for these cases.
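The trimming behaviour described above might look something like this
sketch, which uses the standard SAX Attributes interface. The class
name and the single-argument lookup by qualified name are assumptions;
the real AbstractContentHandler method signature is not shown here.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical stand-in for the method added to AbstractContentHandler.
class AttributeTrimSketch {

    /** Returns the attribute value without leading or trailing white space, or null if absent. */
    static String getAttributeValue(Attributes attributes, String qName) {
        String value = attributes.getValue(qName);
        return value == null ? null : value.trim();
    }

    public static void main(String[] args) {
        AttributesImpl atts = new AttributesImpl();
        // addAttribute(uri, localName, qName, type, value)
        atts.addAttribute("", "content", "content", "CDATA", "  Metadata search  ");
        System.out.println("[" + getAttributeValue(atts, "content") + "]");
        // prints "[Metadata search]"
    }
}
```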
Monday 12 September
~~~~~~~~~~~~~~~~~~~~
Adapted AbstractResultRenderer to implement ResultRenderer, to access
the FULL constant. Introduced two-argument versions of
renderQueryResult to the interface and abstract superclass that pass
through calls with a default FULL mode argument.
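The pass-through arrangement above can be sketched as follows. All
type names, parameter types and constant values here are assumptions
for illustration; only the idea of a two-argument renderQueryResult
delegating with a default FULL mode comes from the change itself.

```java
// All names, parameter types and constant values here are assumptions.
interface ResultRendererSketch {
    int FULL = 0;   // assumed value for the full rendering mode
    int BRIEF = 1;  // hypothetical alternative mode

    String renderQueryResult(String query, String rawResult, int mode);

    String renderQueryResult(String query, String rawResult);
}

abstract class AbstractResultRendererSketch implements ResultRendererSketch {
    // Implementing the interface makes the FULL constant visible here,
    // so the short form can pass it through as the default mode.
    public String renderQueryResult(String query, String rawResult) {
        return renderQueryResult(query, rawResult, FULL);
    }
}

class PlainTextRendererSketch extends AbstractResultRendererSketch {
    public String renderQueryResult(String query, String rawResult, int mode) {
        String label = (mode == FULL) ? "full" : "brief";
        return "[" + label + "] " + query + " -> " + rawResult;
    }

    public static void main(String[] args) {
        ResultRendererSketch renderer = new PlainTextRendererSketch();
        System.out.println(renderer.renderQueryResult("metadata", "3 results"));
        // prints "[full] metadata -> 3 results"
    }
}
```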
Introduced an appendTerm method to re-instate the (hex-encoded)
quotes around query phrases when reconstructing the query URL. Fixes
phrase search page navigation problems and RSS channel URL.
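A minimal sketch of the quote handling above, assuming that a term
containing a space counts as a phrase and that %22 is the hex-encoded
quote. The class name, the "search?q=" prefix and the phrase test are
illustrative guesses, not the actual appendTerm implementation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch; only the method name appendTerm comes from the report.
class QueryUrlSketch {

    static final String HEX_QUOTE = "%22"; // hex-encoded double quote

    /** Appends a URL-encoded term, wrapping phrases in hex-encoded quotes. */
    static StringBuilder appendTerm(StringBuilder url, String term) {
        String trimmed = term.trim();
        // Assumption: any term with an internal space was a quoted phrase.
        boolean phrase = trimmed.indexOf(' ') >= 0;
        String encoded;
        try {
            encoded = URLEncoder.encode(trimmed, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
        if (phrase) {
            url.append(HEX_QUOTE).append(encoded).append(HEX_QUOTE);
        } else {
            url.append(encoded);
        }
        return url;
    }

    public static void main(String[] args) {
        StringBuilder url = new StringBuilder("search?q=");
        appendTerm(url, "dublin core");
        url.append('+');
        appendTerm(url, "metadata");
        System.out.println(url);
        // prints "search?q=%22dublin+core%22+metadata"
    }
}
```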
Introduced a handleDataQuery method to deal with RSS and other "pure
data" query types through the QueryFactory and ResultRenderer
interfaces. Any new data output types can now be handled simply by
adapting the QueryFactory.
Changed the QueryBuilderTag attributes to separate queryUrl and
builderUrl values. Updated relevant JSP pages and tag library
definition. Temporarily removed the RDF data output option from the
query form and added a results per page selector.
Changed the Tomcat server configuration to listen on port 80.
Tuesday 13 September
~~~~~~~~~~~~~~~~~~~~
Added extended query attribute to query result tags, tested and
temporarily set to "false" pending further development. Added the new
attribute to the tag library descriptor. Removed RSS result page from
server configuration.
Added an extended search parameter to AbstractQueryBuilder, with
conditional handling in key methods. Also added namespace
declarations to the addTitleExpression and addSummaryExpression
methods to fix the case where the user's query does not include DC
Element predicates. Changed combination of conditional statements to
AND, rather than OR.
Added a new constructor to ServletQueryBuilder to request an extended
query capability. Applied the extended query parameters to the
HttpQuery class and QueryResultTag.
Added a hasEncodingScheme method to the SchemaProperty interface in
preparation to handle extended queries better. Applied the new method
through the AbstractSchemaProperty class and added constructors to
the three core implementations to carry it through: DCElementProperty,
DCTermProperty, UKeGMSProperty. Updated the DublinCoreElements and
DublinCoreTerms classes to specify the properties that may take an
encoding scheme, including conformsTo, created, format, hasPart,
hasVersion, identifier, isVersionOf, language, modified, references
and type.
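To illustrate the encoding scheme flag above, here is a hedged sketch
of how a property might carry it. The interface and class names, the
getName method and the factory method are all assumptions; the set of
property names is only the partial list given above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical shapes; only hasEncodingScheme is named in the report.
interface SchemaPropertySketch {
    String getName();
    boolean hasEncodingScheme();
}

class SimplePropertySketch implements SchemaPropertySketch {
    private final String name;
    private final boolean encodingScheme;

    // The extra constructor argument carries the flag through.
    SimplePropertySketch(String name, boolean encodingScheme) {
        this.name = name;
        this.encodingScheme = encodingScheme;
    }

    public String getName() { return name; }
    public boolean hasEncodingScheme() { return encodingScheme; }
}

class DublinCoreSketch {
    // Partial list taken from the report text ("including ...").
    static final Set<String> SCHEME_PROPERTIES = new HashSet<>(Arrays.asList(
            "conformsTo", "created", "format", "hasPart", "hasVersion",
            "identifier", "isVersionOf", "language", "modified",
            "references", "type"));

    static SchemaPropertySketch property(String name) {
        return new SimplePropertySketch(name, SCHEME_PROPERTIES.contains(name));
    }

    public static void main(String[] args) {
        System.out.println(property("identifier").hasEncodingScheme()); // true
        System.out.println(property("title").hasEncodingScheme());      // false
    }
}
```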
Amended the RepositoryQuery class to take a file input for testing
SeRQL queries more conveniently. Added more specific error reporting
to help debug servlet based query echo. Removed leading spaces from
the subject fields of the working MKSearch RDF/XML index.
Wednesday 14 September
~~~~~~~~~~~~~~~~~~~~~~
Made various draft SeRQL queries to test changes to the
AbstractQueryBuilder.
Changed the implementation of AbstractQueryBuilder to use a map of
SchemaProperty to List of query terms. This partly simplifies query
construction and also allows union joins of select statements with
extended bNode path expressions. Removed much of the former OR-query
based extended query mechanisms and extended the addCondition method
to handle the new construction scheme. Removed the original toString
method in favour of a getSeRQLQuery(boolean) method.
The new getSeRQLQuery method uses some existing methods and some new
methods to iterate through the SchemaProperty map to create the
query. A hasExtendableProperties method checks whether a union query is
actually necessary when one has been requested; appendPrimaryExpressions adds basic
path expressions; appendSecondaryExpressions adds extended bNode path
expressions; appendConditions forms AND conditions between
predicates, and OR conditions amongst multiple query terms on the
same predicate. Various loops and counts add parentheses, commas and
other syntax where necessary.
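The construction scheme above can be sketched roughly as follows. This
is not the AbstractQueryBuilder code: the class name, the use of plain
strings instead of SchemaProperty keys, and the LIKE "*term*" pattern
are assumptions, and the output has not been checked against a SeRQL
parser. The two path expression shapes are the ones quoted at the top
of this report.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch keyed on variable names rather than SchemaProperty.
class SerqlBuilderSketch {

    /** Basic path expression: the value hangs directly off the predicate. */
    static String primaryExpression(String predicate, String variable) {
        return "{targetUri} " + predicate + " {" + variable + "}";
    }

    /** Extended bNode path expression: the value sits behind rdf:value. */
    static String secondaryExpression(String predicate, String variable) {
        return "{targetUri} " + predicate + " {} rdf:value {" + variable + "}";
    }

    /** AND between variables, OR amongst terms on the same variable. */
    static String appendConditions(Map<String, List<String>> termsByVariable) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (Map.Entry<String, List<String>> entry : termsByVariable.entrySet()) {
            List<String> terms = entry.getValue();
            if (terms.isEmpty()) {
                continue;
            }
            if (!first) {
                sb.append(" AND ");
            }
            first = false;
            boolean group = terms.size() > 1; // parentheses only when needed
            if (group) {
                sb.append('(');
            }
            for (int i = 0; i < terms.size(); i++) {
                if (i > 0) {
                    sb.append(" OR ");
                }
                // Assumed pattern form; no escaping of quotes is attempted.
                sb.append(entry.getKey()).append(" LIKE \"*")
                  .append(terms.get(i)).append("*\"");
            }
            if (group) {
                sb.append(')');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> terms = new LinkedHashMap<>();
        terms.put("dcTitle", Arrays.asList("search", "index"));
        terms.put("dcCreator", Arrays.asList("shaw"));
        System.out.println(primaryExpression("dc:identifier", "dcIdentifier"));
        System.out.println(secondaryExpression("dc:identifier", "dcIdentifier"));
        System.out.println(appendConditions(terms));
    }
}
```

A LinkedHashMap keeps the predicates in insertion order, which makes
the generated text predictable when comparing it against draft queries.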
Adapted ServletQueryBuilder to use the new getSeRQLQuery method. Set
the extended properties of the JSP QueryResultTag and HttpQuery
servlet to true to activate extended bNode union queries. Made
various amendments and additions to the SeRQL test scripts to plan
the new query scheme.
Thursday 15 September
~~~~~~~~~~~~~~~~~~~~~
Added a hasUriEncodingScheme method to the SchemaProperty interface and
implemented this in the AbstractSchemaProperty superclass. Added good
citizen constructors to the Dublin Core and UK e-GMS SchemaProperty
types, and static constructor methods in their respective Schema
classes. Updated the Schema types to add this new flag where
appropriate.
Added an isUriQuery method to AbstractQueryBuilder to apply WHERE
conditions directly to URIs, rather than treating them as literal types.
Added new test cases to check URI handling. Tidied up some JavaDoc
comments in various places.
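One possible reading of the URI handling above, as a sketch only. The
decision rule (the property takes a URI encoding scheme and the term
parses as an absolute URI), the class name and the exact condition
syntax are assumptions rather than the real isUriQuery logic.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch; the real method signature is not shown in the report.
class UriQuerySketch {

    /** Assumed rule: URI-scheme property plus a term that is an absolute URI. */
    static boolean isUriQuery(boolean hasUriEncodingScheme, String term) {
        if (!hasUriEncodingScheme) {
            return false;
        }
        try {
            return new URI(term.trim()).isAbsolute();
        } catch (URISyntaxException e) {
            return false; // e.g. a phrase containing spaces
        }
    }

    /** Compares URIs directly; falls back to a literal pattern match. */
    static String condition(String variable, String term, boolean hasUriEncodingScheme) {
        String trimmed = term.trim();
        if (isUriQuery(hasUriEncodingScheme, trimmed)) {
            return variable + " = <" + trimmed + ">";
        }
        return variable + " LIKE \"*" + trimmed + "*\""; // assumed pattern form
    }

    public static void main(String[] args) {
        System.out.println(condition("dcIdentifier", "http://www.mkdoc.org/", true));
        System.out.println(condition("dcIdentifier", "mkdoc", true));
    }
}
```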
Friday 16 September
~~~~~~~~~~~~~~~~~~~~
New screen shots of the query system.
--
MKSearch (alpha)
http://www.mksearch.mkdoc.org/
Free, open source metadata search engine with RDF storage and query.