[MKSearch-dev] Week 3 round up
Phil Shaw
phil at mkdoc.com
Thu Oct 28 15:30:40 BST 2004
I'm doing a long week to complete my days for October. I haven't
finished yet but I wanted to write up what I've done before I forget
it. I'm about to start delving more deeply into spider source code
and feel I need to clear the decks...
As usual, all comments are welcome.
Best regards,
Phil
Monday
------
I've been trying to create a build environment in which GNU/Linux and
Cygwin usage is as close as possible to Windows usage. To date I have
been having problems running Ant and JUnit -- JUnit compiles but does
not run with GIJ and the error messages aren't terribly helpful!
I spent some time trying to compile Hansel, a code coverage test
framework I like to use, which depends on some Apache tools. This is
where I decided to draw the line and take a layered approach to the
build framework.
I'll continue to investigate the problems running Ant/JUnit with the
GNU tools, but I separated out the Hansel tests and will keep the
other code conformance checks I like to run under Ant/Windows for the
time being.
http://hansel.sourceforge.net/
http://checkstyle.sourceforge.net/
http://pmd.sourceforge.net/
Started a more detailed package dependency analysis for the Web
spiders and RDF frameworks, to check where they rely on non-GPL code.
Modified jar scripts to take an implicit reference to the jar tool.
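The package dependency analysis mentioned above could be partly
automated. The following is only a rough sketch, not the method
actually used for MKSearch: the class name ImportScanner, its methods
and the idea of a "reviewed prefix" list are all assumptions made for
illustration. It lists the packages a Java source file imports and
filters out those whose licence has already been checked. It works on
plain text, so import-like lines inside comments would also be picked
up.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not taken from the MKSearch source tree. */
class ImportScanner {
    // Matches plain "import a.b.C;" and "import a.b.*;" lines and
    // captures the package part. Static imports are not matched.
    private static final Pattern IMPORT = Pattern.compile(
            "^\\s*import\\s+([\\w.]+?)\\.[\\w*]+\\s*;", Pattern.MULTILINE);

    /** Returns the sorted set of packages imported by one source file. */
    static Set<String> importedPackages(String source) {
        Set<String> packages = new TreeSet<>();
        Matcher m = IMPORT.matcher(source);
        while (m.find()) {
            packages.add(m.group(1));
        }
        return packages;
    }

    /** Returns the packages that match none of the reviewed prefixes. */
    static Set<String> unreviewed(Set<String> packages,
                                  String... reviewedPrefixes) {
        Set<String> result = new TreeSet<>();
        for (String pkg : packages) {
            boolean reviewed = false;
            for (String prefix : reviewedPrefixes) {
                if (pkg.startsWith(prefix)) {
                    reviewed = true;
                    break;
                }
            }
            if (!reviewed) {
                result.add(pkg);
            }
        }
        return result;
    }
}
```

Calling importedPackages on each source file and passing the combined
result to unreviewed, with prefixes such as "java." for packages
already cleared, would leave a short list of packages that still need
a licence check by hand.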
Tuesday
-------
Further package dependency analysis and licence research for Web
spiders and RDF frameworks. Completed the AbstractXMLReader class,
standard JUnit tests and Hansel coverage test. Updated the Ant build
script to the new layered testing scheme.
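The AbstractXMLReader source is not shown in this report, so the
following is only a guess at its general shape: a minimal sketch of
an abstract base class that implements the standard org.xml.sax.XMLReader
interface and leaves parse(InputSource) to subclasses. The class name
AbstractXMLReaderSketch and the map-based feature and property storage
are assumptions, not the MKSearch implementation.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.xml.sax.ContentHandler;
import org.xml.sax.DTDHandler;
import org.xml.sax.EntityResolver;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.XMLReader;

/** Hypothetical sketch: handles the XMLReader bookkeeping for subclasses. */
abstract class AbstractXMLReaderSketch implements XMLReader {
    private final Map<String, Boolean> features = new HashMap<>();
    private final Map<String, Object> properties = new HashMap<>();
    private ContentHandler contentHandler;
    private DTDHandler dtdHandler;
    private EntityResolver entityResolver;
    private ErrorHandler errorHandler;

    /** Subclasses read the input and fire SAX events on the content handler. */
    @Override
    public abstract void parse(InputSource input) throws IOException, SAXException;

    /** Convenience overload that delegates to parse(InputSource). */
    @Override
    public void parse(String systemId) throws IOException, SAXException {
        parse(new InputSource(systemId));
    }

    /** Only features that were set earlier are recognised. */
    @Override
    public boolean getFeature(String name) throws SAXNotRecognizedException {
        Boolean value = features.get(name);
        if (value == null) {
            throw new SAXNotRecognizedException(name);
        }
        return value;
    }

    @Override
    public void setFeature(String name, boolean value) {
        features.put(name, value);
    }

    /** Only properties that were set earlier are recognised. */
    @Override
    public Object getProperty(String name) throws SAXNotRecognizedException {
        if (!properties.containsKey(name)) {
            throw new SAXNotRecognizedException(name);
        }
        return properties.get(name);
    }

    @Override
    public void setProperty(String name, Object value) {
        properties.put(name, value);
    }

    @Override
    public void setContentHandler(ContentHandler handler) { contentHandler = handler; }

    @Override
    public ContentHandler getContentHandler() { return contentHandler; }

    @Override
    public void setDTDHandler(DTDHandler handler) { dtdHandler = handler; }

    @Override
    public DTDHandler getDTDHandler() { return dtdHandler; }

    @Override
    public void setEntityResolver(EntityResolver resolver) { entityResolver = resolver; }

    @Override
    public EntityResolver getEntityResolver() { return entityResolver; }

    @Override
    public void setErrorHandler(ErrorHandler handler) { errorHandler = handler; }

    @Override
    public ErrorHandler getErrorHandler() { return errorHandler; }
}
```

A concrete reader then only has to override parse(InputSource) and
call startDocument, startElement and so on, on getContentHandler(),
which also makes it easy to drive from a JUnit test with a recording
handler.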
Wednesday
---------
Added JTidy to the library source and created scripts to compile,
archive and run under GNU/Linux and Windows. Completed dependency
analysis for RDF frameworks and excluded various tools from
consideration.
https://svn.mkdoc.com/mksearch/doc/licence/index.htm
Identified James Clark's XT as a suitable XSLT processor, should one
be required. Confirmed that its licence is compatible with the GNU GPL.
http://www.blnz.com/xt/index.html
Thursday
--------
Completed dependency analysis for Web spiders and excluded various
tools from consideration. Did some further test runs of JUnit under
GIJ and identified a known compatibility bug with the Sun Java
interpreter: running GCJ-compiled code that includes inner classes
requires version 1.4 or above of the Sun interpreter.
I felt I needed to get a better grasp of the terms used in the
project proposal, so I updated the high-level schematic with
relations between the descriptive components -- Spider, Indexer,
Checker, etc. -- and the software components. Saved a print
resolution PNG and initial notes:
https://svn.mkdoc.com/mksearch/doc/design/MKSearch%20high%20level%20schematic%20v0.2.png
http://www.mksearch.mkdoc.org/howto/system-components/