[MKSearch-dev] Tomcat on FC4

Phil Shaw phil at mkdoc.com
Tue Sep 6 16:55:48 BST 2005


On 31 Aug 2005, at 18:45, Phil Shaw wrote:

> On 31 Aug 2005, at 16:35, Chris Croome wrote:
> 
> > Compiling JTidy...
> > /usr/local/mksearch/lib-src/jtidy/org/w3c/tidy/Lexer.java:1278: error: Unrecognized character for encoding 'UTF-8'.
> >                    if (doctype != null)// #473490 - fix by Björn Höhrmann 10 Oct 01
> >                                                              ^
> > /usr/local/mksearch/lib-src/jtidy/org/w3c/tidy/Lexer.java:132: error: Type ‘Lexer.W3CVersionInfo’ not found in declaration of field ‘W3CVERSION’.
> >        private static final Lexer.W3CVersionInfo[] W3CVERSION = {
> >                             ^
> > 2 errors

Chris, 

I believe this is fixed now: I added an explicit Latin-1 
(ISO-8859-1) encoding argument to the compiler.
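
As a rough illustration of the encoding problem, the sketch below writes 
a comment containing Latin-1 bytes like the one in Lexer.java and checks 
it with iconv. The temporary file and the build paths in the trailing 
comments are placeholders. The compiler lines show the standard encoding 
options for gcj and javac, not necessarily the exact change made to the 
MKSearch build.

```shell
# Write a Java comment containing Latin-1 bytes (octal 366 = 0xF6, o-umlaut).
tmp=$(mktemp)
printf '// fix by Bj\366rn H\366hrmann\n' > "$tmp"

# The bytes are not valid UTF-8, so a compiler defaulting to UTF-8 rejects them.
if iconv -f UTF-8 -t UTF-8 "$tmp" >/dev/null 2>&1; then utf8_ok=yes; else utf8_ok=no; fi

# They decode cleanly once the encoding is declared as ISO-8859-1.
if iconv -f ISO-8859-1 -t UTF-8 "$tmp" >/dev/null 2>&1; then latin1_ok=yes; else latin1_ok=no; fi

echo "utf8=$utf8_ok latin1=$latin1_ok"
rm -f "$tmp"

# Corresponding compiler options (source and output paths are placeholders):
#   gcj -C --encoding=ISO-8859-1 -d build org/w3c/tidy/Lexer.java
#   javac -encoding ISO-8859-1 -d build org/w3c/tidy/Lexer.java
```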

This update to JTidy required an update of GNU JAXP to version 
1.3. This snapshot was taken just before it was merged with the 
main classpath project for GCJ 4, as packaged in FC4.

The 1.3 version of JAXP contained a number of errors, but I have 
taken out the offending packages. The same has been done for the 
classpath version, so I suppose it's a known issue.

All library packages and MKSearch itself now compile under GCJ 
3.3.4 on Cygwin and Sun JDK 1.4 on Windows. I have just indexed 
the main MKSearch site and it seems to run much faster than 
before; this may be down to improvements in JTidy.

The site has 2340 URLs. I used 5 spider threads with a throttle of 
500 ms. It parsed 1330 documents in just under an hour and 
produced about 2.5 MB of valid RDF/XML.
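
For a rough sense of scale, the figures above can be turned into a 
throughput estimate. This is only a back-of-the-envelope sketch: it 
assumes the run took a full 3600 seconds (it was slightly less, so the 
rate is a lower bound) and treats 2.5 MB as 2,500,000 bytes.

```shell
docs=1330        # documents parsed
seconds=3600     # assumed upper bound for "just under an hour"
bytes=2500000    # assumed size of the RDF/XML output (2.5 MB, decimal)

# Documents parsed per second, to two decimal places.
rate=$(awk -v d="$docs" -v s="$seconds" 'BEGIN { printf "%.2f", d / s }')

# Average RDF/XML output per document, in bytes.
per_doc=$(awk -v b="$bytes" -v d="$docs" 'BEGIN { printf "%.0f", b / d }')

echo "about $rate documents/s, about $per_doc bytes of RDF/XML per document"
```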

I'll test this on FC3 next. If you would like to have a go on FC4, 
please do. When it comes to using the $mk_home/bin/gij-jspider.sh 
script, you may have to remove the reference to our version of 
GNU JAXP to avoid conflicts with the merged classpath version. If 
so, take this segment out of the line with the gij command, 
including the colon separator:

:$mk_home/lib/gnu-jaxp.jar
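
If you would rather not edit the script by hand, the removal could be 
done with sed, roughly as sketched below. The sample gij line is a 
hypothetical stand-in (mksearch.jar and jtidy.jar are made-up names); 
the real classpath in gij-jspider.sh will differ.

```shell
# Hypothetical stand-in for the gij line in gij-jspider.sh.
line='gij -classpath $mk_home/lib/mksearch.jar:$mk_home/lib/gnu-jaxp.jar:$mk_home/lib/jtidy.jar "$@"'

# Delete the literal segment, including its leading colon.
# The dollar sign and dot are escaped so sed matches them literally.
fixed=$(printf '%s\n' "$line" | sed 's|:\$mk_home/lib/gnu-jaxp\.jar||')
printf '%s\n' "$fixed"

# To apply the same substitution to the script itself, keeping a backup copy:
#   sed -i.bak 's|:\$mk_home/lib/gnu-jaxp\.jar||' "$mk_home/bin/gij-jspider.sh"
```

Check the result against the .bak copy before running the spider.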

Best regards,

Phil

--
MKSearch (alpha)

http://www.mksearch.mkdoc.org/

Free, open source metadata search engine with RDF storage and query.


