Crawl Anywhere 3.0.0 available

Version 3.0.0 of Crawl Anywhere is now available. This major release includes many powerful enhancements. The main ones are listed below.

Crawler

Source import / export

Clear action

Crawl log history

Sitemaps support

Snacktory HTML page cleaning option

Custom metadata (global per web site, or specific to a web site's URLs based on regex rules)

HTTP / HTTPS protocol strategy
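
To illustrate the custom metadata item above, here is a minimal sketch of how site-wide metadata could be combined with regex rules on URLs. It is not Crawl Anywhere's actual configuration format or code: the names `GLOBAL_METADATA`, `URL_RULES` and `metadata_for_url`, the example URLs, and the rule that later matches override earlier ones are all assumptions made for the example.

```python
import re

# Hypothetical rule set for illustration only; this is not
# Crawl Anywhere's real configuration format.
GLOBAL_METADATA = {"site": "example"}
URL_RULES = [
    (re.compile(r"^https?://[^/]+/blog/"), {"section": "blog"}),
    (re.compile(r"\.pdf$"), {"format": "pdf"}),
]


def metadata_for_url(url, global_metadata=GLOBAL_METADATA, rules=URL_RULES):
    """Return the site-wide metadata merged with every matching URL rule."""
    merged = dict(global_metadata)  # start from the global values
    for pattern, extra in rules:
        if pattern.search(url):
            merged.update(extra)  # assumption: later rules override earlier ones
    return merged
```

With these sample rules, a URL under `/blog/` that ends in `.pdf` would receive the global value plus both rule-specific values, while a URL matching no rule keeps only the global metadata.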

Pipeline

Snacktory HTML page cleaning

Solr Indexer

Regex field mapping rules

Solr 4.0.0 integration

A preconfigured Solr 4.0.0 instance, patched for the multilingual analyzer, is now provided

Multilingual analyzer

A new version of the multilingual analyzer is available. This version allows detailed configuration for each language. The configuration syntax is the same as in the Solr schema.xml file (see the default configuration file: multilingual.xml).

This analyzer only works with our patched version of Solr 4.0.0.

Tag cloud generator

A tag cloud analyzer is available. This analyzer extracts n-term expressions (from document titles by default). You can display a dynamic tag cloud (by language, for a given period) in your search interface.
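
As a rough idea of what extracting n-term expressions from titles can look like, here is a small sketch. It is not the analyzer shipped with Crawl Anywhere: the function names, the simple word tokenization, and the reading of "n-term expression" as a run of n consecutive words are assumptions made for the example.

```python
import re
from collections import Counter


def n_term_expressions(title, n=2):
    """Split a title into lowercase words and return every run of n consecutive words."""
    words = re.findall(r"\w+", title.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def tag_cloud(titles, n=2, top=10):
    """Count n-term expressions across titles and return the most frequent ones."""
    counts = Counter()
    for title in titles:
        counts.update(n_term_expressions(title, n))
    return counts.most_common(top)
```

A search interface could call something like `tag_cloud` on the titles matching a language or date filter and size each tag according to its count.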
