DataparkSearch versions of 2005


Latest versions.
02 Dec 2005: 4.35, 1,988,949 bytes, 23.03.2007, 03:33 MSK
The Summary Extraction Algorithm (SEA) has been added.
A possible coredump has been fixed in robots.txt processing when an incorrect value is specified in the Content-Encoding header.
The robots table has been added to cache robots.txt data for a period specified by RobotsPeriod command.
Some indexing speed improvements were made.
A new wtime column has been added to the qtrack table to store the time spent on a search, in milliseconds. When upgrading, you need to add this column (e.g. using the ALTER TABLE command) or recreate the qtrack table.
Syntax error has been fixed in db creation script for MySQL.
Subnet command processing has been fixed for CIDR network format.
A memory leak has been fixed in construction of all word forms using ispell data.
More accurate phrase segmenting has been implemented for queries in UTF-8 charset.
Language maps were added for several languages and UTF-8 charset.
Search query segmenting has been fixed for UTF-8 BrowserCharset.
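
For the qtrack upgrade mentioned in the 4.35 notes above, the following is a minimal sketch of adding the wtime column with ALTER TABLE. It uses Python's built-in sqlite3 module against an in-memory stand-in table. The stand-in qtrack columns and the INTEGER type chosen for wtime are assumptions made for illustration, not the actual DataparkSearch schema, so check the creation script for your database before applying anything similar.

```python
import sqlite3

def add_wtime_column(conn):
    """Add a wtime column to qtrack unless it is already present."""
    cols = [row[1] for row in conn.execute("PRAGMA table_info(qtrack)")]
    if "wtime" not in cols:
        # Assumed type: an integer number of milliseconds.
        conn.execute("ALTER TABLE qtrack ADD COLUMN wtime INTEGER DEFAULT 0")
    return [row[1] for row in conn.execute("PRAGMA table_info(qtrack)")]

# Stand-in table (hypothetical columns); the real qtrack has its own layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE qtrack (qwords TEXT, found INTEGER)")
columns = add_wtime_column(conn)
print(columns)
```

The helper checks for the column first, so running it twice against the same database is harmless.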
31 Oct 2005: 4.34, 2,042,474 bytes, 23.03.2007, 03:29 MSK
Phrase segmenting has been fixed for mixed Western and Chinese, Korean, Thai writings.
A new switch -d for indexer has been added. Use it to sort indexing targets by Popularity Rank.
The ExpireAt command has been added to specify the exact time of document expiration.
The support for Crawl-delay command in robots.txt file has been added.
Internal text/xml parser has been rewritten, libexpat library isn't required anymore.
HTDBText command has been added for htdb:/ virtual scheme.
Word segmenter has been improved for Chinese, Korean and Thai.
Undefined reference to dps_memmove has been fixed.
A trap on empty search phrase has been fixed.
Some speed improvements have been made for the full relevancy calculation.
A rare coredump on language map update has been fixed.
Counting of non-uniform word distribution has been fixed in relevancy calculation.
Character set for usage with MeCab has been fixed.
The support for paranoia stack checking has been enhanced for compilation with optimization.
Several bugs were fixed.
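
To illustrate the Crawl-delay directive mentioned in the 4.34 notes above, here is a small sketch that reads the value from a robots.txt file. It uses Python's standard urllib.robotparser module, not DataparkSearch's own parser, and the robots.txt content and the ExampleBot agent name are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay in seconds requested between successive fetches, or None if absent.
delay = parser.crawl_delay("ExampleBot")
allowed = parser.can_fetch("ExampleBot", "http://example.com/index.html")
blocked = parser.can_fetch("ExampleBot", "http://example.com/private/a.html")
print(delay, allowed, blocked)
```

A crawler honouring the directive would wait at least that many seconds between requests to the same host.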
16 Sep 2005: 4.33, 1,999,274 bytes, 23.03.2007, 03:26 MSK
OpenSearch 1.0 template has been added.
The ANYWORD (or '*') operator has been added for boolean search mode. This operator evaluates to true if both words have any word between them.
Search words highlighting has been fixed for $^(x) template variables.
Excerpts construction has been improved.
Excerpts recoding has been fixed for the case when neither stored nor "DoStore yes" is used.
Processing of ExcerptSize and ExcerptPadding commands has been fixed for searchd connection.
Optional parameter &charset has been added for DBAddr command.
Minor memory leak has been fixed for the Neo PopRank calculation.
Automatic spelling correction has been added for indexing words. Use "AspellExtensions yes" command to enable. Aspell is required.
Automatic spelling correction for search terms has been added. You need to install aspell on your system.
Recoding of search query has been fixed for searchd connection.
The "near" search mode has been added. It's equal to the "all" mode, but finds documents where all search terms are within 16 words of each other.
The NEAR operator has been added for boolean search mode. This operator evaluates to true if both words are within 16 words of each other.
GrBeg and GrEnd search template commands were added. Use these commands to highlight consecutive results when Google-like results grouping is enabled.
Possible indexer trap with IDN support enabled has been fixed.
The value of $(PerSite) has been fixed for cached search results.
The support for libares library has been added.
Several bugs (including #168) were fixed.
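
The NEAR operator and the "near" mode from the 4.33 notes above both rely on a word-distance test. The sketch below shows one way such a test could look. It assumes whitespace tokenization and reads "within 16 words" as a position difference of at most 16; the function name and both assumptions are illustrative and do not describe the actual implementation.

```python
def within_distance(text, first, second, max_distance=16):
    """Return True if `first` and `second` occur within max_distance words."""
    words = text.lower().split()
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    # Any pair of occurrences that is close enough satisfies the test.
    return any(abs(i - j) <= max_distance for i in first_pos for j in second_pos)

near_doc = "search " + "filler " * 10 + "engine"   # words are 11 positions apart
far_doc = "search " + "filler " * 30 + "engine"    # words are 31 positions apart
print(within_distance(near_doc, "search", "engine"))
print(within_distance(far_doc, "search", "engine"))
```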
30 Jul 2005: 4.32, 1,978,990 bytes, 23.03.2007, 03:23 MSK
Synonyms searching has been fixed to produce complete list of synonyms.
Processing of NOT boolean operator has been fixed for the case when no documents are found to delete.
The algorithm of full relevancy has been modified to run faster and to give correct values when a document has a large number of sections.
Language and charset guesser has been tuned for the case when contradictory data is specified in server headers and meta tags.
The --with-bestavgpos switch for configure has been renamed to --with-bestpos.
Processing has been fixed for complex search query with acronyms and stopwords.
The dps_config script has been fixed to include MeCab related flags.
A possible trap has been fixed for search request with unclosed included phrase.
The Subnet command can now accept a subnet in the forms: a.b.c.d/m, a.b.c, a.b, a
robots.txt processing has been fixed for the case when the "*" User-Agent section is divided into two or more parts.
An unexpected exit of indexer has been fixed for cache dbmode when cached is not used.
The $(FancySize) meta-variable has been added for search templates. It shows the document size in bytes, kilobytes or megabytes, whichever fits best.
Google-like results grouping has been added, use --enable-googlegrp option for configure to enable.
A possible trap has been fixed for the case when a phrase is specified inside a search query.
A possible trap of search.cgi has been fixed for the case when the Locale command is used.
Several bugs (#164) were fixed.
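
The Subnet forms listed in the 4.32 notes above (a.b.c.d/m, a.b.c, a.b, a) can be pictured as prefix matches. The sketch below expands each form into an IPv4 network with Python's ipaddress module. Treating a.b.c as a /24, a.b as a /16 and a as a /8 is an assumption about the intended meaning, and the function names are hypothetical.

```python
import ipaddress

def subnet_to_network(spec):
    """Expand a.b.c.d/m, a.b.c, a.b or a into an IPv4 network (assumed semantics)."""
    if "/" in spec:
        # Full CIDR form; strict=False tolerates host bits being set.
        return ipaddress.ip_network(spec, strict=False)
    octets = spec.split(".")
    if not 1 <= len(octets) <= 4:
        raise ValueError(f"unsupported subnet form: {spec!r}")
    prefix = 8 * len(octets)                      # one octet -> /8, two -> /16, ...
    padded = octets + ["0"] * (4 - len(octets))   # fill missing octets with zeros
    return ipaddress.ip_network(f"{'.'.join(padded)}/{prefix}")

def in_subnet(address, spec):
    """Check whether an IPv4 address falls inside the subnet described by spec."""
    return ipaddress.ip_address(address) in subnet_to_network(spec)

print(subnet_to_network("192.168.1"))
print(in_subnet("10.9.8.7", "10"))
```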
17 Jun 2005: 4.31, 1,972,056 bytes, 23.03.2007, 03:21 MSK
Crosswords searching has been restored for sql-based dbmodes.
robots.txt processing has been fixed for gzip and deflate content encodings.
A possible trap has been fixed in boolean search mode.
Default comparison type has been fixed for ServerDB and SubnetDB commands.
Unicode data has been updated to 4.1.0 version.
Search words highlighting has been fixed in cached copy displaying.
More economic memory allocation has been implemented for indexing.
Several bugs were fixed.
31 May 2005: 4.30, 1,964,114 bytes, 23.03.2007, 03:18 MSK
The PopRankPostpone command has been added. Use it to skip the Neo PopRank calculation at indexing.
Fuzzy searching based on acronyms and abbreviations has been added.
The FlushServerTable command has been fixed.
A database creation script has been corrected for Oracle.
The Locale command has been added for search templates. Use it to change LC_ALL locale settings for search results.
Search query processing has been rewritten.
Missing page number calculation has been restored in mod_dpsearch.
Cached database checkup has been optimized for speed.
Indonesian language map has been added for ISO-8859-1 charset.
Several bugs were fixed.
08 Mar 2005: 4.29, 1,938,546 bytes, 23.03.2007, 03:16 MSK
Several large data files were excluded from distribution. You may download them from our site separately.
English, German and Polish synonym lists were added.
Thesaurus mode for synonyms files has been added.
A bug in section numbering for weighting at search time has been fixed.
$(WS) template variable has been added; it shows search word statistics in short form.
Possible memory leak has been fixed in cached when flushing empty buffer.
Persian language maps for ISIRI-3342 and UTF-8 charsets were added.
Error in processing of the -w option of splitter has been fixed.
mod_dpsearch.so installation error has been fixed for Apache 2.0.53.
Maori and Maltese language maps for ISO-8859-1 and UTF-8 charsets were added.
New switches for configure were added: --disable-reldistance, --disable-relposition, --disable-relwrdcount, --with-bestavgpos, --with-wrdcntfactor. Use these switches to tune relevancy calculation.
Threadsafe hostname resolving has been added for FreeBSD.
Support has been added for Google's anti comment spam initiative.
IndexIf, NoIndexIf commands can now be loaded from server table using ServerTable command.
Possible hang on Neo PopRank calculation has been fixed.
Several bugs (including #158) were fixed.
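
The 4.29 entry above about Google's anti comment spam initiative is taken here to refer to the rel="nofollow" link attribute announced in January 2005. The sketch below shows how a crawler could separate such links from ordinary ones using Python's standard html.parser module. The class name and sample page are hypothetical, and the notes do not say what the indexer does with these links.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect <a href> targets, keeping rel="nofollow" links apart."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel may hold several space-separated tokens.
        rel_tokens = (attr.get("rel") or "").lower().split()
        target = self.nofollow if "nofollow" in rel_tokens else self.followed
        target.append(href)

page = '<a href="/docs">Docs</a> <a rel="nofollow" href="http://spam.example/">x</a>'
collector = LinkCollector()
collector.feed(page)
print(collector.followed, collector.nofollow)
```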
17 Jan 2005: 4.28, 3,459,008 bytes, 23.03.2007, 03:13 MSK
Search word highlighting has been fixed for cached results.
Stored protocol has been changed. Please restart stored when upgrading.
TagIf and CategoryIf commands can now be loaded from server table using ServerTable command.
libidn detection has been fixed in configure.
Frequency dictionaries loading has been fixed in searchd.
A bug has been fixed in cached document displaying when stored is not used.
Relevancy calculation has been modified for queries with two or more words.
The URLCharset command has been added. Use it to specify character set only for arguments in Server, Realm or URL commands.
ServerDB, RealmDB, SubnetDB and URLDB commands were added. They work like the Server, Realm, Subnet and URL commands respectively, but take their arguments from a field of the specified SQL table.
Several bugs were fixed.

