2012.05.30, SJ

Switched from ctime to mtime in webui/model/quarantine/database.php

Added an extra index to clapfhistory.clapf table.

2012.05.23, SJ

Added a feature to mark incoming spoofed emails as spam.
Use it _ONLY_ if you should never receive any incoming
email with From: ....@yourdomainname.com. This might be the
case if your users send their outgoing emails bypassing clapf.

To do so specify your domain list in mydomains, and set
mydomains_from_outside_is_spam, eg.

mydomains=example.com,aaa.fu,yourotherdomainname.com
mydomains_from_outside_is_spam=1
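
The check can be pictured roughly like this. It is only a Python sketch, not
the clapf code: the function name is made up, it assumes the decision is based
on the domain part of the From: header, and the caller has to tell whether the
message arrived from outside.

```python
from email.utils import parseaddr


def is_spoofed_from(from_header, mydomains, from_outside=True):
    """Return True if a message from outside claims one of our own domains.

    from_header  -- raw value of the From: header
    mydomains    -- comma separated list, like the mydomains setting
    from_outside -- hypothetical flag: the message did not come from our users
    """
    # parseaddr() splits 'Name <user@domain>' into (name, address)
    _, address = parseaddr(from_header)
    domain = address.rpartition("@")[2].lower()
    own = {d.strip().lower() for d in mydomains.split(",") if d.strip()}
    return from_outside and domain in own
```

With mydomains=example.com,aaa.fu a message carrying "From: Jack <jack@example.com>"
would be flagged, while one from bob@elsewhere.org would not.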


2012.04.23, SJ

Added a fix for the case when the Received: line carrying the clapf id
contains more than one space.


2012.04.10, SJ

Mark the message as spam if it includes at least max_number_of_recipients_in_ham+1
recipients in a single smtp session - unless the sender (=mail from address) is
whitelisted.
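
A minimal Python sketch of the rule above, under the assumption that the
recipients of one smtp session are available as a list. The function name is
hypothetical; only max_number_of_recipients_in_ham comes from the config.

```python
def too_many_recipients(recipients, max_number_of_recipients_in_ham, sender_whitelisted):
    """Return True if the message should be marked as spam by the recipient limit."""
    # a whitelisted sender (=mail from address) is never punished by this rule
    if sender_whitelisted:
        return False
    # spam if there are at least max_number_of_recipients_in_ham+1 recipients
    return len(recipients) >= max_number_of_recipients_in_ham + 1
```

So with max_number_of_recipients_in_ham=2, a session with three recipients is
marked as spam, and a session with two recipients is not.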



2012.03.04, SJ

Released 0.4.7.2 as a bugfix and code clean up version.


2011.11.16, SJ

Code cleanup: removed boundary.c and updated list.c.

2011.11.13, SJ

Let spamdrop show locale settings in debug mode.

The smtp daemon will print its hostname before the ESMTP string.

2011.10.07, SJ

Hide the blacklist feature in the user/edit page
if it's disabled in config.php

2011.10.02, SJ

Fixed a typo in init.d/clapf-maillog.ubuntu.in

2011.09.05, SJ

Added qrunner.php to util/Makefile.in.

2011.09.02, SJ

Fixed a bug causing AD imported entries to be removed from the clapf database.
Let's say you have a user entry with dn='CN=jack,OU=noc,DC=aaa,DC=fu', and you
query 'DC=aaa,DC=fu'.

If you move this user to the sales department, ie. dn='CN=jack,OU=sales,DC=aaa,DC=fu'
the ldap_sync.php utility script will remove Jack. However if you run the script again,
it will reimport him.

The workaround does the same.

2011.09.01, SJ

Fixed a bug showing virus emails in the quarantine notification emails.

2011.08.30, SJ

Added an entry about the daily quarantine report to the Monitor/health check
webui page.

Do not send the daily quarantine report to list addresses, ie. to those addresses
that are in the t_quarantine_group table.

2011.08.29, SJ

Added a code snippet that creates per user directories.
(credits: Gabor Varadi)

2011.08.26, SJ

Added a timestamp range limit to the lineChartHamSpam() function.

2011.08.22, SJ

Accidentally removed the util/db_init.sh script. Salvaged.

2011.08.18, SJ

Fixed a bug in the encoded header line parser routine which caused a segfault
in case of an unusual encoding, eg. Subject: Ablakcsere =?windows-1250?Q?=E9s =E1rny=E9kol=F3 p=E1ly=E1zati p=E9nzen?=

2011.08.04, SJ

The clapf daemon checks whether the 'username' variable is valid,
and it will quit if it's not.

2011.08.02, SJ

Minor webui improvement.

Added postgresql support to clapf, credits go to Gabor Varadi. Thanks!

2011.08.01, SJ

Minor webui improvements.

2011.07.28, SJ

Minor model/quarantine/database.php fix.

Added the option to display the outgoing postfix queue status if there's any.

2011.07.18, SJ

Improved user listing in the webui.

Fixed a bug in the fixupEncodedHeaderLine function.

2011.07.15, SJ

syslog the decoded subject line rather than the raw encoded one.

Enhanced the history: added back the ham/spam select feature, and
you can specify even accented characters.

2011.07.13, SJ

Heavily changed the history database indexing for better performance.

2011.07.01, SJ

Minor quarantine fix: after removing an item, the webui will take you back
without putting your username to the url.

2011.06.30, SJ

Modified clapf to put virus infected messages to the quarantine of the given
user. Only the administrators can see these messages in the webui, and only
they can release it.

2011.06.27, SJ

Obsoleted the spam message counting cron entry.

2011.06.26, SJ

Minor webui chart fixes.

2011.06.23, SJ

Added a patch to mailgraph to support postscreen and clapf.

(credits: Gabri Mate)

2011.06.22, SJ

Modified the invocation of the mpstat command to set the LC_ALL environment
variable, so its output looks the same regardless of the system locale.

(credits: Gabri Mate)

Fixed a bug during masstraining. PHP doesn't like any variable having a dot (.)
character in its name, and it converts it to an underscore (_). If you have
usernames having a dot character it will break the mass training process.
The fix was to replace it by an asterisk (*).

(credits: Gabri Mate)
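
The idea of the fix can be sketched as a simple round trip. This is Python for
illustration only (the webui is PHP), the helper names are made up, and it
assumes usernames never contain an asterisk themselves.

```python
def encode_field_name(username):
    """Make a username safe to use as part of a form variable name.

    PHP turns a dot in an incoming variable name into an underscore,
    so the dot is replaced by an asterisk before the form is rendered.
    """
    return username.replace(".", "*")


def decode_field_name(field_name):
    """Restore the original username after the form has been submitted."""
    return field_name.replace("*", ".")
```

An underscore could not be used for this, because after the conversion
"john.doe" and "john_doe" would be indistinguishable.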

2011.06.17, SJ

Modified the log collector script to fake a clapf entry if the sender was
caught by an RBL list.

2011.06.15, SJ

Improved the generic init scripts.

Added maillog.pl status to the health monitor.



Released 0.4.7

2011.06.10, SJ

Added distribution list support for openldap using memberdn and memberaddr
attributes.

2011.06.08, SJ

Added support for AD distribution lists.

2011.06.06, SJ

Added a 'read-only' admin feature to be able to view the "Monitor"
section in the webui.

2011.05.31, SJ

Clarified some webui variables, eg. SMTP_HOST, SMTP_PORT, ...

2011.05.30, SJ

Added a global quarantine feature where administrators are able
to view the whole quarantine. This feature obsoletes the old
per user quarantine.sdb files.

2011.05.28, SJ

Added a global quarantine feature for administrators.

Webui enhancements.

2011.05.27, SJ

Added 'gid' support for the sqlite3 database backend.


2011.05.26, SJ

Several bugfixes, improvements.

2011.05.24, SJ

Improved the AD synchronisation utility to support multiple email
addresses, and omit X400 entries.

Changed the LDAP attribute name from proxyAddresses to proxyaddresses.

Fixed a bug in the webui that prevented users from logging in with a password
containing an '&' character.

The maillog.pl script now correctly grabs the client name and IP-address
if the sender was rejected by a blacklist.



0.4.6:

2011.05.22, SJ

Released 0.4.6

2011.05.21, SJ

Modified the maillog collector script so that (in case of mysql) it
puts data into table partitions by day, and purges old partitions
by itself, ie. you don't need a cron job to do so.

If you are using sqlite3 history table then you _need_ a cron job
to purge aged history data.

Please note that the old history database is not compatible with the
new log entries and the maillog collector script, so be sure to drop
it, and recreate it from scratch.

2011.05.20, SJ

Bugfixes.

(credits: Gabor Varadi)

Simplified history feature.

2011.05.18, SJ

Fixed a nasty training bug causing somewhat biased counters.

2011.05.17, SJ

Revised the logging of clapf to get a more efficient history.
You should also extend the clapf table in the history database:

alter table clapf add column `size` int default 0;
alter table clapf add column `from` char(64) not null;
alter table clapf add column `fromdomain` char(64) not null;
alter table clapf add column clapf_id char(32) not null;

Additionally, do not forget to update maillog.pl!

Changed the versioning of clapf for the nightly builds: it will be
"nightly-YYYYMMDD" without the version number of the closest release.

2011.05.15, SJ

Fixed a bug in spamdrop causing segfault due to an uninitialised
variable.

2011.05.14, SJ

Simplified the quarantine menu when viewing a message. Removed the "train"
and the toggle links to view the message in raw format.

2011.05.13, SJ

Extended the username field 32->64 characters.

Improved init scripts.

Extended the history/summary view.

Execute the following commands:

drop view summary;
create view summary as select distinct clapf.subject, clapf.queue_id, clapf.ts, smtpd.client_ip, qmgr.`from`, qmgr.`from_domain`, qmgr.`size`, smtp.`to_domain`, smtp.`to`, clapf.result, clapf.rcptdomain from smtp, smtpd, qmgr, clapf where smtp.clapf_id=clapf.queue_id and smtpd.queue_id=smtp.queue_id and qmgr.queue_id=smtp.queue_id;

Webui improvements.

2011.05.12, SJ

Fixed the ubuntu clapf init script, and removed the "reload"
section. It produced an error under debian 6.

Renamed the modified postgrey daemon to clapf-postgrey.

Cron fixes.

Health monitor fixes.


2011.05.08, SJ

Fixed a nasty training issue.

Description:

If you have less than 1000 ham emails or less than 1000
spam emails in the token database and the initial_1000_learning
variable is set to 1 in clapf.conf (it is by default), then
checking a single email with spamdrop (eg. spamdrop -D < email)
will learn the email either as a ham or as a spam (it depends
on the original spamicity of the email).
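
The initial training decision after the fix can be sketched in Python as
follows. It mirrors the patched condition in src/spamdrop.c shown in the patch
further down. The value 1000 for the constant is an assumption taken from the
description above, and the function name is made up.

```python
# assumed value, implied by the name of the option and by the description
NUMBER_OF_INITIAL_1000_MESSAGES_TO_BE_LEARNED = 1000


def initial_training_applies(initial_1000_learning, debug, nham, nspam):
    """Return True if the checked message should be learned automatically.

    nham, nspam -- number of ham and spam messages in the token database
    debug       -- 1 if spamdrop was started with -D, otherwise 0
    """
    limit = NUMBER_OF_INITIAL_1000_MESSAGES_TO_BE_LEARNED
    # the 'debug == 0' part is what the fix added
    return initial_1000_learning == 1 and debug == 0 and (nham < limit or nspam < limit)
```

Before the fix the debug flag was not part of the condition, so a plain
"spamdrop -D < email" on a young token database trained the message.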

To determine whether you are affected:

- you are not affected if you have at least 1000 ham and 1000 spam
emails in your token database, AND/OR

- you are running version >= 0.4.6-stable or >= nightly-20110508, AND/OR

- you have never used "spamdrop -D"

Fix:

- upgrade to the latest nightly build (20110508), OR

- apply the following patches:

--- a/src/spamdrop.c
+++ b/src/spamdrop.c
@@ -684,7 +684,7 @@
 
       if(
          (blackhole_request == 0 && debug == 0 && cfg.training_mode == T_TUM && ( (spaminess >= cfg.spam_overall_limit && spaminess < 0.99) || (spaminess < cfg.max_ham_spamicity && spaminess > 0.1) )) ||
-         (cfg.initial_1000_learning == 1 && (sdata.Nham < NUMBER_OF_INITIAL_1000_MESSAGES_TO_BE_LEARNED || sdata.Nspam < NUMBER_OF_INITIAL_1000_MESSAGES_TO_BE_LEARNED))
+         (cfg.initial_1000_learning == 1 && debug == 0 && (sdata.Nham < NUMBER_OF_INITIAL_1000_MESSAGES_TO_BE_LEARNED || sdata.Nspam < NUMBER_OF_INITIAL_1000_MESSAGES_TO_BE_LEARNED))
         )
       {
 
@@ -707,7 +707,7 @@
    /* if this is a blackhole request and spaminess < 0.99, then learn the message in an iterative loop */
 
 
-   if(blackhole_request == 1){
+   if(blackhole_request == 1 && debug == 0){
       if(spaminess < 0.99){
          rounds = MAX_ITERATIVE_TRAIN_LOOPS;
 
 

--- a/src/antispam.c
+++ b/src/antispam.c
@@ -240,6 +245,9 @@
 
       if(sdata->spaminess > 0.9999) snprintf(reason, SMALLBUFSIZE-1, "%s%s\r\n", cfg->clapf_header_field, MSG_ABSOLUTELY_SPAM);
 
+      /* discard any training in debug mode */
+      if(cfg->debug == 1) goto END_OF_TRAINING;
+

 

Workaround:
- set initial_1000_learning=0 in clapf.conf, OR
- never use "spamdrop -D" until you get at least 1000 ham and 1000 spam emails in your token database.


Credits: Tamas Papp


Removed the t_stat table. It's no longer used.

2011.05.07, SJ

spamdrop (in debug mode) will print out the nham and nspam
counters of each token in addition to the spamicity and
deviation values.

clapf will syslog the fact if the sender is found on mynetwork.

2011.05.05, SJ

Enhanced the parser to skip <style>blabla</style>. Spammers
usually use this 'feature' to hide word salad, or disclaimers
from legitimate emails.

2011.05.03, SJ

Released 0.4.6-rc3.

2011.04.29, SJ

Modified the parser to handle the Content-* headers better.
The modification will introduce new tokens, as well as discard
some old tokens, so this may affect your accuracy slightly.

2011.04.28, SJ

Added ESET support to clapf. To enable it, use the --enable-eset
configure option.

Please note that you need a fully working ESET Mail Security product
installed. Other ESET products are currently not supported.

To enable the clapf + ESET cooperation, configure ESET daemon as usual,
ie. as an smtp service (eg. 127.0.0.1:2525).

Then configure postfix to pass emails to ESET (and not clapf!).

Configure ESET to pass emails to clapf (eg. 127.0.0.1:10025), and
finally clapf will pass the email back to postfix.

To sum it up:

postfix -> ESET (127.0.0.1:2525) -> clapf (127.0.0.1:10025) -> postfix (127.0.0.1:10026)

The ESET daemon needs to be configured the following way (esets.cfg):

##################
av_clean_mode = "none"
action_av_infected = "accept"
av_eml_header_modification_mask = "clean:cleaned:deleted:infected:notscanned"
av_eml_header_template = "%avstatus% %virus%"

listen_addr = "127.0.0.1"
listen_port = 2525
server_addr = "127.0.0.1"
server_port = 10025
##################

The config above will tell ESET to pass all virus infected emails, after
putting "X-EsetResult: infected virusname" to the mail header. Clapf
checks for this line, and takes the appropriate action on it.


2011.04.22, SJ

Parser fix to correctly handle multiline header fields.

2011.04.19, SJ

Fixed a bug in the webui affecting domain admins listing users.

Minor fix in the white-/blacklist query function.

Minor change in the logging of clapf. It will syslog "status=dropped"
if the message was discarded by clapf.

If the sender email address is found on the blacklist, then clapf will syslog
'SPAM' instead of 'HAM'.

2011.04.18, SJ

Added an Outlook macro to automate the training, and make it very easy.
This will create two extra buttons on the menubar, so retraining is
a matter of a single click.

You can find it in the contrib/plugin/outlook directory. See the
README file for deployment.


2011.04.17, SJ

Fixed a bug that prevented clapf from sending virus notification emails.

2011.04.16, SJ

Fixed logging in src/avir.c

2011.04.13, SJ

Enhanced quarantine report with i18n. Please look at
webui/languages/*/quarantine-daily-digest.tpl

2011.04.12, SJ

Minor fix in memcached.c

Fixed a bug causing a segfault in certain cases if you are using libclamav.

2011.04.11, SJ

Modified the mynetwork feature: the "127.$" is no longer added explicitly.
If you need it, you have to add it manually, eg.
mynetwork=127.$,11.22.33.44,....

2011.04.05, SJ

If postfix does not send any XFORWARD info (ie. it's disabled in main.cf OR
the email was submitted locally without the smtpd service), the client_addr
variable will be set to 'null' (ie. a 4 letter string).

So you can set the following to skip antispam tests for emails sent locally:
mynetwork=null

Beware, that exim has no 'XFORWARD' feature, so you should _NOT_ define 'null'
in your mynetwork.

Added the mynetwork feature to the counters, so be sure to extend the t_counters
table, then flush the clapf related memcached entries.

alter table t_counters add column mynetwork bigint default 0;

2011.04.04, SJ

Changed the default "mynetwork" feature to contain only the local
network (127.0.0.0/8).

2011.04.02, SJ

Added the mynetwork feature to prevent clapf running spam check on
emails coming from certain IP-addresses.

The mynetwork variable is a comma separated list similar to skipped_received_ips.

Let's say, your postfix box running clapf relays emails coming from your LAN
(11.22.33.0/24), and you don't want clapf to spamcheck these emails.
All you have to do is to specify your LAN, eg. "mynetwork=11.22.33."
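
A rough Python sketch of how such a list could be matched against the client
IP-address. It assumes plain prefix matching, which is what the "11.22.33."
example suggests. The function name is made up, and list items ending with a
"$" character are not modelled here.

```python
def on_mynetwork(client_ip, mynetwork):
    """Return True if client_ip starts with any item of the mynetwork list."""
    prefixes = [p.strip() for p in mynetwork.split(",") if p.strip()]
    return any(client_ip.startswith(p) for p in prefixes)
```

Note the trailing dot in "11.22.33.": with prefix matching, an item like
"11.22.3" would also cover 11.22.30.x and so on.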

If the client IP-address matches "mynetwork", clapf will add
"<clapf-header-field>: mynetwork" to the message header.

Note 1: clapf checks for viruses even in emails coming from "mynetwork".
Note 2: the following addresses are explicitly on "mynetwork": 127.,192.168.,172.16.,10.

To use this feature, you have to activate XFORWARD in postfix, ie.
/etc/postfix/main.cf:

lmtp_send_xforward_command = yes
smtp_send_xforward_command = yes
smtpd_authorized_xforward_hosts = 127.0.0.0/8

2011.04.01, SJ

Added a health monitor feature to the webui. It displays the smtp status of
clapf and postfix. It also reveals the output of qshape. Qshape is a postfix
utility, see http://www.postfix.org/QSHAPE_README.html for details.

To enable the qshape stats in the health monitor, you have to create the
following cronjob for _root_ user:

*/5 * * * * PATH=$PATH:/usr/local/sbin /usr/local/bin/qshape > /var/lib/clapf/stat/active+incoming
*/5 * * * * PATH=$PATH:/usr/local/sbin /usr/local/bin/qshape -s > /var/lib/clapf/stat/active+incoming-sender
*/5 * * * * PATH=$PATH:/usr/local/sbin /usr/local/bin/qshape deferred > /var/lib/clapf/stat/deferred
*/5 * * * * PATH=$PATH:/usr/local/sbin /usr/local/bin/qshape -s deferred > /var/lib/clapf/stat/deferred-sender

The postqueue utility must be in the default path, otherwise you have to adjust the
PATH environment variable as above.



The setup utility is able to read data from previous config.php
in case of an upgrade.

Lowercase all email addresses and domains in the maillog collector script.

History fix. The webui form calls the loadHistory() function right after clicking
on the "Set" button.

2011.03.31, SJ

util/summary.php fix.

Added a new utility to send a daily quarantine report (util/quarantine-daily-report.php)
You can run it as php /usr/local/libexec/clapf/quarantine-daily-report.php /path/to/webui/installation

2011.03.30, SJ

Log the IP-address to the webui log.

2011.03.18, SJ

Webui setup enhancements.

2011.03.17, SJ

SQLite3 fix around white/blacklist query.

2011.02.24, SJ

Modified the webui to disable remote images. Add the following variables to the config.php:

define('REMOTE_IMAGE_REPLACEMENT', WEBUI_DIRECTORY . 'view/theme/default/images/remote.gif');
define('ENABLE_REMOTE_IMAGES', 0);

Added a new variable: days_to_retain_data. It defines how long to
retain data in the webui cache and quarantine directory (14 days
by default).

To use it, you have to modify the cron jobs, eg:

1 2 * * * find /var/lib/clapf/queue -type f  -atime +`/usr/local/sbin/clapfconf -q days_to_retain_data|cut -f2 -d'='` -exec rm -f '{}' \;
1 2 * * * find /srv/www/webui.yourdomain.com/cache -type f  -atime +`/usr/local/sbin/clapfconf -q days_to_retain_data|cut -f2 -d'='` -exec rm -f '{}' \;
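
For those who prefer a script to the find one-liners, here is a Python sketch
doing approximately the same. It is not part of clapf. It assumes that
"clapfconf -q days_to_retain_data" prints a "key=value" line (which is what the
cut command above relies on), and it compares seconds instead of the whole-day
rounding that find -atime uses.

```python
import os
import time

SECONDS_PER_DAY = 86400


def parse_days(clapfconf_line):
    """Take the value part of a 'key=value' line, like cut -f2 -d'=' does."""
    return int(clapfconf_line.strip().split("=", 1)[1])


def is_expired(atime, days, now):
    """Return True if the last access happened more than 'days' days ago."""
    return now - atime > days * SECONDS_PER_DAY


def purge(directory, days, now=None):
    """Remove regular files under 'directory' that were not accessed recently."""
    now = time.time() if now is None else now
    removed = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if is_expired(os.stat(path).st_atime, days, now):
                os.remove(path)
                removed.append(path)
    return removed
```

Eg. purge("/var/lib/clapf/queue", parse_days("days_to_retain_data=14")) would
remove the queue files that were last accessed more than 14 days ago.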


2011.02.23, SJ

Minor webui fix.

2011.01.26, SJ

Don't parse the uninteresting part of the Received: lines, ie.
the part after "by ".
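
To illustrate, a small Python sketch that keeps only the part before "by " and
picks the dotted hostnames and IP-addresses from it (the tokeniser uses only
these from the Received: lines, see the 2010.05.14 entry). The regular
expression and the function name are my assumptions, the real parser is
written in C and works differently in detail.

```python
import re

# a hostname having at least a single dot, or a dotted IP-address
DOTTED = re.compile(r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def received_tokens(received_value):
    """Return the hostnames/IP-addresses found before the ' by ' part."""
    head = received_value.split(" by ", 1)[0]
    seen = []
    for item in DOTTED.findall(head):
        item = item.lower()
        if item not in seen:  # keep the first occurrence only
            seen.append(item)
    return seen
```

For "from mail.example.org (mail.example.org [192.0.2.10]) by mx.aaa.fu (Postfix) ..."
this keeps mail.example.org and 192.0.2.10, and ignores everything after "by ".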

2011.01.18, SJ

Enhanced the parsing of the "Received: from" lines to properly
handle non-postfix MTAs.

Doubled the MAXVAL value, so the clapf.conf lines (=key+value)
can be up to 256 characters.

2010.11.23, SJ

Fixed a character set and collation issue in the webui.
If you see strange characters in the 'real name' column,
you may need to fix them:

alter table user change `realname` `realname` char(64) character set latin1;
alter table user change `realname` `realname` blob;
alter table user change `realname` `realname` char(64) character set utf8;
alter table user character set utf8;

alter table t_policy change `name` `name` char(128) character set latin1;
alter table t_policy change `name` `name` blob;
alter table t_policy change `name` `name` char(128) character set utf8;
alter table t_policy character set utf8;

2010.11.18, SJ

Updated the recvtimeout() function to honor the seconds value.

2010.11.06, SJ

Fixed a parser bug.



0.4.6-rc1:

2010.10.28, SJ

Fixed a typo around the octet-stream checking code.

2010.10.26, SJ

Webui enhancements.

2010.10.25, SJ

Parser modifications for a smarter HTML tokenisation.
It won't create chained tokens between HTML elements
and text any longer.

This modification may have a slight impact on your
accuracy in the short run, however it would definitely
improve accuracy in the long run.

2010.10.23, SJ

Added a check to see if an application/octet-stream attachment
is really binary. If not, then let's tokenize it.
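
One possible way to make such a decision, as a Python sketch. This is a common
heuristic and not necessarily what clapf does: the sample size, the threshold
and the function name are all assumptions.

```python
def looks_binary(data, sample_size=512, max_nonprintable_ratio=0.1):
    """Guess whether a decoded attachment (bytes) is binary.

    Only the first sample_size bytes are inspected.
    """
    sample = data[:sample_size]
    if not sample:
        return False
    # a NUL byte practically never occurs in a text attachment
    if b"\x00" in sample:
        return True
    # count control characters, except TAB, LF and CR
    nonprintable = sum(1 for b in sample if b < 32 and b not in (9, 10, 13))
    return nonprintable / len(sample) > max_nonprintable_ratio
```

Bytes above 127 are deliberately not counted, so text with accented
characters is still treated as text.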

2010.10.21, SJ

Added zombie to the counters.

2010.10.20, SJ

Added logging support to the webui.

2010.10.18, SJ

Fixed a minor issue around creating the spool directory during
install time.

Purged LDAP support from the clapf source tree.

2010.10.17, SJ

Removed LDAP support from the webui. However, it can import
users from Active Directory or openldap to its local database.

I also created a cli version of the LDAP sync tool (util/ldap_sync.php).
It should be used as "php /usr/local/libexec/clapf/ldap_sync.php /path/to/webui/"

You have to upgrade the webui, run setup, then edit config.php
and set LDAP_IMPORT_CONFIG_FILE. This variable should point to
a configuration file holding LDAP definitions. For your own
good, this file should NOT be accessible through the web.

The configuration file syntax is as follows:

ldaphost:basedn:binddn:bindpw:type:domain:gid:policy_group

eg.
localhost:ou=domain1_fu,dc=aaaa,dc=fu:cn=clapfadmin,dc=aaaa,dc=fu:thepassword:openldap:domain1.fu:3:0
localhost:ou=domain2_fu,dc=aaaa,dc=fu:cn=clapfadmin,dc=aaaa,dc=fu:thepassword:openldap:domain2.fu:5:0

'type' is either "AD" or "openldap"
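
The line format is easy to process. Here is a Python sketch of a parser for a
single line, for illustration only (ldap_sync.php is written in PHP). The class
and function names are made up, gid and policy_group are assumed to be
integers, and a password containing a colon is not handled.

```python
from typing import NamedTuple


class LdapSource(NamedTuple):
    ldaphost: str
    basedn: str
    binddn: str
    bindpw: str
    ldap_type: str
    domain: str
    gid: int
    policy_group: int


def parse_ldap_line(line):
    """Parse 'ldaphost:basedn:binddn:bindpw:type:domain:gid:policy_group'."""
    fields = line.strip().split(":")
    if len(fields) != 8:
        raise ValueError("expected 8 colon separated fields, got %d" % len(fields))
    host, basedn, binddn, bindpw, ldap_type, domain, gid, policy_group = fields
    if ldap_type not in ("AD", "openldap"):
        raise ValueError("type must be AD or openldap: %r" % ldap_type)
    return LdapSource(host, basedn, binddn, bindpw, ldap_type, domain,
                      int(gid), int(policy_group))
```

The commas and equal signs inside the dn values are harmless, because only the
colon is used as a separator.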

Hint: You should create an entry for all of your domains. Thus
you can manage all of your domain

2010.10.14, SJ

Tamas Papp introduced a training shell script useful for dovecot
installations.

Removed the "autocomplete=off" reference from the login page. It
caused a problem with Safari.

The webui won't remove any email from the queue directory, only
hides them. You have to use a cron job to purge aged messages.

The quarantine and the user listing in the webui can be sorted
and ordered by various fields.

2010.10.03, SJ

Minor webui fixes.

2010.09.20, SJ

Added LDAP support to the token group concept (see below).

Reverted the qmail.schema definition to its original form,
and moved all the clapf related user attributes to clapf-user.schema.
You need to reference this in your slapd.conf.

By default clapf makes gid the same as uid. You can change
this by adding a clapfgid attribute to the given user; make sure
to extend his objectClass to include 'clapfUser' as well, eg.

ldapmodify -x  -D "cn=Manager,dc=yourdomain,dc=com" -W -f upgrade.ldif

upgrade.ldif:

dn: cn=user1,ou=clapfusers,dc=yourdomain,dc=com
changetype: modify
replace: objectClass
objectClass: top
objectClass: person
objectClass: qmailUser
objectClass: qmailGroup
objectClass: clapfUser
-
add: clapfgid
clapfgid: 123

Added LDAP support to spamdrop, and fixed an LDAP related
#ifdef reference in the clapf daemon.


2010.09.17, SJ

Added token group support for the clapf daemon. (LDAP users,
please see a note at the end).

You have to extend the user table to add a new field, 'gid':

mysql> alter table user add column gid int unsigned not null;

Important! If you have memcached support enabled, then you have
to release/flush all clapf related entries. (The easiest way
to do this is to restart the memcached daemon).

If you are using a shared token database or an sqlite3
database for the tokens, then you are done; everyone else should read on.

Execute the 'contrib/db/add-gid-to-user-table.sql' script:

$ mysql -u clapf -p clapf < contrib/db/add-gid-to-user-table.sql

It contains a stored procedure to walk through your user table,
and sets the gid field to be the same as the uid field. From
now on, the gid field specifies the uid of the tokens to be used.
After executing it, clapf behaves just like 0.4.5.

If you are using a merged database, but you are not interested in
the 'token group feature' (see below), just make sure the gid field
is the same as the uid field, and you're done.


What is the purpose of the token groups concept? Let's say you are
a service provider with multiple domains. With the token groups
you can create a common token set for domain1, then a different
token set for domain2, etc.

To do this, set the gid to the same for all users in domain1, eg:
update user set gid=123 where uid in (1,2,3,4,5,6, ...);

and do this for the users in the domain2, eg:
update user set gid=124 where uid in (45,46,47,48, ...);

Then the token set for all users in domain1 can be "... WHERE (uid=0 OR uid=123)",
and it can be "... WHERE (uid=0 OR uid=124)" for all users in domain2.
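
The lookup can be tried out with a throwaway sqlite3 database. The Python
sketch below is only a demonstration of the "uid=0 OR uid=<gid>" idea: the
t_token table name and its columns are hypothetical, only the user table and
its uid/gid fields come from this entry.

```python
import sqlite3


def demo_db():
    """Create an in-memory database with two token groups (123 and 124)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user (uid INTEGER PRIMARY KEY, gid INTEGER NOT NULL);
        CREATE TABLE t_token (token TEXT NOT NULL, uid INTEGER NOT NULL);
        INSERT INTO user (uid, gid) VALUES (1, 123), (2, 123), (45, 124);
        INSERT INTO t_token (token, uid) VALUES
            ('shared', 0), ('domain1-word', 123), ('domain2-word', 124);
    """)
    return conn


def tokens_for_user(conn, uid):
    """Return the tokens visible to a user: the shared ones plus his group's."""
    gid = conn.execute("SELECT gid FROM user WHERE uid=?", (uid,)).fetchone()[0]
    rows = conn.execute(
        "SELECT token FROM t_token WHERE (uid=0 OR uid=?) ORDER BY token", (gid,))
    return [row[0] for row in rows]
```

Users 1 and 2 share the tokens of group 123, while user 45 gets the tokens of
group 124. The uid=0 tokens are visible to everybody.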


Notes for LDAP users: the attribute of the new 'gid' field is not
decided yet, so with --with-userdb=ldap configurations, clapf
simply sets gid=uid for now (ie. fallback to 0.4.5 behaviour).


0.4.5:

2010.09.12, SJ

Webui fixes for a wider (in length) user list.
(credits: Tamas Papp)

2010.09.09, SJ

Yet another spamc related fix.

2010.09.08, SJ

Moved the initSessionData() function back to src/session.c in
order to let antivirus only installations compile as well.

Fix to let the spamc emulation mode be compiled.

Minor webui fix.

2010.09.07, SJ

Minor webui enhancements.

2010.09.04, SJ

Fixed a bug calculating the time spent in the 'user' phase.

2010.09.03, SJ

The webui honors the MIN_PASSWORD_LENGTH variable when the administrator
wants to change a password.


0.4.5-rc3:

2010.08.27, SJ

The configure script would abort if you specified --with-userdb=ldap
and you have no ldap development packages installed.

Massive LDAP changes to support multiple OUs under LDAP_USER_BASEDN.

2010.08.24, SJ

LDAP related fix in the webui.

2010.08.23, SJ

Fixed a bug in the LDAP version of updateUser() function of the webui.

2010.08.22, SJ

Fixed a few bugs in the LDAP user handling in the webui.
(credits: Toldi Miklos)

2010.08.19, SJ

Fixed both the webui setup and the corresponding LDAP model to properly
handle the t_misc table if the user data are stored in LDAP.

2010.08.18, SJ

Let the regular users see their own statistics in the webui.

Fixed a bug in the webui setup that hides the fact that you have to
specify some database parameters in case of a mysql history.
(credits: Toldi Miklos)

2010.08.17, SJ

Modified the LDAP policy scheme to allow accented policy names.

Enhanced the maillog.pl script to allow you to define an arbitrary
socket path.
(credits: Toldi Miklos)

2010.08.16, SJ

Fixed some ubuntu init scripts.
(credits: Toldi Miklos)

Fixed the ldap auth script to use the user's dn instead of blindly
joining the cn to the base dn.
(credits: Toldi Miklos)

Added an option to the config.php (PASSWORD_CHANGE_ENABLED) to allow or
disallow the users to change their passwords.
(credits: Toldi Miklos)

Fixed a minor template bug.
(credits: Toldi Miklos)

2010.08.13, SJ

Fixed an ldap related confusion. The webui uses the blacklist entry,
however the clapf daemon was looking for the filterSender entry. From
now on the daemon also uses the blackList entry.

2010.08.12, SJ

Fixed a whitelist related bug causing clapf to crash introduced at 2010.08.10.

2010.08.10, SJ

Webui fixes. Added a "realname" field, so you should extend the
user table with the following statement:

alter table user add column realname char(32) default null;

Fixed a memcached bug.

Whitelist/blacklist entries can be put to memcached. Their sizes
are limited by MAXBUFSIZE, see config.h

2010.08.09, SJ

style.css fixes.

2010.08.08, SJ

Highlighted the selected rows in the quarantine.


0.4.5-rc2:


2010.07.29, SJ

Added a webui fix not to show empty (=0 byte) messages
in the quarantine view.

2010.07.22, SJ

Fixed the fixupHTML() function.

2010.07.21, SJ

Fixed a weird truncating problem at the statistical
aggregating function. Code was copied from the
bogofilter project.

Fixed the zombie regexp file to exclude web80006.mail.sp1.yahoo.com

2010.07.20, SJ

Fixed a bug truncating the state.hostname.

2010.07.19, SJ

Code refactoring.

2010.07.16, SJ

Introduced a html_tag array to define html tags to be skipped.
Currently we skip "<html>", "</html>", and "</body>". So we can get
rid of this ugly insanity of Word 11:

<html xmlns:v=3D"urn:schemas-microsoft-com:vml" =
xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns:x=3D"urn:schemas-microsoft-com:office:excel" =
xmlns:p=3D"urn:schemas-microsoft-com:office:powerpoint" =
xmlns:a=3D"urn:schemas-microsoft-com:office:access" =
.... >


2010.07.15, SJ

Modified the parser to skip html comments in a smarter way.

2010.07.13, SJ

From now on the parser does not tokenize the skipped Received: lines,
and includes the @ sign as part of a token, and excludes email
addresses from the token hash. We only tokenize the sending hostname
and IP-address, and skip the rest of the Received: tokens.

The parser skips email addresses from To: lines. The reasoning behind
this is that the recipient's (=your) email address is relatively the
same for both ham and spam messages. However the 'undisclosed-recipient'
is a very telling token, just as the names in the To: line.

2010.07.09, SJ

The following list is explicitly added to the skipped_received_ips
option: "127.,10.,192.168.,172.16.". The skipped_received_hosts
option is removed.

The percent (%) character is now part of the 'Subject*' tokens.

Modified the parser to produce even triplets from the Subject line, ie.
from the line "Subject: 80% Sale invitation for Henry" you will get eg.:

Subject*80%+sale+invitation
Subject*sale+invitation+for
Subject*invitation+for+henry

Also note that I changed the parser to simplify the Subject token pairs.
Until now you got eg. "Subject*sale+Subject*invitation", however from
now on it is reduced to only "Subject*sale+invitation". This change means that
you will lose your current token pairs in the Subject line, however I
expect that - at worst case - it will cause a slight and temporary
drop in the accuracy.
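
The new Subject tokens can be reproduced with a few lines of Python. This is
only a sketch: it assumes that the words are split on whitespace and
lowercased, the real parser is written in C and does more than this. The
function name is made up.

```python
def subject_tokens(subject, n):
    """Build the 'Subject*' tokens from n consecutive words of the subject."""
    words = subject.lower().split()
    return ["Subject*" + "+".join(words[i:i + n])
            for i in range(len(words) - n + 1)]
```

subject_tokens("80% Sale invitation for Henry", 3) gives the three triplets
listed above, and with n=2 you get the simplified pairs, eg.
"Subject*sale+invitation".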

Exclude www.w3.org from the URL* tokens.

2010.07.08, SJ

Faster training by mass creating new tokens in the SQL database.
Fixed a bug that was creating unnecessary URL* tokens from header elements.

2010.07.07, SJ

Statistics fix.

2010.07.01, SJ

Fixed a libclamav related compile issue.
(credits: Tamas Papp)

2010.06.30, SJ

Added a modified postgrey to contrib/zombie. See contrib/zombie/README
and http://clapf.acts.hu/wiki/doku.php/howto:catching-zombies-with-a-modified-postgrey
for details.

A lineChart cosmetic fix.

Added support for icc compiled mysql distributions. The configure script detects if you
have installed intel-icc9-libs-9.0-i386.tar.gz. You can get it at http://dev.mysql.com/downloads/os-linux.html
and activate it at compile time eg. LDFLAGS="-L/usr/local/lib/intel-icc9-libs-9.0-i386" ./configure ...

Updated the configure.in script, and the aclocal.m4 macro file.

2010.06.28, SJ

Automatically exclude 127.x.x.x, 192.168.x.x, 172.16.x.x, and 10.x.x.x
addresses in the isItemOnList() function.

2010.06.27, SJ

Fix ".." sequences.

Quoted-printable fix.

Submit only URLs with at least a single dot (.) to SURBL check.

Rewrote the mysql part of the training process.

2010.06.24, SJ

Updated the zombie regexp list.

2010.06.10, SJ

Fixed some make and compile time issues, and fixed the schema of the "summary" view.
Additionally fixed a bug in the chart rendering.
(credits: DRu)


0.4.5-rc1:

2010.06.08, SJ

Modified the webui menu structure with jquery to add a dropdown menu.
Also added a counter page where you can see, and even reset, the
counters both in memcached and in the database.

2010.05.27, SJ

Better token reassembly.
(src/parser_utils.c)

2010.05.26, SJ

PDF reports are possible with the stat/summary.php (by using the FPDF PHP library).
You may create some cron entries, eg.

1 0 * * * LIBEXECDIR/clapf/summary.php /srv/www/webui.yourdomain.com "clapf daily report" "Ham & spam messages" daily
1 0 * * mon LIBEXECDIR/clapf/summary.php /srv/www/webui.yourdomain.com "clapf weekly report" "Ham & spam messages" weekly
1 0 1 * * LIBEXECDIR/clapf/summary.php /srv/www/webui.yourdomain.com "clapf monthly report" "Ham & spam messages" monthly


2010.05.24, SJ

Modified the history schema, and the maillog.pl script.

alter table clapf add column rcpt char(64) default null;
drop index clapf_idx;
create index clapf_idx on clapf(queue_id, result, rcpt);

Also removed the spamstat utility along with the stat/spamstat.pl
perl script.

2010.05.23, SJ

Improved the statistics page of the webui. It has three more
graphs.

Add the following lines to config.php:

define('TABLE_SUMMARY', 'summary');
define('SIZE_X', 430);
define('SIZE_Y', 250);

and create a view to the history database:

create view if not exists summary as select distinct clapf.queue_id, clapf.ts, smtpd.client_ip, \
    qmgr.`from_domain`, smtp.`to_domain`, smtp.`to`, clapf.result from smtp, smtpd, qmgr, clapf \
    where smtp.clapf_id=clapf.queue_id and smtpd.queue_id=smtp.queue_id and qmgr.queue_id=smtp.queue_id;


(webui/*)

Gathered all the cron jobs to the etc directory.
(/etc/Makefile.in, etc/cron.jobs.in)

2010.05.19, SJ

Heavy restructuring of the source tree.

2010.05.16, SJ

Obsoleted the spaminess_of_text_and_base64 config variable. clapf will
no longer mark a message as spam merely because it has a base64 encoded
textual part.

(cfg.*, defs.h, example.conf, score.*, bayes.c, parser.c, parser_utils.c)

2010.05.14, SJ

I have improved the tokeniser. It properly handles HTML entities like &aacute;
and is able to skip preconfigured (via clapf.conf) relay entries.

The tokeniser uses only the hostnames - having at least a single dot (.) - and
IP-addresses from the Received: lines. It's no use to create tokens like
"by", "from", "esmtp", "8.12.5", dates, etc.
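
A minimal sketch of this filtering, written in Python rather than clapf's C.
The function name, the regular expressions and the way the line is split are
assumptions made for illustration only, they are not taken from the parser.

```python
import re

# Illustrative patterns only; the real rules live in clapf's parser.
IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")
HOSTNAME = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,}$")


def received_tokens(line):
    """Return hostnames (with at least one dot) and IPv4 addresses of a Received: line."""
    tokens = []
    # Split on whitespace and on the punctuation that usually wraps hosts and addresses.
    for word in re.split(r"[\s()\[\];,]+", line.lower()):
        if IPV4.match(word) or HOSTNAME.match(word):
            tokens.append(word)
    return tokens
```

Words like "by", "from", "esmtp", version strings such as "8.12.5" and the
date fields match neither pattern, so they produce no tokens.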

Improved the zombie protection. If your smtp server is behind some other servers
that hand emails to you, then postfix will always pass their hostname/IP-address
to you via the XFORWARD mechanism.

So from now on you can define which of your smtp hosts have to be skipped in order
to find out the real sender host. Let's say you want to skip *localhost, *.aaa.fu and
*.gaaa.fu as well as 127.0.0.1, 195.56.111.1-255 and 84.15.1.8[0-9] in the Received: lines.
Then set the following:

skipped_received_hosts=localhost$,.aaa.fu$,.gaaa.fu$
skipped_received_ips=127.0.0.1$,195.56.111.,84.15.1.8
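
How such lists might be applied can be sketched as follows (Python, for
illustration only). The matching rules are assumptions: host entries are
treated as suffixes, IP entries as prefixes, and a trailing '$' marks the end
of the value. All function names are hypothetical, the real matching code in
clapf may work differently.

```python
def parse_list(value):
    """Split a comma-separated clapf.conf value into its entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def host_is_skipped(host, patterns):
    # Assumption: host entries are suffixes; the trailing '$' marks the end of the name.
    host = host.lower()
    return any(host.endswith(p.rstrip("$")) for p in patterns)


def ip_is_skipped(ip, patterns):
    # Assumption: IP entries are prefixes; a trailing '$' demands an exact match.
    for p in patterns:
        if p.endswith("$"):
            if ip == p[:-1]:
                return True
        elif ip.startswith(p):
            return True
    return False


def real_sender(received_hops, hosts, ips):
    """Return the first (host, ip) hop, newest first, that is not one of our own relays."""
    for host, ip in received_hops:
        if not host_is_skipped(host, hosts) and not ip_is_skipped(ip, ips):
            return host, ip
    return None
```

With the two example values above, a hop like mx1.aaa.fu [195.56.111.2] is
skipped and the next hop in the list is reported as the real sender.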

(parser*.c, cfg.*, antispam.c)

2010.04.23, SJ

Discovered a weird problem in libmemcached-0.3[89]. If you compile
clapf statically, memcached_mget succeeds, but one cannot
fetch the records from the results. So I have written my own minimalistic
poor man's memcached client. Note that it supports only a single memcached
server on the standard tcp port (11211). If you need a more complex
solution, just let me know.
(memc.c, memc.h)

2010.04.17, SJ

Finalised counter implementation.
(counters.c, session.c)

2010.04.16, SJ

Fixed a few memcached related issues.
(memcached.c, counters.c)

2010.04.13, SJ

Added counters to memcached.
(defs.h, clapf.h, config.h, memcached.c, clapf.c, antispam.c, session.c)

2010.03.19, SJ

clapf will try to create its directories if you start it as root.
The Makefile will not do that for you any longer.
(clapf.c, dirs.c, Makefile.in)

2010.02.23, SJ

Strip the trailing whitespace from user management search form. So 'username ' would work too.
(webui/model/user/*/user.php)

2010.02.21, SJ

Added '-Q' option to spamdrop to return only 0 (=ham) or 1 (=spam).
(spamdrop.c)

2010.02.19, SJ

Fixed a typo in the policy handling.
(webui/model/policy/ldap/policy.php)



0.4.4:

2010.02.18, SJ

Fixed a spamdrop bug. It appended the CRLF.CRLF sequence at the end of emails,
and placed a ^M at the end of the clapf related header lines. I changed the
CRLF definition in config.h as well.
(spamdrop.c, config.h)

2010.02.17, SJ

Various webui fixes.
(webui/*)

2010.02.16, SJ

Let the master admins decide whether the trainings they do are sent
to the global database or not. It's effective only if you are using
a merged token group.
(spam.c, webui/*)

Fixed spamsum compile time warnings.
(contrib/spamsum/spamsum.c)

2010.02.15, SJ

The webui setup utility honors the protocol (ie. http or https),
and writes it to the config.php.
(webui/setup.php)

The mysql driver prints its error in a more user friendly way.
(system/database/mysql.php, controller/common/error.php)

2010.02.12, SJ

Fixed a webui formatting bug.
(webui/model/quarantine/database.php)

2010.02.11, SJ

Fixing a NUL character problem in the From/Subject lines.
(webui/*)

2010.02.10, SJ

Spamdrop_helper fixes.
(credits: Gabor Varadi)
(spamdrop_helper.c)


0.4.4-rc2:

2010.02.10, SJ

Massive webui changes to support 'fast' training, save search history.
It also puts quarantine data to an sqlite3 database under the users'
quarantine directory.
(spam.c, webui/*)

2010.02.08, SJ

Added an order feature to the user list (uid, username, domain).
It works for an SQL backend at the moment.
(webui/*)

2010.02.07, SJ

Fixed a webui bug in login/login showing you the 'You are'
message twice if you are actually logged in.
(webui/view/theme/default/templates/login/login.tpl)

Let the MIN_PASSWORD_LENGTH variable be honored by common/home.php.
(webui/controller/common/home.php)

Users' quarantine defaults to show the spam emails. Of course
they can set to see all messages, or only the hams.
(webui/controller/login/login.php)

Let a wait cursor be displayed while the history is loading.
(webui/view/theme/default/templates/common/layout-history.tpl)

If the message to be trained is not in the per user quarantine
directory, then it tries the "archived" directory in his quarantine.
(spam.c)

Webui encoding fixes.
(webui/model*/quarantine/message.php)

2010.02.05, SJ

spamdrop.c fixes.
(credits: Gabor Varadi)
(spamdrop.c)

Fixed the user preferences issue. Now the webui stores the
user preferences in a SQLite3 based session database. Make
sure to give write access to both the sessions directory
and the sessions.sdb file.
(webui/*)

Let the quarantine remember what search terms you have set
while delivering, removing, training, etc. messages.
(webui/*)


2010.02.04, SJ

spamdrop_helper.c changes.
(credits: Gabor Varadi)
(contrib/spamdrop_helper/spamdrop_helper.c)

spamdrop enhancements.
(credits: Gabor Varadi)
(spamdrop.c)

Reverted the webui training changes, because the webui removed
the message before postfix handed it to clapf to learn, so clapf
produced an "invalid signature" (or similar) error message.
(webui/controller/quarantine/*train.php)

2010.02.03, SJ

Let the webui forward training requests to postfix instead of clapf.
(webui/controller/quarantine/*train.php, webui/setup/setup.php)

Simplified quarantine list page.
(webui/*)

2010.02.01, SJ

Fixed the display of the page length at route=common/home. It
also uses sessions instead of cookies.
(webui/*)

Let the config parser syslog if an unknown key was found, otherwise
it will interfere with the output of spamdrop.
(cfg.c)

Added a procmailrc script, you may try it if you don't like maildrop.
(credits: Tamas Papp)
(procmailrc)

Fixed a webui bug preventing the administrators from setting a password
shorter than 8 characters.
(webui/controller/user/edit.php)

Fixed another webui bug, and let the values of a new policy be the
same as the clapf.conf defaults.
(webui/controller/policy/add.php)

Modified the logout() function, and it won't destroy all the sessions,
only unsets the username and admin_user variables. Thus the other session
related settings (language and pagelen) may survive the logout as
long as the session is alive.
(webui/system/misc.php)

2010.01.28, SJ

Added a history purge script, and fixed the ubuntu init script.
(credits: Tamas Papp)
(history/history/history-purge-*sql, init.d/clapf.ubuntu.in)

Enhanced the webui to be able to show you even the ham messages,
and to let you train the false negatives (ie. missed spam
emails).
(webui/*)

Changed the default path of the pidfile to /var/run/clapf.
(config.h, Makefile.in)

Added the history specific config variables to example.conf
(example.conf)

2010.01.26, SJ

Added the Ubuntu and Redhat specific clapf init scripts, and
let the make process set the PREFIX in them.
(init.d/*)

You can set/redefine the webui language on the fly.
(webui/*)


0.4.4-rc1:


2010.01.25, SJ

Modified the webui to be compatible with PHP 5.3.1. You have to either run the
setup utility or replace "sqlite3" with "sqlite" in your config.php.
(webui/*)

2010.01.21, SJ

Modified the clapf daemon to use the 'username' config file
variable instead of specifying -u and -g to set the uid and gid.
(cfg.c, cfg.h, clapf.c, errmsg.h, config.h)

2010.01.20, SJ

maillog.pl enhancements.
(history/maillog.pl, cfg.c, cfg.h)

2010.01.18, SJ

Webui enhancements.
(webui/*)

Fixed a to= and orig_to= parsing issue.
(history/maillog.pl)

2010.01.15, SJ

Fixed a locale bug in parsembox.c causing accented characters to be missed.
(parsembox.c)

2010.01.14, SJ

Modified the Makefile so that make implementations other than GNU make can build clapf.
(Notably on Solaris and FreeBSD)
(Makefile.in, contrib/spamsum/Makefile.in, contrib/spamdrop_helper/Makefile.in, configure*)

2010.01.13, SJ

Fixed a bug in the webui that prevented the admin users from removing a spam from
a user's quarantine, and forced the admin user to leave the user's quarantine
after the unsuccessful removal.
(webui/controller/quarantine/remove.php)

Fixed a bug in the maillog.pl script missing the domain part of the 'to='
addresses having '=' sign (usually list addresses).
(history/maillog.pl)
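
The idea behind the 'to=' fix can be sketched like this. It is Python for
illustration, maillog.pl itself is a Perl script and its actual regular
expressions are not reproduced here. The pattern and the function name are
assumptions. The point is to take the whole address between the angle brackets
first, and only then split off the domain at the last '@', so an '=' in the
local part does no harm.

```python
import re

# \b keeps the pattern from matching inside "orig_to=<...>".
TO_FIELD = re.compile(r"\bto=<([^>]*)>")


def recipient_domain(log_line):
    """Return the domain of the to=<...> address, or None if there is no valid to field."""
    m = TO_FIELD.search(log_line)
    if not m or "@" not in m.group(1):
        return None
    # Split at the last '@' so '=' or '+' in the local part is left alone.
    return m.group(1).rsplit("@", 1)[1].lower()
```

A line without a usable to field returns None, so the caller can simply skip it.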

2010.01.12, SJ

Modified the history schema and the collecting perl script to gain a
performance boost.
(history/*, webui/controller/history/worker.php)

2010.01.11, SJ

Added a new variable 'queuedir' to let you set the path of the queue directory.
(cfg.h, cfg.c, config.h, Makefile.in, example.conf)

2010.01.10, SJ

Added a search/filter feature to the webui history. It requires cookies
to be accepted. You are allowed to set filter criteria at the 0th page,
ie. page=0.
Whenever you set a filter term it will be used at the next AJAX request,
ie. expect a ~5 sec delay.
(webui/*)

Added an escaping to the sender and recipient domains.
(webui/controller/history/worker.php)


0.4.3.1:


2010.01.06, SJ

If you have specified the '-r' switch for spamdrop, then it tried to
query the database even before it was opened, and it produced a segmentation
fault. Fixed.
(spamdrop.c)

2010.01.03, SJ

Fixed the maillog.pl script to skip lines if they do not have a valid to field.
(history/maillog.pl)

Webui setup fixes.
(webui/setup/setup.tpl)

2010.01.02, SJ

Now the webui handles different history databases.
(webui/index.php, webui/controller/history/worker.php)

Fixed a URL related parser bug introduced in translate2().
(misc.c)

2009.12.30, SJ

spamdrop is now setuid, so we can always enter the workdir, even if we
just test a single message with spamdrop -D
(spamdrop.c, Makefile.in)

Older sqlite3 versions (eg. 3.3.x) do not have the -batch command line option.
The configure script checks for the presence of the -batch option.
(Makefile.in, configure*)

libclapf.so* link fixes.
(Makefile.in)

2009.12.29, SJ

Updated the stat/process_syslog.pl script to match the current logging style.
(stat/process_syslog.pl)

2009.12.28, SJ

Created the contrib/spamdrop_helper directory with
Varadi Gabor's spamdrop_helper.c
(contrib/spamdrop_helper)

Fixed the blackhole routine.
(black.c)

Added a mysql version of history.
(webui/setup/*, webui/controller/history/worker.php)

Simplified the statistics.
(stat/clapf-stat.run.sh)

Added a fix to the configure script to include the spam.o and antispam.o
object files if using --enable-spamc-emul
(configure*)


0.4.3:

2009.12.21, SJ

Parser fix.
(misc.c)

Various spamdrop fixes.
(spamdrop.c)

(credits: Gabor Varadi)

2009.12.19, SJ

Spamdrop fix.
(spamdrop.c)

(credits: Gabor Varadi)

2009.12.18, SJ

A few users complained about the merge of the spam status and the
probability value. So reverted the change, but you can use the '-o'
command line switch for this compact behaviour.
(spamdrop.c)


0.4.3-rc2:


2009.12.17, SJ

Modified the clapf_admin.sh script to handle sqlite3 databases, too.
(Makefile.in, util/clapf_admin-sql.sh.in)

2009.12.16, SJ

The spamdrop utility removes the clapf related header lines from the header only.

It's also possible to suppress all normal syslog activity (except for errors)
if using the '-q' command line switch or specifying verbose=0 in the config file.

Changed spamdrop to print the spam status and the spaminess probability in a single
row, eg. 'X-Clapf-spamicity: Yes, 1.000' or 'X-Clapf-spamicity: No, 0.0001'.
(spamdrop.c)

2009.12.15, SJ

Minor fixes for various parts of clapf.
(cfg.c, config.h, Makefile.in, spamdrop.c, util/db_init.sh, util/purge-sqlite3.sh, util/spamdrop_helper.in)

(credits: Gabor Varadi and Christoph Wilke)

2009.12.11, SJ

Add an extra '${clapf_header_field} statistically whitelisted' header
if the message is statistically whitelisted.
(defs.h, session.c, spamdrop.c, bayes.c, antispam.c)

2009.12.10, SJ

Christoph Wilke contributed some fixes related to the spamsum module. Thanks!
(contrib/spamsum/ssum.c, contrib/spamsum/Makefile.in)

(credits: Christoph Wilke)

2009.12.02, SJ

Added an init script.
(Makefile.in, init.d/clapf.in)

Split the util/db_init_* scripts to util/db_init.sh for the
creation of the database schema, and util/db_train.sh for training
from scratch. I have also unified the training scripts, so you have
to specify the database type as well (ie. either sqlite3 or mysql).

2009.12.01, SJ

Enhanced the maillog.pl utility to be able to handle either sqlite3
or mysql database.

Usage examples:
maillog.pl -l /var/log/maillog --db sqlite3 --database /var/lib/clapf/log.sdb
maillog.pl -l /var/log/maillog --db mysql --database history --user history --password veryhardsecret

(history/maillog.pl)

2009.11.27, SJ

Added a utility called zombietest. Its purpose is to test whether
a given host matches the zombienets.regex file. Usage: eg.

zombietest mail-gx0-f212.google.com triband-mum-59.184.161.130.mtnl.net.in

If you need this utility then issue a "make zombitest"
(zombietest.c)

2009.11.11, SJ

Fixed a nasty bug in smtp.c. If clapf gets an email from another
clapf, then it fails to remove the last line of the foreign clapf
instance causing evaluation problems. The bug was introduced with
0.4.3-rc1, so previous versions are believed not to be affected.
(smtp.c)

2009.11.09, SJ

Added a new utility (clapfconf) to show configuration variables, just
as the postconf utility does.
(clapfconf.c)

2009.11.08, SJ

Code cleanup. Created a brand new config file parsing utility.
(cfg.c, cfg.h, clapf.h, clapf.c, parsembox.c, misc.c, misc.h, hash2.c,
config.h, avir.c, example.conf, spamstat.c, Makefile.in, clapfconf.c)
(credits: turul16)

2009.11.06, SJ

Fixed a bug that prevented clapf from training messages iteratively in multiple rounds.
(bayes.c)

2009.11.05, SJ

If the message is evaluated as a possible spam, we can put a
special header field too. By default the following field is set:
clapf_possible_spam_header_field=X-Clapf-spamicity: maybe
(antispam.c, cfg.c, cfg.h, example.conf)
(credits: tompos)

Removed the unnecessary t_queue table reference from the purge scripts.
(util/purge*)


0.4.3-rc1:


2009.11.03, SJ

Add a [spam???] prefix to the Subject: line if the message probability
is greater than possible_spam_limit (0.8 by default) and less than
spam_overall_limit (0.92 by default).
(smtp.c, cfg.c, cfg.h, example.conf)
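
The rule can be sketched as follows (Python, for illustration only). The two
defaults are the ones quoted above. The function name and the single space
after the prefix are assumptions.

```python
POSSIBLE_SPAM_LIMIT = 0.8   # default of possible_spam_limit quoted above
SPAM_OVERALL_LIMIT = 0.92   # default of spam_overall_limit quoted above


def tag_subject(subject, probability,
                possible=POSSIBLE_SPAM_LIMIT, overall=SPAM_OVERALL_LIMIT):
    """Prefix the subject with [spam???] when the probability is in the grey zone."""
    # Both comparisons are strict: "greater than" and "less than".
    if possible < probability < overall:
        return "[spam???] " + subject
    return subject
```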

2009.11.02, SJ

Code cleanup.
(cfg.c, config.h)

2009.10.29, SJ

Added a 32kB mail buffer to skip reading the temp file for
parsing and handing back to postfix. clapf performance gained
only 0.6-2.1% in my tests so this code is not compiled by default.
If you need this extra performance, then add '-DHAVE_MAILBUF' to
the 'DEFS' line of the Makefile, and recompile clapf.
(defs.h, parser.c, parser.h, smtp.c, session.c)

Fixed a configure script bug searching for the tre.h header file.
(configure*)

2009.10.27, SJ

Replaced all signal() calls with sig_action().
(clapf.c, session.c, sig.c, sig.h)

2009.10.26, SJ

Added a strcasestr() compatible function in case the target system
has no _GNU_SOURCE in the header files.
(misc.c, misc.h)

2009.10.22, SJ

Reset the alarm clock after processing a message. Thus a child process
does not have to be killed by a forced exit after the default 420 secs.
(session.c)

2009.10.21, SJ

The webui will keep the entered data in case of a user add failure
to save you typing user data again (except the password fields).
(webui/controller/user/add.php, webui/view/theme/default/templates/user/add.tpl)
(credits: tompos)

2009.10.19, SJ

clapf syslog()'s the number of rounds it used for training
a message.
(bayes.c, spam.c)

2009.10.16, SJ

Added a new counter to the 'delays=' to show how much
time clapf needs to acquire a message from postfix.
(session.c, defs.h)

2009.10.05, SJ

Added lmtp support to history/maillog.pl
(history/maillog.pl)

Changed the size of acceptbuf 8192->512 bytes, and a single
RCPT TO: line should fit in 256 bytes otherwise clapf sends
a 550 too long recipient message back.
(session.c, defs.h, spam.c)

Modified clapf to extract tokens from the training message only,
and discard the stored message itself. This change has obsoleted
the --enable-outlook-hack configure option, since clapf searches
the Received: lines for valid clapf ids.
(spam.c, misc.c, parser.c, webui/controller/quarantine/*train.php)

2009.10.04, SJ

Moved the antispam part from session.c to antispam.c, and
additional code cleanup.
(session.c, antispam.c, defs.h, clapf.h)

2009.10.02, SJ

Minor webui fixes.
(webui/*)

TRE includes fixes.
(defs.h)

More locale fixes. If you used policy settings, then the previous setlocale
bug came back. Now I set only LC_MESSAGES and LC_CTYPE according to the
locale setting in clapf.conf.
(clapf.c, spamdrop.c)

MySQL optimisations.
(credits: cydrk)
(db-mysql)

2009.09.29, SJ

Fixed a bug in the queue files handling.
(spam.c)

2009.09.28, SJ

Modified clapf to relay the XFORWARD info.
(defs.h, session.c, smtp.c)

The verbosity level affects the info put into the mail header.
(session.c)

Skip the faked 'Received: ' line if found in the header.
(smtp.c)

If the tre library matches a zombie host, then a special spammy
token is instantiated.
(hash2.c)


2009.09.27, SJ

Most spam emails come from zombie networks where the infected
hosts send the email directly instead of using their service
providers smtp servers. Postfix can be set to pass the XFORWARD
info that clapf can use.

I have compiled a regex file (zombienets.regex) to list/match
most of them, eg. c-98-214-27-79.hsd1.il.comcast.net. Many of
the zombies have no (valid) PTR record, so you may append the
word 'unknown' to match them.
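
A small sketch of such a check. The two patterns below are hypothetical and
only written in the spirit of zombienets.regex, the shipped file differs.
Python's re module stands in for the TRE library that clapf uses.

```python
import re

# Hypothetical patterns for illustration; not the contents of zombienets.regex.
ZOMBIE_PATTERNS = [
    re.compile(r"^c-\d+-\d+-\d+-\d+\.hsd1\.[a-z]{2}\.comcast\.net$"),
    re.compile(r"^unknown$"),  # hosts without a (valid) PTR record
]


def is_zombie(hostname):
    """Return True if the hostname matches any of the zombie patterns."""
    host = hostname.lower()
    return any(p.search(host) for p in ZOMBIE_PATTERNS)
```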

Currently I run this test to see how it performs. Later I will
add a configuration variable that allows you to mark these emails
immediately as spam. Alternatively you can configure postfix
to reject messages coming directly from end user computers.

To enable this feature, install the TRE library, and copy
zombienets.regex to /usr/local/share/clapf

Finally set the 'message_from_a_zombie' variable to decide what to
do with messages from zombies. See example.conf for help. The
default action is to let it pass and let the antispam engine
handle it.

(clapf.c, defs.h, session.c, session.h, zombienets.regex, configure*, config.h, clapf-config.h.in)

2009.09.26, SJ

Revised the queue file storage to let the clapf daemon
support store-less training.

If you set store_metadata=1, then clapf will store the
metadata needed for training. In case of store_metadata=0
clapf will store nothing, and you cannot train it by
forwarding emails to spam@ or ham@.

So with store_metadata=1 clapf is able to train messages
even if the original message itself does not exist in
the queue directory.

If you omit the --with-store=... configure option, then
clapf writes an empty file acting as an identifier used
for training, no matter what store_only_spam is.

If you do specify the --with-store=... option, then clapf
saves the message itself, suitable for viewing the quarantine
as well. If you set store_only_spam=1, then clapf stores
the spam messages only, and writes an empty file for the
good emails.

(smtp.c, parser.c, spam.c, session.c, clapf.h, defs.h)

2009.09.25, SJ

Added the store_only_spam config variable to the policy.
(db-*sql, ldap/clapf-policy.schema, policy.c, webui/*)

2009.09.23, SJ

clapf emits the "child exited" syslog message only in debug mode.
(session.c)

Added the memcached stuff to the source tree.
(memcached_cleaner.c, mysql.c)

Added a default setlocale() call before (re)loading the configuration.
Let's say you have set locale=hu_HU. clapf starts, and reads the variables using
the default locale (probably en_US). Then later you change the
configuration and send a HUP signal to let clapf re-read its config.
It will, however with the current (hu_HU) locale. Thus all your float
variables are toasted if you have written them as 0.92 (and not 0,92).

To prevent nasty things from happening I included a setlocale(LC_ALL, "en_US")
call before any configuration read.

Workaround: stop and start clapf after any configuration changes.
(clapf.c)

0.4.2:

2009.09.20, SJ

Fixed a few bugs causing clapf to crash on some x64 platforms.

(credits: Tamas Papp)
(users.c, misc.c, session.c, black.c, spamdrop.c)

2009.09.18, SJ

The webui will write the correct error message in case of a short
password. Additionally you can specify the minimum password length
in config.php (MIN_PASSWORD_LENGTH)
(webui/)

2009.09.17, SJ

Removed the CLAMAV_EXTRA_LIBS definition from Makefile.in. Earlier
--enable-libclamav installations needed it. Now it appears that we
are fine without it.

(credits: Tamas Papp)
(Makefile.in)

0.4.2-rc1:

2009.09.11, SJ

Fixed a bug in the history that prevented dropped emails
(ie. viruses) from being shown.

Administrators can view the user's graphical statistics too.
(history/*, webui/)

2009.09.10, SJ

Modified the training routine to break after the first round
before the spaminess check in case of a TOE mode.
(bayes.c)

Removed the DHA stuff. The blackhole/minefield feature should take
care of it. Postfix will probably catch invalid recipients and
not let them through to clapf.
(cfg.c, cfg.h, example.conf, session.c)

Revised the blackhole/minefield feature. Since most of the hosts are
grey rather than black or white, marking a message as spam merely
because its sender host has sent spam to the trap address
is probably a bad idea. (Hint: shared websites)

So clapf will generate a special spammy token if the sender host is
on our minefield.

Also note that I have changed the way clapf stores the IP-addresses.
Scanning a large directory takes a considerable time so clapf will
put entries to a mysql table (t_minefield). Look at db-*sql, and be
sure to create it before upgrade.
(black.c, session.c, spamdrop.c)

2009.09.09, SJ

Removed the history/archive controller. The history/worker is smart
enough to do the job alone.
(webui/)

Added a feature to AD import: the webui import/query routine saves
the ldap connection info (except the ldap password) to the 't_remote'
table, and is able to recall it later. So you have to type the bind
password only.
(webui/)

2009.09.08, SJ

Added an errorstring if login has failed.
You can add multiple domains in the domain menu.
Give a correct error message if there are no entries in the history log
instead of a thrown PHP error.

(webui/)

2009.09.07, SJ

Added a check to import from Active Directory, and skip adding already
existing email addresses.
(webui/)

2009.09.06, SJ

Added a message history view to the webui. It supports sqlite3 backend
currently. To use it create a database according to the history/mail.sql
schema, then run the history/maillog.pl script that reads the maillog
and collects data available for the webui.

The maillog.pl script needs the File::Tail, Date::Parse and DBD::SQLite3
modules. You can get them from the CPAN site.
(history/*, webui)

2009.09.02, SJ

Simplified logging. clapf will syslog a single line if verbosity=1.
I have introduced a very extensive time measurement, so you can spot
bottlenecks. You can see how many seconds (sic!) it takes to parse the
email, make the statistical decision, update the tokens, and inject it back
to postfix. You will be surprised to see that handing the email back
to postfix takes the most time...
(avir.c, misc.c, session.c, spamdrop.c, smtp.c, stat/spamstat.pl, stat/process_syslog.pl)

2009.09.01, SJ

Webui fixes.
(db-*sql, webui/*)

2009.08.31, SJ

Added policy support for sqlite3 too.
(policy.c)

2009.08.25, SJ

Changed the per user quarantine path splitting. The default is the
old one created by the username. However you can change to the new
uid splitting by the '--enable-uid-splitting' configure option. 

If you choose the new uid splitting be sure to select the same
in the webui setup script.
(spam.c, clapf.h, spamdrop.c, example.conf, webui)

2009.08.24, SJ

Enhanced the webui to show the image(s) if the spam has any.
Please create a "cache/" directory under your webui, and let the web
server have write permissions on it. I recommend you to extend the quarantine
purge script to the cache directory to remove aged spam images.

The webui lets you add email aliases too when adding a new user. You
don't have to add a user and then edit him to define additional email addresses.

You can also define the size attribute of the user section in the config.php.
Add the following two lines to your config.php (or rerun the setup script):

define('CGI_INPUT_FIELD_WIDTH', 50);
define('CGI_INPUT_FIELD_HEIGHT', 7);

Polished the HTML code to be more w3 standard friendly.
(webui/*)

Extract the .tld from the long hostnames in the "Received: from" lines.
(misc.c, misc.h, parser.c)

2009.08.21, SJ

Changed the way clapf organises the queue directory. Now we query the uid of the
given user, create a path like 10000/100/95/h|s.<message-id>, instead of the old
method using j/jack.
If you have an existing install, just enable the enable_old_queue_compat variable
(disabled by default) in clapf.conf until all the per user queue files are purged
from the old directory structure (~7 days).
(spam.c, clapf.h, spamdrop.c, example.conf, webui)

2009.08.13, SJ

Force sdata.uid=0 if the message landed on the minefield (blackhole)
(spamdrop.c, session.c)


2009.08.05, SJ

Introduced a new configuration variable 'max_number_of_tokens_to_filter'.
Though some spams have only a few words, they may be bigger than 'max_message_size_to_filter'
due to their binary (pdf, mp3, gif, ...) attachment. With the new variable clapf will still
check a message if its size is greater than 'max_message_size_to_filter' AND
it has fewer tokens than 'max_number_of_tokens_to_filter'.

By default max_number_of_tokens_to_filter=2000.
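
The decision can be sketched as below (Python, for illustration only). The
function name is hypothetical, and treating a message of exactly the maximum
size as "small enough" is an assumption. The size limit has no default here
because it is not quoted in this entry.

```python
def should_filter(size_bytes, token_count, max_size, max_tokens=2000):
    """Decide whether a message is checked by the antispam engine."""
    # Messages within the size limit are always checked.
    if size_bytes <= max_size:
        return True
    # Oversized messages are still checked if they carry only a few tokens,
    # eg. a short text with a big binary attachment.
    return token_count < max_tokens
```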

(example.conf, cfg.c, cfg.h, session.c, spamdrop.c)

The parser creates .tld tokens as well, ie. URL*chinaspammer.cn, and URL*.cn, too.
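
A sketch of the token pair. It is Python for illustration, the function name is
hypothetical and the real parser in parser.c/misc.c may normalise hostnames
differently.

```python
def url_tokens(host):
    """Return the URL token of a host plus a token for its top level domain."""
    host = host.lower().rstrip(".")
    tokens = ["URL*" + host]
    if "." in host:
        # Everything after the last dot is taken as the top level domain.
        tokens.append("URL*." + host.rsplit(".", 1)[1])
    return tokens
```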

(misc.c, misc.h, parser.c)


2009.08.03, SJ

Modified the user listing if using sql backend. Now the webui lists the users
in one instance even if they have multiple email addresses. You can view/edit
all the email addresses in the edit page in a textarea field.

Additionally improved the user search feature, the webui searches for the given
pattern in the usernames and email addresses.


2009.07.29, SJ

Fixed a few bugs in the ldap handling.

2009.07.27, SJ

Finished ldap support for the brand new webui.
(webui/*)

2009.07.24, SJ

Rewritten the whole webui using the model-view-controller (MVC) method.

The MVC method isolates business logic from the user interface,
permitting one to be freely modified without affecting the other. The
controller collects user input, the model manipulates application data,
and the view presents results to the user.

The new webui features an installer. Please note that the old config.php
is not compatible with the new webui, so run setup/setup.php.

Please note that the new webui supports only the sqlite3 and the mysql
user db. The LDAP support is on the way.

If you are using sqlite3, then update the database schema (see db-sqlite3.sql)
and add the t_policy table.
(webui/*)

2009.06.30, SJ

Fixed an encoding bug that caused iso-8859-2 messages to be displayed improperly
(webui/funcs.php)

Added the clapf_admin.sh utility to do batch user add or removal.
(util/clapf_admin-sql.sh.in, util/clapf_admin-ldap.sh.in, Makefile.in)

Removed the clapf_admin binary
(clapf_admin.c)

2009.06.26, SJ

Fixed some typos resulting in compile errors in avir.c
(avir.c)

Added avast! home personal edition support. As its name suggests
it may be suitable for very small networks since it spawns the
avast binary to do the dirty job.

(avir.c, avast.c, configure*, cfg*, example.conf)

2009.06.25, SJ

Removed the spamtest utility from the installation.
(Makefile.in)


************************************


0.4.1:

2009.06.15, SJ

Administrators can see the aggregate ham/spam statistics, and
they can specify the uid if they are curious about the graph of a given
user, eg. .../webui/stat.php?uid=1001
(webui/graph.php, webui/stat.php)

2009.06.11, SJ

LDAP fixes in both schema and webui.
(webui/ldap.php)

2009.06.09, SJ

Skip updating the tokens if we are in debug mode.
(spamdrop.c)

LDAP query fix.
(users.c)

2009.06.04, SJ

Fixed minor quarantine related issue in the webui.
(webui/q.php, webui/funcs.php, webui/lang/*)

2009.06.03, SJ

Updated the spamstat utility according to the sql schema modifications.
(spamstat.c)

Minor enhancements in the webui.
(webui/lang/*, webui/login.php, webui/webui.php, webui/funcs.php, webui/q.php)

A comment warns you in config.in.php to edit the proper configuration
file (config.php) in the webui.
(webui/config.in.php, Makefile.in)

Modified the spamdrop utility to print most of its stuff to stdout instead of stderr.
(lots of *.c files)

2009.06.01, SJ

Put the add_penalties() function after the token
update function to prevent overwriting a special
token.
(parser.c, bayes.c, hash.c, hash.h)

2009.05.29, SJ

Modified the spamdrop utility to be able to handle
virtual users, and provide the functionality of
the spamtest utility if using the -D command line
option, ie. 'spamdrop -D < message' or
'spamdrop -D -r jim@aaaa.fu < message'
(Makefile.in, maildroprc, spamdrop.c)

2009.05.27, SJ

Updated the libclamav related stuff due to change in the
clamav API.
(avir.c, clapf.c, clapf.h, session.c, session.h)




0.4.1-rc2:

2009.05.26, SJ

Modified the webui to display the formatted email. You can
see the raw email as an option in the menu.
(webui/q.php, webui/funcs.php, webui/lang/*)

2009.05.24, SJ

Removed the man pages. They were not up-to-date, and maintaining
docs at two places is draining.
(doc/ Makefile.in)

Changed the webui encoding to utf-8.
(webui/lang/hu/messages.php)

2009.05.22, SJ

The train_message() function also needed a similar fix.
(credits: Attila Bergsmann)
(bayes.c)

2009.05.20, SJ

Fixed an issue caused by the bayes_file() function overwriting the uid
if you are using a shared token group (ie. group_type=0).
(bayes.c)

2009.05.19, SJ

Webui fixes.
(webui/*)

2009.05.18, SJ

Added the '--enable-outlook-hack' configure option to workaround
the nasty habit of Outlook 12 (not Outlook Express).
(misc.c, smtp.c, configure*)

2009.05.17, SJ

Modified the database schema: introduced a new table storing
uid <-> email address relations for the MySQL and SQLite3 backend.
Please note that --with-userdb=ldap users are not affected.

Be sure to read the 0th point in the UPGRADE file!
(db-*.sql, util/db-update-*.sql, users.c, config.h, webui/.htdb.php, webui/lang/*, webui/*)

2009.05.14, SJ

Added clapf message id to the logging of the training part.
(bayes.c)

2009.05.12, SJ

Fixed a bug preventing the proper handling of users with uid=0.
(webui/mysql.php, webui/sqlite3.php, webui/users.php)

2009.05.05, SJ

Modified the webui to support login via a HTML form instead of the
popup login window.
(webui/*)

2009.05.01, SJ

Users may change their passwords.
(webui/ldap.php, webui/mysql.php, webui/sqlite3.php, lang/*, webui/index.php)

2009.04.30, SJ

Added the password and 'isadmin' fields to the webui. From now
the .htpasswd authentication is no longer available. Webui passwords
are stored in the corresponding database or ldap directory.
(webui/ldap.php, webui/mysql.php, webui/sqlite3.php, webui/webui.php, ...)

2009.04.24, SJ

Added a password field to the sql schemas, and an extra
field (isadmin) to determine if the user is admin or not.
The webui is extended for these features.
(db-*sql, ldap/qmail.schema, ldap/example.ldif, webui/ldap.php)




0.4.1-rc1:

2009.04.22, SJ

The jpgraph library was huge, so I replaced it with libchart
http://naku.dohcrew.com/libchart/pages/introduction/.
(webui/libchart)

2009.04.21, SJ

Added a graphical representation of the per user statistics by
the JpGraph project (http://www.aditus.nu/jpgraph/).
You need PHP 5.x for the 2.x series of JpGraph.
If you are stuck with PHP4, then download the 1.x version
of JpGraph.
(webui/stat.php, webui/graph.php, webui/jpgraph)

Pipelining support for the smtp receiving side of clapf.
(smtpcodes.h)

2009.04.17, SJ

Modified boundary handling in the parser.
(configure*, Makefile.in, boundary.c, boundary.h, parser.c, defs.h)

2009.04.15, SJ

Added a command line utility called smtpscanner. It can be used
as a content filter for any MTA (eg. sendmail) as follows:

Configure two instances of your MTA. Instance "A" accepts messages
from the network, then passes to smtpscanner*. Smtpscanner then
passes the (perhaps) modified message to instance B on a port
like 10026.

Please note that if you are using postfix, then stick to the regular
"advanced content-filter" method, as it is more resource efficient.

Please also note that smtpscanner should NOT be used with libclamav,
as loading the database into memory has an overhead; use clamd instead.

If you want to use smtpscanner, then issue a "make smtpscanner", as
this utility is not compiled by default.

*: it is an interactive version of session.c. It reads data from stdin
and prints the answers to stdout.

(smtpscanner.c)

2009.04.14, SJ

Moved the parse_message() function to build fewer object
files when only the antivirus functionality is needed.
(bayes.c, bayes.h, parser.c, parser.h, session.c)

2009.04.10, SJ

Fixed a bug in the language files.
(webui/lang/*)

2009.04.09, SJ

Added paging support to user list.
(webui/users.php, webui/mysql.php, webui/sqlite3.php, webui/ldap.php)

2009.04.02, SJ

Fixed a bug in spamdrop that prevented training with a message bigger
than max_message_size_to_filter. Spamdrop can now also determine your
userid correctly if you train from the command line.
(spamdrop.c)

2009.03.18, SJ

Fixed a bug causing incorrect sql statements if using the t_black_list
table.

(credits: efpe)
(users.c, session.c, spamdrop.c, clapf.h)

You can do bulk user editing. If you select some user ids in users.php
and click on the "Bulk edit selected uids" button, then you can set the
policy group, white- and blacklist for the selected uids.

You can remove the selected uids (and all of their email addresses) by
clicking on the "Remove selected uids" button.

(webui/massusers.php, webui/lang/*)

You can search among usernames and email addresses.
(webui/mysql.php, webui/users.php)

2009.03.16, SJ

Fixed a bug: clapf now properly handles the pipelining situation
where an incomplete QUIT command follows the period (.) in the
same packet, ie.

# end of packet1:
  6f 70 65 72 6c 79 0d 0a    0d 0a 2e 0d 0a 51 55 49    operly.......QUI
#
# this is packet2:
T 127.0.0.1:45721 -> 127.0.0.1:10025 [AP]
  54 0d 0a                                              T..             

(session.c)
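
The situation above can be sketched in a few lines of Python. This is only
an illustration of the buffering idea, not the code in session.c; the
CommandBuffer class and its feed() method are hypothetical names. Incoming
data is appended to a buffer, and only complete CRLF-terminated lines are
handed over, so "QUI" waits until the "T\r\n" of the next packet arrives.

```python
class CommandBuffer:
    """Collects raw socket data and returns only complete CRLF-terminated lines."""

    def __init__(self):
        self._pending = b""

    def feed(self, data):
        # Append the new packet, then split off every complete line.
        self._pending += data
        *complete, self._pending = self._pending.split(b"\r\n")
        return complete


buf = CommandBuffer()
first = buf.feed(b"operly\r\n\r\n.\r\nQUI")  # packet1 ends in the middle of QUIT
second = buf.feed(b"T\r\n")                   # packet2 completes the command
```

After the first packet the end-of-data period is already visible, while the
truncated QUIT stays in the buffer until the second packet completes it.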

2009.03.12, SJ

Modified clapf to let a training request get through if we
cannot find the user it's intended for. This is the case if
you have to use an SMTP server running clapf, but the training
request should reach another SMTP server with clapf, ie.

user -> smtp1 -> smtp2

Now clapf will not swallow the training request, so it may
reach its next destination.
(session.c)

2009.03.10, SJ

Added an option to the webui to set the page length of the
quarantine. You can set it between 10 and 100.
(webui/index.php, webui/q.php)

You can do mass training and delivery from the quarantine
by selecting the messages, then using a single click.
(webui/mass*, webui/funcs.php, webui/lang/*)

Separated the HELO and the EHLO/LHLO answers. We return a
mere "250 Ok" if the SMTP client tells us "HELO".
(session.c)

2009.03.09, SJ

Enhanced the webui so that if the given user has no entry in
the t_white_list or t_black_list table, then the set_whitelist()
and set_blacklist() functions create the necessary entries.

Updated the UPGRADE file, too.

(webui/mysql.php, UPGRADE)
(credits: efpe)


2009.03.07, SJ

Webui enhancements. The quarantine listing now tries
to decode the 'From:' and 'Subject:' fields to give
the users a clue what the message is about.

You can remove all your spam emails from the quarantine
with one click on the "Purge all messages from quarantine"
button.

(webui/funcs.php, webui/q.php)

2009.03.04, SJ

Even more bugfixing with the pipelining code.
(session.c)

2009.03.03, SJ

Fixed a bug on the receiving side of the pipelining code.
(session.c)

2009.03.02, SJ

It looks a bit better and neater if the spaminess buffer
is appended to the end of the header instead of the beginning.

(smtp.c)

2009.03.01, SJ

Separated the header manipulation to the send_headers() function.
This produces a more readable smtp session handling code.

Modified the placement of the spaminess buffer. Now it is sent
before the actual message, so you may find it right after the
last (ie. first) Received: lines.

(smtp.c, session.c)

2009.02.28, SJ

Added PIPELINING support when injecting the message back to
postfix. It works, but a little testing is required. So if
you do not want this behaviour, then replace EHLO with HELO
in smtp.c.

(smtp.c)

2009.02.27, SJ

Added an option to clapf.conf to enable conditional virus scanning
of the incoming emails. If you enable the always_scan_message
variable then clapf will send every email to the virus scanner.
This may help against phishing attacks, as clamav is able to
mark these scams as viruses.

However at a greater load, it may be better to spare the AV
scanner from textual only messages. So if you set always_scan_message=0
then clapf will send emails to the antivirus engine only if they
contain a base64 encoded part. This is the case when the message
contains a zip/exe/doc/pdf/... attachment. 
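
The decision described above can be sketched as follows. This is a
simplified illustration, not the code in session.c: should_scan() is a
hypothetical name, and detecting a base64 part by looking for a
Content-Transfer-Encoding header line is an assumption made for the example.

```python
def should_scan(message_text, always_scan_message):
    """Return True if the message should be handed to the antivirus engine."""
    if always_scan_message:
        return True
    # Assumption: a base64 encoded part announces itself in a
    # Content-Transfer-Encoding header line.
    for line in message_text.splitlines():
        lowered = line.lower()
        if lowered.startswith("content-transfer-encoding:") and "base64" in lowered:
            return True
    return False


plain = "Subject: hi\r\nContent-Type: text/plain\r\n\r\nhello\r\n"
attachment = (
    "Subject: invoice\r\n"
    "Content-Type: application/zip\r\n"
    "Content-Transfer-Encoding: base64\r\n"
    "\r\n"
    "UEsDBAoAAAAAAA==\r\n"
)
```

With always_scan_message=0 only the second message would reach the virus
scanner; with always_scan_message=1 both would.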

Changed the mysql connection handling to connect only once at
the start of the clapf child handling the SMTP session. Postfix
may keep a connection open for 300s, so it may be a good idea to
spare the mysql_real_connect() call before processing each SMTP
transaction. More tests will follow to see how this setup handles the
restart and loss of the mysql server.

(session.c, cfg.c, cfg.h, example.conf)


2009.02.24, SJ

Miklos has discovered two bugs:

Bug #1: a typo caused clapf not to put the spam_subject_prefix
value to the Subject line if the value was shorter than 22
characters.

Bug #2: the per user policy settings did not override the default
policy settings when deciding whether to drop the spam
or not.

(session.c, smtp.c)

(credits: Miklos Toldi)


Changed the Makefile to install clapf to the ${prefix}/sbin
directory.

(Makefile.in)


2009.02.18, SJ

If you train with spamdrop from the command line, you may
use the FROM environment variable to train for a different
user, eg.

FROM=someother@email spamdrop -S < message

Alternatively you can use the -U command line parameter to
specify what clapf user-id to train.

Example 1: train a message for jack (assuming that his clapf user-id is 1007)

spamdrop -U 1007 -S < message

Example 2: train the global database

spamdrop -U 0 -S < message

(spamdrop.c)


2009.02.17, SJ

Removed the unused insert_2_queue() function.

(mysql.c, sqlite3.c)

2009.02.16, SJ

You can start clapf as root, and you can specify what uid and gid
you want clapf to run as (after it binds to the given port).

(clapf.c)

Added experimental spamd support. So if you want to replace
amavis with clapf you are good to go.
I recommend using either the regular statistical anti-spam
module or the spamc emulation.

If you want to try how it performs, then use the --enable-spamc-emul
configure option, and set the spamd_addr, spamd_port and spamc_user
variables at the end of the clapf.conf file.

(cfg.c, cfg.h, session.c, spamc.c)

2009.02.16, SJ

It is possible to use per domain specifications in the user
table. Let's say you want to give some specific settings
to the domain @aaaa.fu, but something special to user1@aaaa.fu.
Then set up user1@aaaa.fu, and instead of entering every possible
other user in the @aaaa.fu domain, add only the @aaaa.fu
domain itself.

mysql>select * from user;

uid |   username   |   email       | policy_group |
----+--------------+---------------+--------------+
1001| user1        | user1@aaaa.fu |      14      |
1001| @aaaa.fu     | @aaaa.fu      |       0      |

....


(users.c)
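
A minimal sketch of the lookup order this implies: try the full email
address first, then fall back to the bare @domain entry. The USERS
dictionary and lookup_user() are hypothetical and only mirror the example
table above; the real query lives in users.c and may differ.

```python
# Hypothetical in-memory copy of the example user table, keyed by email.
USERS = {
    "user1@aaaa.fu": {"username": "user1", "policy_group": 14},
    "@aaaa.fu": {"username": "@aaaa.fu", "policy_group": 0},
}


def lookup_user(email, users=USERS):
    """Return the exact entry if there is one, otherwise the per domain entry."""
    email = email.lower()
    if email in users:
        return users[email]
    # Everything after the last '@' is the domain part.
    domain = "@" + email.rpartition("@")[2]
    return users.get(domain)
```

So user1@aaaa.fu gets policy group 14, any other address in @aaaa.fu gets
the domain entry, and an address in an unknown domain gets nothing.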

2009.02.13, SJ

You can disable the update of the token timestamps by
setting update_tokens=0 in clapf.conf. This is useful
if you do not want to run the periodic purge script,
and you want to spare the UPDATE t_token .... SQL command.

If you are running the periodic purge script, leave
the default settings, ie. update_tokens=1

(cfg.c, cfg.h, session.c, spamdrop.c, example.conf)

2009.02.06, SJ

Added a blacklist functionality (similar to the whitelist). If
the sender is on the blacklist then clapf will syslog and discard
the message. Please note that the whitelist check takes precedence
and if you happen to put a sender to both the whitelist and the
blacklist, the email will get through.

If you need this feature, then create the following sql table:

--
create table if not exists t_black_list (
        uid int unsigned not null primary key,
        blacklist blob default null
);

create index t_black_list_idx on t_black_list (uid);
insert into t_black_list (uid) values(0);
--

If you are using ldap, then you may use the 'filterMember'
attribute to store the blacklisted email addresses.

(users.c, clapf.h, session.c, spamdrop.c, db-mysql.sql, db-sqlite3.sql)
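
The precedence rule (whitelist first, then blacklist) can be sketched like
this. The function names are hypothetical, and storing one address per line
in the blob is an assumption borrowed from the whitelist format.

```python
def parse_list(blob):
    """Turn a blob holding one address per line into a set of lowercase entries."""
    return {line.strip().lower() for line in blob.splitlines() if line.strip()}


def sender_action(sender, whitelist_blob, blacklist_blob):
    sender = sender.lower()
    if sender in parse_list(whitelist_blob):
        return "accept"    # the whitelist check takes precedence
    if sender in parse_list(blacklist_blob):
        return "discard"   # clapf would syslog and drop the message
    return "filter"        # neither list matched: normal spam check
```

A sender listed in both blobs is accepted, which matches the note above.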

2009.02.04, SJ

Revised the store concept. You can use the --with-store option
to specify where to store the emails for later retraining and
for spam quarantine.

If you omit this configure option, then clapf will not keep a
copy of the incoming emails. You may specify 'fs' (without
the quotes) to keep messages on the local filesystem, or
you may use 'mysql' if you prefer to keep them in the t_queue
mysql table.

If you are running clapf to filter spam on multiple machines,
the 'mysql' store should be your choice.

Please note that the --enable-outgoing-smtp configure option
is deprecated. You may get the same effect if you don't use
the --with-store option.

(spam.c, session.c, spamdrop.c)

Modified the policy definition to store the store_metadata
variable.

(policy.c, webui/mysql.php, webui/ldap.php, webui/policy.php,
db-mysql.sql, ldap/clapf-policy.schema)

2009.02.03, SJ

Enhanced the spamdrop utility. The '-p' option is deprecated,
spamdrop prints the message by default. If you use the '-d'
option spamdrop will use the delivery_agent specified in clapf.conf
to deliver the given message.

So if you don't want to use the SMTP content-filter daemon, try
the following in master.cf:

spamdrop        unix - n n - - pipe
		user=clapf argv=/usr/local/bin/spamdrop -d
		-f ${sender} -r ${recipient}

Then add "-o content_filter=spamdrop" (without the quotes)
to the smtp service, eg.

smtp      inet  n       -       n       -       -       smtpd
        -o content_filter=spamdrop


Additionally spamdrop can use several antivirus engines.

Please note that spamdrop will print the clapf related extra
headers to the beginning of the header.

(spamdrop.c, cfg.c, cfg.h, example.conf)

2009.02.02, SJ

User support is not mandatory any longer just because you
have specified the --with-tokendb option.
(configure*, session.c)

Removed the sql.c file.
(sql.c)

Log the time of virus checking.
(avir.c)

2009.02.01, SJ

Modified the webui authentication. Now it uses its own
method instead of relying on the webserver. Currently
only the regular htpasswd format password file is supported.
There are plans to add ldap and mysql authentication support, too.
(webui/*)

2009.01.29, SJ

Fixed a bug in the clearhash() function.
(hash.c)

Enhanced the training stuff.
(session.c, users.c)

2009.01.28, SJ

Removed the unnecessary reference to 'UE' from user.c, which
caused a compilation problem if --userdb=sqlite3 was specified.

(users.c)



0.4.0:

2009.01.24, SJ

Fixed a whitelist bug.
(misc.c)

2009.01.22, SJ

Extensive code refactoring in session.c

Added whitelist support for LDAP userdb
(users.c)

Fixed a bug adding users to LDAP database in the webui
(webui/ldap.php)

2009.01.20, SJ

Continued extensive code refactoring.
(session.c, bayes.*, sql.*, mysql.c, sqlite3.c, mydb.*, users.*, policy.*, spamdrop.c, test.c)

Antivirus stuff went to avir.c
(session.c, avir.c)

2009.01.17, SJ

Additional fixes to the webui.
(webui/*)

2009.01.16, SJ

Added some sqlite3 support for the web ui.
(webui/mysql.php, webui/config.in.php, webui/.htdb.php, webui/sqlite3.php)

Renamed the .inc files as .php files, and changed the "<?" PHP mark to "<?php"
(webui/*)
(credits: Mark Feenstra)

2009.01.15, SJ

Fixed a bug around syslog()'ing the sender of the whitelist passed message.
(session.c)

Disabled the AVG supporting code in the configure script.
(configure*)

Added back the merged group type support for mysql
(bayes.c, mysql.c, sqlite3.c)

2009.01.14, SJ

Additional code cleanup. The XForward support is enabled by default.
(session.c, sql.c, sql.h, misc.c, misc.h, hash2.c)

2009.01.13, SJ

Code cleanup.
(bayes.c, cfg.h, cfg.c, example.conf)

2009.01.12, SJ

Fixed a bug, causing unnecessary header lines appearing in the
token list.
(parser.c)

Removed the old token format support.
(Makefile.in, configure*)

Fixed a bug in training with the MySQL/SQLite3 backends.
(bayes.c)

Simplified AV-interfaces.
(av.h, avast.c, avg.c, clamd.c, drweb.c, kav.c, session.c)

Added a Hungarian language virus notification template.
(templates/template.virus-hu)



0.4.0-rc2:

2009.01.08, SJ

Even more serious code refactoring. I modified the SQL statement to
query all the needed tokens at once. It should result in a faster SQL
lookup.
(lots of *.h and *.c)

2009.01.04, SJ

Code refactoring: simplified structures.
(lots of *.h and *.c)


2009.01.02, SJ

Removed the holddir configuration parameter.
(session.c, cfg.c, cfg.h, example.conf)

2008.12.29, SJ

Automatically fill the uid field when adding a new user.
(webui/ldap.php, webui/mysql.php, webui/users.php)



0.4.0-rc1:

2008.12.15, SJ

Fixed a bug in the example ldif file.
(ldap/example.ldif)

2008.12.14, SJ

Web ui fixes.
(webui/mysql.php, webui/ldap.php, webui/users.php)

2008.12.13, SJ

Removed the cgi utilities since the web ui is a nicer implementation.
(*cgi.c, Makefile.in, configure*)

Added training support to the web ui.
(webui/config.in.php, webui/q.php)

The mail administrators may edit the per user whitelists as well
as the global whitelist.
(webui/mysql.php, webui/ldap.php)

Removed some unused configuration parameters from example.conf
(example.conf, cfg.c, cfg.h)


2008.12.12, SJ

Mydb modifications: added a qcache daemon to enable the clapf
daemon to query tokens from the daemon.
Please note that this is an experimental feature, and
subject to changes in the future.
(session.c, xdb.c, mydb.h)

2008.12.10, SJ

Added the anti-virus engine info to the notification.
(templates.c, templates.h, session.c, templates/template.virus)

2008.12.07, SJ

Working user-, alias- and policy handling with both
LDAP and MySQL backend.
(webui/*)

2008.12.04, SJ

Template based virus notifications.
(templates/, templates.c, templates.h)

2008.11.27, SJ

Modified the t_policy table to use tinyint types for storing
boolean (0 or 1) values to use less space.
(policy.sql)

LDAP version of the policy storage is available. Add the
ldap/clapf-policy.schema file to your openldap schemas.
For an example ldif file, see the ldap/policy.ldif.

Please note that I have added an attribute called
policyGroupId to qmail.schema to store the policy_group
parameter for the given user.

(policy.c, policy.h, session.c, ldap/clapf-policy.schema)

Remove the message from quarantine if the user releases
the message.
(webui/q.php)

2008.11.26, SJ

An early version of the policy support is available. Please extend the user
table as follows: ALTER TABLE user ADD COLUMN policy_group int(4) DEFAULT 0;

The default policy group (0) is just the same as the configuration
you have in clapf.conf. Please see policy.sql for the list of what you
can override on a per user or per group basis.

Currently the policy stuff can be in mysql (the same place where your
userdb is). An LDAP version is coming soon. Then the web ui will be
updated to let the administrators set different policies.
(policy.c, policy.h, policy.sql, session.c, configure*)

Added a new clapf.conf variable called "deliver_infected_email".
If enabled (1) - disabled (0) by default - then clapf will deliver
the malware to the user. Enable it with caution if you really need
it! Of course, it can be enabled on a per user basis, so you can
have all the malware and still your users can be safe.
(cfg.c, cfg.h)

(credits: Miklos Toldi)

2008.11.25, SJ

Initial steps for policy support.
(policy.c, policy.h, session.c)

2008.11.24, SJ

Fixed a bug in the parser that falsely detected junk characters,
possibly resulting in false positives.
(parser.c)

You don't have to set debug=1 in a separate config
file for spamtest any longer.
(test.c)

2008.11.22, SJ

Added an uninstall option to the Makefile
(Makefile.in)

(credits: Miklos Toldi)

2008.11.21, SJ

Do not let the quarantine scripts modify the From:
line of the released messages.
(webui/funcs.php)

The webui config is updated according to the configure options.
(Makefile.in, webui/config.in.php)

Fixed a naming issue in example.conf/clapf.conf
(example.conf)

2008.11.19, SJ

Added Hungarian web UI translation.
(webui/lang/hu/messages.inc)

(credits: Miklos Toldi)


2008.11.15, SJ

Renamed the 'queuedir' to 'holddir' and it may be set
to $(localstatedir)/hold by the Makefile.
(Makefile.in, example.conf)

A few language fixes in the webui
(webui/q.php, webui/users.php, webui/lang/en/messages.inc)

(credits: Miklos Toldi)

*****************

0.3.31:

2008.11.11, SJ

Lots of minor fixes and clarifications.
(credits: tompos)

2008.11.04, SJ

Added the web UI, a set of PHP scripts replacing the
CGI utilities.
(webui/)

2008.10.25, SJ

Redesigned the username/uid vs. email address query.
You must specify the --with-userdb option. The ldap
support is not fully functional yet. The clapf.schema
must be fixed.
(users.c, users.h, clapf.h, sql.c, sql.h, configure*, ...)

2008.10.23, SJ

Thinking about the problem I realised that I still need the
t_white_list table. It holds all the user's whitelisted addresses
in one blob per user.
(db-*.sql)

2008.10.20, SJ

Modified whitelist support. I changed the 'user' table to hold
a new column (whitelist) with type blob. Now the whitelist column
stores all the whitelisted email addresses for the given uid one
email address in a line, ie. separated by a new line character.
The t_white_list table is no longer supported or used.
(sql.c, sql.h, misc.c, misc.h, config.h, db-new.sql)

2008.10.13, SJ

If you are running clapf in daemon mode, you can forward messages to
be trained to ham@yourdomain and spam@yourdomain too, not just to the
ordinary user+ham@ and user+spam@ addresses.
(session.c)

2008.09.29, SJ

Implemented a primitive token reassembly logic to get 'Cialis'
from 'C i a l i s'. To use it, set the locale variable in clapf.conf.
(misc.c, misc.h, cfg.c, cfg.h, parser.c)
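
The idea can be illustrated with a small regular expression. This is not
the logic in misc.c, just a sketch: reassemble() is a hypothetical name, it
only handles ASCII letters (clapf uses the locale setting instead), and it
requires at least three spaced letters to avoid gluing ordinary words.

```python
import re

# Three or more single letters separated by single spaces, eg. "C i a l i s".
SPACED_WORD = re.compile(r"\b(?:[A-Za-z] ){2,}[A-Za-z]\b")


def reassemble(text):
    """Join letters that were spaced out to dodge the tokenizer."""
    return SPACED_WORD.sub(lambda m: m.group(0).replace(" ", ""), text)
```

For example reassemble("buy C i a l i s now") gives "buy Cialis now".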

2008.09.17, SJ

Fixed a bug preventing the decoding of base64 encoded textual parts.
(parser.c)

Modified the decoder functions.
(decoder.c)

Fixed a bug using the rbl_list_check() function.
(bayes.c)

0.3.31-rc3:

2008.09.08, SJ

Fixed a bug in the parser preventing it from detecting junk characters.
(misc.c, misc.h, parser.c)

2008.08.27, SJ

More fixes from Chris.
(example.conf, util/db_init_*.in)
(credits: Christoph Wilke)

2008.08.26, SJ

Various bug fixes by Chris.
(spamdrop.c, lang.c)
(credits: Christoph Wilke)

2008.08.24, SJ

Updated the libclamav related stuff to match the latest 0.93.3 release of clamav.
(clapf.c, session.c)

2008.08.22, SJ

Removed the -DDEBUG flag from the Makefile, and added
the "debug" configure option. If you set it to "1", it will print
a lot of debug info to stderr. Turn it on only when using the spamtest
utility; all the other utilities (clapf, spamdrop, ...) need it
turned off (set to "0").

(cfg.c, cfg.h, example.c, bayes.c, score.c, chi.c, parser.c, rbl.c, black.c, Makefile.in)
(credits: Christoph Wilke)

2008.08.20, SJ

Fixed the sqlite3 part of the is_sender_on_whitelist() function.
(sql.c)

2008.08.18, SJ

Modified the parser to use a delimiter character set table. You may
decide whether you want "junk" characters to be transformed to 'j'
by setting the replace_junk_characters configuration variable.
(parser.c, misc.c, misc.h, decoder.c, trans.h, cfg.c, cfg.h, example.conf)

2008.08.01, SJ

Add a 'TUM' header entry in case of blackhole training.
(spamdrop.c)

2008.07.30, SJ

Do not TUM train if it's a blackhole message.
(spamdrop.c)

2008.07.25, SJ

Added missing mysql_close() to spamdrop.c
(spamdrop.c)

Do not bother mysql with the uid field if it is 0 in update_mysql_tokens()
(mysql.c)

2008.07.22, SJ

Added a new function to grab long FQDNs in the Received: lines
that would otherwise be lost.
(misc.c, misc.h, parser.c)

2008.07.21, SJ

Fixed a bug preventing spamdrop from printing its extra header info
in case of a message without subject line and without body.
(spamdrop.c)

2008.07.17, SJ

Unified spam quarantine cgi menu. QP decoder fix.
(*cgi.c, cfg.c, cfg.h, messages.h*, Makefile.in, decoder.c)

2008.07.14, SJ

Modified the qp_decode() function to replace a non-printable character
only if it's not in the translation table.
(decoder.c)

2008.07.12, SJ

The qp_decode() function replaces every non-printable character with
JUNK_REPLACEMENT_CHAR.
(decoder.c)

2008.06.30, SJ

Removed my version of case insensitive strstr() and replaced with its
GNU version, strcasestr(). Under Linux, it's necessary to include the
-D_GNU_SOURCE macro.
(misc.c, misc.h, session.c, parser.c, spamdrop.c, clamd.c, mime.c)

Extended the whitelist feature. It lets you define "wildcards",
such as @google.com, xxx@google.com, .google.com, etc.
Unfortunately we have to do a full table scan.
(sql.c)
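
One possible reading of these wildcards, sketched in Python. The exact
matching rules in sql.c are not spelled out here, so treating entries that
start with '@' or '.' as case-insensitive suffixes and everything else as
an exact address is an assumption; is_whitelisted() is a hypothetical name.
It also shows why every entry has to be visited (the full table scan).

```python
def is_whitelisted(sender, entries):
    """Check the sender against every whitelist entry, one by one."""
    sender = sender.lower()
    for entry in entries:
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry[0] in "@.":
            # "@google.com" or ".google.com": match on the end of the address.
            if sender.endswith(entry):
                return True
        elif sender == entry:
            # "xxx@google.com": the full address must match.
            return True
    return False


ENTRIES = ["@google.com", ".example.org", "joe@aaa.fu"]
```

Because no index can answer "which entry is a suffix of this sender", each
lookup walks the whole list.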


0.3.31-rc2:

2008.06.28, SJ

Enhanced the iterative training by stopping further training if the message
is correctly identified as ham (or spam).

The default character encoding (if it's not specified) is now set to iso-8859-2.
(parser.c)

2008.06.25, SJ

Fixed a bug in the clapf daemon (the spamdrop utility is not
affected) which prevented training on the forwarded emails.
(session.c)
(credits: efpe)

2008.06.13, SJ

If the message looks like a legitimate bounce, then set its spamicity
to 0.5.
(session.c, spamdrop.c)

2008.06.11, SJ

After a mime attachment, reset some state variables when we encounter
an empty line, ie. the attachment is over.
(parser.c)

2008.06.05, SJ

Set sdata.uid=0 if group_type=0.
(spamdrop.c)

2008.06.02, SJ

Simplified bayes_file() definition.
(bayes.c, bayes.h, session.c, spamdrop.c, test.c, tune.c)

2008.05.26, SJ

Modified the spamdrop utility to let it handle blackhole training
requests even if no blackhole support was enabled at compile time.
Thus --enable-blackhole now only enables putting/checking IP-addresses
in the blackhole directory.

(spamdrop.c)

Modified the message parser to create a chained list from URLs.
We can use this instead of the urlhash[] hash array.

(parser.c, parser.h, list.c, list.h, bayes.c, spamdrop.c, session.c, test.c, Makefile.in, configure*)

2008.05.18, SJ

Simplified DATA-state related initialisation stuff.

(session.c, parser.c, parser.h, bayes.c)

Assign 0.99 as spamicity if we cannot see our signo in a bounce.
This should eliminate auto-learn. Please note that some appliances
may remove our signo from a bounce email.

(session.c, spamdrop.c)

2008.05.14, SJ

Searchable spam quarantine. You can search for sender name/address and
subject.
(spamcgi.c, cgi.c, cgi.h)

2008.05.12, SJ

Fixed the is_recipient_in_our_domains() function to include
the '@' sign.
(misc.c)

2008.05.11, SJ

Return the number of ham/spam emails while using mydb storage.
(parser.h, bayes.c, bayes.h, spamdrop.c, mydb.c, mydb.h, mydb_stat.c)

Mark fake bounces (backscatter) as spam if our signature is enabled.
(spamdrop.c)

0.3.31-rc1:

2008.05.07, SJ

Now both the clapf daemon and the spamdrop utility are able to detect
backscatter spam emails. This feature is still experimental.
(spamdrop.c, parser.h, parser.c, bayes.h, bayes.c)

2008.05.05, SJ

Added the mydomains parameter. The clapf daemon will not insert the
anti backscatter header (our_signo in clapf.conf) if the recipient
is in mydomains.

So to use the anti-backscatter help feature, set the following in clapf.conf:

mydomains=@yourdomain1,@yourdomain2,...
our_signo=X-Anti-Backscatter: abcde123

(smtp.c, misc.c, misc.h, cfg.c, cfg.h)
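
How the two settings work together can be sketched as follows. The helper
names are hypothetical and the suffix comparison is an assumption; the
values are the placeholders from the example above.

```python
MYDOMAINS = ["@yourdomain1", "@yourdomain2"]
OUR_SIGNO = "X-Anti-Backscatter: abcde123"


def recipient_in_mydomains(recipient, mydomains=MYDOMAINS):
    """True if the recipient address ends with one of our '@domain' entries."""
    recipient = recipient.lower()
    return any(recipient.endswith(domain.lower()) for domain in mydomains)


def needs_signo(recipient):
    # Only mail leaving for foreign domains gets the extra header.
    return not recipient_in_mydomains(recipient)


def bounce_carries_signo(bounce_text, our_signo=OUR_SIGNO):
    # A bounce of a message we really sent should quote our header.
    return our_signo in bounce_text
```

A bounce that does not carry the signature was most likely not caused by a
message sent through this server, which is what the backscatter detection
in the entries above relies on.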

2008.05.01, SJ

Modified the parser routine to update the state variable using the "->" operator.
(parser.c, parser.h, bayes.c, parsembox.c)

2008.04.29, SJ

Incorporated the spamsum project (http://www.samba.org/ftp/unpacked/junkcode/spamsum/)
to help the decision if the statistical module is uncertain. The status of the plugin
is experimental, but working.
You may use the $(libexecdir)/clapf/spamsum utility to create the spam sums, eg.
$(libexecdir)/clapf/spamsum messagefile1 messagefile2 ... messagefileN

To use the signatures, set the sig_db parameter in clapf.conf

Please note that the spamsum project is the work of Andrew Tridgell.
(spamdrop.c, contrib/spamsum/*)

2008.04.28, SJ

Changed the move_message_to_quarantine() function to use link()
instead of rename(), thus we get no more "failed to remove" syslog
messages.
(misc.c)
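
A sketch of the difference, using Python's os.link(). The entry does not
say where the message came from; one plausible reading (an assumption) is
that a later cleanup step still unlinks the original path, which no longer
exists after rename(). quarantine_by_link() is a hypothetical name.

```python
import os


def quarantine_by_link(src, dst):
    """Give the message a second name in the quarantine.

    Unlike os.rename(), the original path keeps existing, so a later
    os.unlink(src) in the cleanup code succeeds instead of failing.
    """
    os.link(src, dst)
```

Both paths must be on the same filesystem, as hard links cannot cross
filesystem boundaries.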

2008.04.25, SJ

Added an option to place an extra header field to help us fight
backscatter spam emails. Set the our_signo parameter in clapf.conf
to enable this feature.
(cfg.c, cfg.h, smtp.c, example.conf)

Fixed a bug around whitelisting
(session.c)
(credits: Zoltan Fried)

2008.04.23, SJ

Modified the mydb compression tool to leave alone the token database
if its size is under a certain limit (see MYDB_MIN_SIZE in config.h,
by default 1 MB).
(mydb_compress.c, config.h)

2008.04.17, SJ

Fixed the spamdrop_helper feature.
(spamdrop.c, util/spamdrop_helper.in)

Increased the size of "s" buffer in base64_decode().
(decoder.c)

2008.04.14, SJ

Add a default clapf header to the email if its size is above the
max_message_size limit.
(session.c)

2008.04.13, SJ

Pid file fix in clapf.c, Purge script fixes around the t_queue table.

(credits: Gabor Adorjani)

2008.04.12, SJ

Added a spamdrop helper program. This enables spamdrop to
create the per user data- and queue directories
if necessary. To enable this feature, use the --enable-spamdrop-helper
configure option.
If using this feature, make sure that the user running
spamdrop actually has proper permissions to create those
directories.
You may customise the util/spamdrop_helper.in script, eg.
to copy a default token database for the new user, etc.
It's important that this script should produce no (error)
messages!

(credits: Peter)

2008.04.10, SJ

Added info and scripts about the language detection
to the contrib directory.
(lang.h, contrib/i18n/)

2008.03.18, SJ

Added an experimental language detection.
(lang.c, lang.h, spamdrop.c, test.c, clapf.h)

2008.03.12, SJ

Missed the mysql part of the training.
(bayes.c, sql.c)

2008.03.10, SJ

The username can be specified on the command line with the -u <username> option.
(spamdrop.c)

2008.03.09, SJ

The SQLite3 database file can be given by the sqlite3 parameter in clapf.conf,
or I will try ${localstatedir}/lib/clapf/data/x/xx/clapf.sdb
(spamdrop.c)


0.3.30:

2008.03.07, SJ

Some clarifications about the training in the docs.
(credits: Pete)

iso-8859-2 encoding is the default format.
(parser.c)

2008.03.06, SJ

Added a new configure option (--enable-static-build) to support
easy static builds.

Incorporated the ${prefix}/bin path to the training scripts.

A few documentation enhancements.

(credits: Pete)

2008.03.05, SJ

Fixed a buffer overflow bug in rbl_list_check() function.
(rbl.c)

2008.03.03, SJ

Removed the "unnecessary" mysql check at configure time.
(configure.*)

2008.03.01, SJ

Fixed a bug during the exec time of spam check if there's
no actual spam check.
(spamdrop.c)

2008.02.26, SJ

Mydb fix.
(mydb.c)

Spamdrop now honors the TUM training while training on the
command line.
(parser.h, parser.c, bayes.c, spamdrop.c)

Installing the man pages went to another section.
(Makefile.in)

2008.02.25, SJ

Modified compilation. You can use both the libclapf.a and libclapf.so* files.
(Makefile.in, configure*, spamdrop.c)

2008.02.23, SJ

Removed a few header files, and consolidated the anti-virus definitions.

2008.02.19, SJ

Fixed a training bug. The traincgi utility warned about "No message found"
message when training from the web page.
(traincgi.c)

Fixed a signedness bug in mydb_compress.c
(mydb_compress.c)

(credits: Kevin Lewis)

2008.02.15, SJ

Added a "User List" menu to the administrator in the spam quarantine.
(spamcgi.c, messages.h, messages.h.hu)
(credits: Kevin Lewis)

2008.02.14, SJ

Removed the passmail stuff and fixed a possible spamcgi bug.
(misc.c, misc.h, spamcgi.c, Makefile.in, configure*)

Fixed a bug that caused a message to be delivered to the admin user and
not to the real recipient.
(user.c, traincgi.c, spamcgi.c)

2008.02.13, SJ

Added an administrator user to the spam quarantine who can access
everybody's spam emails. To use this feature, set the admin_user
parameter in clapf.conf.
(example.conf, cfg.c, cfg.h, cgi.c, cgi.h, spamcgi.c)

Removed the threaded version of clapf, it has some problems...

2008.02.08, SJ

The threaded version of clapf supports only mysql database currently.
(clapfthread.c)

2008.02.07, SJ

Added a threaded version of clapf.
(clapfthread.c, session.c)

The util/check_clapf.sh script honors the CLAPF_USER parameter in Makefile.
(Makefile.in, util/check_clapf.sh.in)

2008.02.04, SJ

Removed the rude_surbl directive. One hit is enough to trigger
the spammy result.
(cfg.h, cfg.c, score.c, example.conf)

If you want to present all the IP-addresses from the "Received: from"
lines to blacklists, then use the --enable-all-received configure
option. The default is to check only the host passing the email to us.
(parser.c, configure.in, configure)

2008.01.28, SJ

Added white list support. If the sender is on the list, then
no spam check is performed (except blackhole) and spamicity=0.5 is returned.

If you want to use this feature use --enable-whitelist and please
update your SQL tables with the following sql statements:

create table if not exists t_white_list (
        email char(64) primary key not null,
        uid int default 0
);

create index t_white_list_idx on t_white_list (email);

(session.c, spamdrop.c, sql.c, sql.h, configure*)

2008.01.27, SJ

A strange problem may occur under a specific condition: clapf may send
two period(.) commands while passing the message back to postfix, so
postfix replies with a 250...500 message. Anyway the message gets
delivered. I fixed the problem.
(smtp.c, spamdrop.c)

2008.01.23, SJ

This is an extensive modification. I redesigned the spam handling blocks
and made several changes. If you are using the mysql token databases
update your table structure by issuing the following sql commands:

alter table t_token add column timestamp int default 0;
update t_token set timestamp=UNIX_TIMESTAMP();


Redesigned the training process as follows: removed the train utility,
and you can train with the spamdrop command, eg.

to train with a spam message:
spamdrop -c /usr/local/etc/clapf.conf -S < rfc822_format_message

or to train with a ham message:
spamdrop -c /usr/local/etc/clapf.conf -H < rfc822_format_message

Clapf automatically learns an email if:
- initial_1000_learning=1 is set in clapf.conf and we have fewer than
1000 ham or spam
- training_mode=1 is set in clapf.conf and the message is surely spam,
but its spaminess is under 0.99
- training_mode=1 is set in clapf.conf and the message is surely ham,
but its spaminess is above 0.1
- the message came to the blackhole(aka. minefield address), but its
spaminess is under 0.99
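
The four rules above, written out as a sketch. This is not the code in
session.c or spamdrop.c; should_autolearn() and its parameters are
hypothetical. Two readings are assumptions: "fewer than 1000 ham or spam"
is taken to mean that either counter is below 1000, and "surely" spam/ham
is taken to be a verdict known from outside the statistical module.

```python
def should_autolearn(verdict, spaminess, nham, nspam,
                     initial_1000_learning=False, training_mode=False,
                     blackhole=False):
    """verdict is "spam", "ham" or None (nothing known for sure)."""
    if initial_1000_learning and (nham < 1000 or nspam < 1000):
        return True
    if training_mode and verdict == "spam" and spaminess < 0.99:
        return True
    if training_mode and verdict == "ham" and spaminess > 0.1:
        return True
    if blackhole and spaminess < 0.99:
        return True
    return False
```

In every case the point is the same: learn only when the current token
database does not yet produce a confident score for the message.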
 
Now all the token backends store (and update) the timestamp of
the given token. This gives you the ability to remove any token
which is no longer used to free storage space.
You may use the util/purge-mysql.sql SQL script if using MySQL,
or the util/purge-sqlite3.sh with SQLite3.

These scripts remove any token if it's:
- not used in the last 90 days
- not used in the last 60 days and 2*nham + nspam < 5
- not used in the last 15 days and nham+nspam = 1

This helps to keep the database size at the required minimum
while preserving accuracy.
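
The purge rules can be expressed as a single predicate. This is only an
illustration of the three conditions listed above; the shipped scripts do
the work in SQL, and should_purge() is a hypothetical name. "Not used in
the last N days" is read here as more than N days since the last use.

```python
def should_purge(days_unused, nham, nspam):
    """Decide whether a token can be removed from the database."""
    if days_unused > 90:
        return True
    if days_unused > 60 and 2 * nham + nspam < 5:
        return True
    if days_unused > 15 and nham + nspam == 1:
        return True
    return False
```

The less evidence a token carries, the sooner it is dropped.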

2008.01.20, SJ

Modified the SQLite3 stuff to handle timestamps, suitable for removing
obsolete tokens later. The SQLite3 database allows only uid=0 tokens.
So if you need multiple user ids, you need a separate SQLite3 database per
user or switch to MySQL. Please recreate your token database from scratch
to use the new features.
(configure.in, configure, db-sqlite3.sql, example.conf, prepare-sql.c, sql.c, sql.h, sqlite3.c, train.c, bayes.c)

Modified the APHash() function to cut in half the available numeric
address space if using SQLite3. Clapf stores the token values as
an integer instead of text. This also means that you have to recreate
your token database.
(config.h, db-sqlite3.sql, misc.c, bayes.c, prepare-sql.c, sqlite3.c, qs.c)

2008.01.13, SJ

Spamdrop does not want to bother with the queue directory if
we don't want to store the message for later retraining.
(spamdrop.c)

2008.01.11, SJ

Messages in the spam quarantine can be reached by clicking on their
serial number on the left side.
(spamcgi.c)

2008.01.10, SJ

To enable any rbl or url bl, use --enable-rbl instead of --enable-surbl.
(configure, configure.in)

2008.01.08, SJ

The spaminess_of_text_and_base64 feature is disabled by default. Enable it
if you want to mark base64 encoded _textual_ messages as spam, by setting
spaminess_of_text_and_base64=0.9996 in clapf.conf
(cfg.c, example.conf)

Well, the inverse chi square algorithm should use not the most interesting
top15 tokens, but all the tokens above a certain significance. This can be
adjusted by the exclusion_radius. If you set it to 0.375, then we will include
all the tokens with spamicity 0...0.125 or 0.875...1.000.
So please update your exclusion_radius parameter, and set it to somewhere around 0.375.
(hash.c, hash.h, bayes.c, cfg.c, config.h, example.conf)
(credits: Gary Robinson)
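
The effect of exclusion_radius can be illustrated with a small filter.
This is a Python sketch and not the code in hash.c: the function name is
made up, and treating the boundary values (0.125 and 0.875) as included
is an assumption based on the ranges quoted above.

```python
def significant_tokens(spamicities, exclusion_radius=0.375):
    """Keep tokens whose spamicity is at least exclusion_radius away from 0.5."""
    return {token: p for token, p in spamicities.items()
            if abs(p - 0.5) >= exclusion_radius}

tokens = {"viagra": 0.99, "hello": 0.5, "meeting": 0.05, "edge": 0.125, "meh": 0.2}
kept = significant_tokens(tokens)
```

With the radius at 0.375 only "viagra", "meeting" and "edge" remain in
kept, and the neutral tokens no longer dilute the result.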

2008.01.03, SJ

Commented out the cdb making stuff in db_init_mysql.sh
(util/db_init_mysql.sh.in)
(credits: Zoltan Fried)

0.3.30-rc2:

2007.12.27, SJ

Added a new variable store_only_spam which enables you to
store only the spam messages in the queue directories.
To enable this, set store_only_spam=1 as well as store_metadata=1
This is available in daemon mode not in spamdrop.
(cfg.c, cfg.h, session.c, example.conf)

2007.12.26, SJ

SQLite3 pragma can be set by configuration file
(cfg.c, cfg.h, session.c, spamdrop.c, test.c, example.conf)

2007.12.22, SJ

Some (my)sql related fixes.
(bayes.c, session.c, spamdrop.c, Makefile.in)

Blackhole improvements
(session.c, spamdrop.c, example.conf)

2007.12.16, SJ

Added an SQLite3 fix.
(sqlite3.c)

Added a configure option suitable for outgoing smtp servers.
(session.c, misc.c, misc.h, configure.*)

2007.12.13, SJ

Modified the make_rnd_string() function to include the timestamp in
the first 4 bytes.
(misc.c)
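
The idea of leading a random id with the time can be sketched like this.
It is a Python illustration only: the names, the big-endian byte order
and the total length are assumptions, and the real make_rnd_string() in
misc.c may encode the value differently.

```python
import os
import struct
import time

def make_rnd_bytes(length=16, now=None):
    """Return `length` bytes: a 4-byte big-endian Unix timestamp, then random bytes."""
    if length < 4:
        raise ValueError("length must be at least 4")
    ts = int(time.time()) if now is None else now
    return struct.pack(">I", ts & 0xFFFFFFFF) + os.urandom(length - 4)

def timestamp_of(raw):
    """Recover the Unix timestamp stored in the first 4 bytes."""
    return struct.unpack(">I", raw[:4])[0]
```

Ids built this way sort roughly by creation time, and the creation time
can be read back from the id itself.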

Added SQLite3 pragma. See "#define SQLITE3_PRAGMA" in config.h
(session.c, spamdrop.c, test.c)

2007.12.06, SJ

Add a spammy token if the client is unknown (ie. without a valid PTR record).
XFORWARD support must be activated for this to work.
(session.c, bayes.c, parser.h, parser.c)

2007.12.01, SJ

Added a new configure option (--with-clapf-user) to specify the user who runs
the clapf daemon. If this user (by default "clapf") does not exist, the configure
script will exit.
(configure.in, configure)

Changed the directories in the example configuration file to more reasonable
defaults.
(Makefile.in, example.conf)
(credits: Gabor Garami)

2007.11.26, SJ

Minor fix in mydb creation.
(mydb.c)

2007.11.23, SJ

Decrement the opposite counter in case of a TUM train.
(mydb.c)

2007.11.07, SJ

Added an option (mysql_connect_timeout) to set the MySQL connection timeout.
(cfg.c, cfg.h, session.c, spamdrop.c, example.conf)

2007.11.06, SJ

Added XFORWARD extension support.
(parser.h, smtpcodes.h, session.c)

Modified the spamicity calculation, to give token pairs a greater strength
(mydb.c, misc.h, misc.c, sql.c)

2007.11.05, SJ

Added an fsync() syscall to check successful write to disk of queue/temp file
(session.c, spamdrop.c, smtpcodes.h)
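
The pattern is to treat the queue file as stored only after fsync()
reports success. A minimal Python sketch of that check follows; the
function name and the file mode are assumptions, and clapf itself does
this in C.

```python
import os

def write_durably(path, data):
    """Write data to path; return True only if the write and fsync() both succeed."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    except OSError:
        return False
    try:
        view = memoryview(data)
        while view:                     # os.write() may write fewer bytes than asked
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                    # force the data out to the disk
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

A caller would accept the message only when this returns True, and
report a failure to the sending side otherwise.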

2007.11.04, SJ

Added a blackhole/minefield time check
(black.c)

2007.10.24, SJ

Fixed a bug while retraining blackholed messages.
(bayes.c)

2007.10.23, SJ

Added an option to toggle the phishing check in libclamav.
(cfg.c, cfg.h, example.conf, session.c, clapf.c)

2007.10.19, SJ

Changed the t_token and t_queue table types to InnoDB.
If you want to stick to MyISAM, then remove the Engine
specification.
(db-new.sql, db-old.sql)

2007.10.18, SJ

Fixed a bug in spamdrop preventing forwarded messages from being trained.
(spamdrop.c)

Fixed a bug preventing missed blackhole messages from being trained.
(parser.h, session.c, spamdrop.c, bayes.c)

2007.10.17, SJ

Changed qcache behaviour to be a FIFO cache.
(qcache.c)
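
A FIFO cache evicts the entry that was inserted first, no matter how
often it has been read since. The Python sketch below shows the policy
only; the class and method names are hypothetical and unrelated to the
actual qcache.c data structures.

```python
from collections import OrderedDict

class FifoCache:
    """Fixed-size cache that evicts the oldest inserted entry first."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        # A lookup does not change the eviction order (this is not an LRU).
        return self._items.get(key, default)

    def put(self, key, value):
        if key not in self._items and len(self._items) >= self.capacity:
            self._items.popitem(last=False)  # drop the oldest insertion
        self._items[key] = value

cache = FifoCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # reading "a" does not protect it
cache.put("c", 3)   # evicts "a", the oldest entry
```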
 
2007.10.15, SJ

Polishing log levels. 1: normal, 3: info, 5: debug
(session.c, bayes.c, config.h)

2007.10.11, SJ

Treat iso-8859-1 messages as iso-8859-2.
(parser.c)

Fixed a username by using mydb.
(spamdrop.c)

0.3.30-rc1:

2007.10.10, SJ

Moved TUM training out of the bayes_file() function.

Training by forwarding email is possible without the maildrop utility.
(bayes.c, bayes.h, session.c, spamdrop.c)

Finally solved the virtual users problem.

Blackhole (aka. minefield) is handled by spamdrop.
(spamdrop.c)

2007.10.09, SJ

Removed the CDB stuff. This database is unsupported from now on.
(bayes.c, session.c, bayes.h)

Added a new database type, called mydb
(bayes.c, mydb.c, mydb.h)

2007.10.05, SJ

Revised training and message storage.
(sql.h, sql.c, spam-fwd-train.c, session.c, doc/html/install.html, doc/html/virtual.html)

Integrated retraining feature into the clapf daemon.
(session.c, bayes.c, sql.c)

2007.10.04, SJ

Fixed a bug if the user id is not found in the user table.
(bayes.c)
(credits: Salman Husni)

2007.10.01, SJ

Added an option to store temporary files under ~/.clapf otherwise
use ${prefix}/var/clapf/u/username. For now the homedir version is
the default.
(configure*, clapf-config.h.in, config.h, spamdrop.c)

Fixed a bug storing messages in sqlite database.
(sql.c)

Reverting the storage of messages. Instead of putting them to the
SQL table, they are stored in a per user directory. Note: the
virtual user support must be polished and verified.
(spamdrop.c, spam-fwd-train.c)

2007.09.29, SJ

Fixed a bug in spam-train-fwd.c to prevent training inside bayes_file(),
and disable (su)rbl checks in order not to interfere.
(spam-fwd-train.c)

2007.09.25, SJ

Now we use a simple definition of a message start in the mbox file:
From_ (aka FromSPACE).
(parsembox.c, splitmbox.c)
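
Splitting on this simple definition looks roughly like the Python sketch
below. It is not the splitmbox.c code and the function name is an
assumption. Note that with such a simple rule an unescaped body line
starting with "From " also opens a new message.

```python
def split_mbox(lines):
    """Yield messages as lists of lines; a line starting with 'From ' opens a new one."""
    current = None
    for line in lines:
        if line.startswith("From "):
            if current is not None:
                yield current
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        yield current

mbox = [
    "From alice@example.com Tue Sep 25 10:00:00 2007\n",
    "From: alice@example.com\n",
    "Subject: one\n",
    "\n",
    "body\n",
    "From bob@example.com Tue Sep 25 11:00:00 2007\n",
    "Subject: two\n",
]
messages = list(split_mbox(mbox))
```

The "From:" header line does not match because the separator requires
a space right after "From".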

Added an extra field to determine whether a message in t_queue is
spam or not.

2007.09.24, SJ

Do not perform RBL lookup + single tokens checking if the message has
no subject tokens unless it has <10 body tokens.
(bayes.c)

Adding some new documentation.
(doc/html/group.html, doc/html/install.html)

2007.09.21, SJ

Fixed training scripts.
(clapf_admin.c, db-*sql, util/db_init*, doc/html/training.html)

2007.09.20, SJ

Added an autolearning feature. Set initial_1000_learning=1 in clapf.conf
if you do not want to perform an initial training for any reason.
(config.h, cfg.c, cfg.h, bayes.c, example.conf)

Fixed a bug preventing clapf from learning unknown tokens.
(sql.c)

2007.09.18, SJ

Removed the unnecessary reference to the openssl/* header file.
(misc.h)

Issue a NO_SUBJECT spammy token if we have no valuable subject token
(bayes.c)

Fixing HTML documentation.
(doc/html/*)

Moved the TUM training stuff to the bayes_file() function.
(bayes.c)

2007.09.16, SJ

Move the spaminess_of_text_and_base64 checking stuff to the 'unsure' zone.
(bayes.c)

2007.09.15, SJ

If we have no valid subject tokens, let's ask the single tokens
which means we do some rbl tests as well. This may help against
some micro spams.
(parser.c, parser.h, bayes.c)

2007.09.14, SJ

Prevent several TUM loops with iterative training.
(spam-fwd-train.c)

Modified process_syslog.pl to recognise the spamdrop
stuff in the mail log.
(stat/process_syslog.pl)

2007.09.11, SJ

If the message is not good enough and found on a blacklist, mark it as spam.
You may disable this feature by commenting the 'rbl_domain' option out.
(bayes.c)

We can handle more than 1 blacklist, just as with the SURBL domains.
(rbl.h, rbl.c, bayes.c)

2007.09.10, SJ

Modified the training method: instead of forwarding to spam+user@domain,
forward it to user+spam@domain. And similarly: instead of forwarding to
ham+user@domain, forward it to user+ham@domain.
This gives you the possibility to use per-user SQLite3 token databases.

2007.09.07, SJ

Add a spammy token in case of a RTF attachment.
(bayes.c)

2007.09.06, SJ

Set spamicity to 0.5 for tokens seen not more than twice.
(sql.c)
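
In other words a rarely seen token is treated as neutral. The Python
sketch below shows the idea with a plain frequency ratio as the
spamicity estimate. That formula, the function name and the arguments
are assumptions made for illustration; the real calculation in sql.c may
differ.

```python
def token_spamicity(nham, nspam, total_ham, total_spam):
    """Return 0.5 for tokens seen at most twice, else a frequency based estimate."""
    if nham + nspam <= 2:           # seen not more than twice: stay neutral
        return 0.5
    ham_freq = nham / total_ham if total_ham else 0.0
    spam_freq = nspam / total_spam if total_spam else 0.0
    if ham_freq + spam_freq == 0.0:
        return 0.5
    return spam_freq / (ham_freq + spam_freq)
```

A token seen twice in spam and never in ham thus stays at 0.5 instead of
jumping to 1.0 on very thin evidence.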

2007.09.05, SJ

Minor fixes.

2007.09.03, SJ

spamstat.pl can handle spamdrop lines as well.
(stat/spamstat.pl)

2007.09.02, SJ

UTF-8 fix.
(parser.c, parser.h)

2007.08.28, SJ

The max_ham_spamicity variable obsoletes the former max_junk_spamicity
and max_embed_image_spamicity variables. Its purpose is to define a
spam probability limit to ham messages. If a certain message has a
greater probability than this value, some rude anti-spam measures
take place. It defaults to max_ham_spamicity=0.41
(cfg.h, cfg.c, spamdrop.c, spam-fwd-train.c, bayes.c)

Improved parsing.
(bayes.c, parser.h, parser.c)

2007.08.26, SJ

Even better configure and Makefile
(configure*, Makefile.in)
(credits: Elso Andras)

CDB related fixes
(bayes.c, bayes.h, session.c)
(credits: Elso Andras)

2007.08.24, SJ

Added pid file support: clapf writes its pid to a file.
(clapf.c, cfg.h, cfg.c)

2007.08.22, SJ

Better configure script and Makefile
(configure*, Makefile.in)
(credits: Elso Andras)

Fixed a makefile bug
(configure*, Makefile.in)

Consistent hostid
(smtpcodes.h, smtp.c, session.c)
(credits: Elso Andras)

Fixed a bug in spamcgi
(spamcgi.c)
(credits: Elso Andras)

2007.08.20, SJ

Added a statistic logging to spamdrop ("<uid> got HAM|SPAM")
(misc.c, session.c, spamdrop.c)

2007.08.10, SJ

Removed the --enable-mysql configure option and the HAVE_MYSQL_TOKEN_DATABASE
definition. Now I use only the #ifdef HAVE_MYSQL option.
(configure.in, configure, session.c, bayes.c, spamdrop.c, train.c, traincgi.c, spam-fwd-train.c)

Added LMTP support. From now clapf will receive messages via LMTP from postfix.
Replace the following in /etc/postfix/main.cf:

content_filter = smtp:[127.0.0.1]:10025

with

content_filter = lmtp:[127.0.0.1]:10025

Please note that I have not implemented the ENHANCEDSTATUSCODES and the
PIPELINING extensions. It seems that postfix is fine without them.
(session.c)

Parsing the message outside the bayes_file() function.
(bayes.c, session.c, spamdrop.c, test.c)

Removed the --disable-antispam option. The configure script will exit
if you specify neither an antivirus package nor a tokendb type.
(configure.in, configure)


2007.08.08, SJ

Fixed a bug in the 'train' utility to update the t_misc table
while using an SQLite database.
(train.c)

Ongoing SQLite3 integration
(sql.c, session.c, spamdrop.c, train.c)

2007.08.07, SJ

Fixing uid handling in bayes.c
(bayes.c)

2007.08.04, SJ

Continued to improve spamdrop integration. You may use a single
/etc/maildroprc file to let all users use spamdrop and set up
the two training email addresses (see maildroprc):
(spamdrop.c, sql.c, sql.h, bayes.c, mysql.c)

2007.08.02, SJ

Store metadata even if uid=0.
(mysql.c)

2007.07.22, SJ

Fixed an issue around the Qcache updating while training.
(mysql.c)

2007.07.20, SJ

SQL table names are defined in config.h.
(config.h, example.conf)

Added sqlite3 support for Qcache and spam evaluation.

2007.07.17, SJ

Added the silently_discard_infected_email option that allows you
to decide whether to silently discard the infected message (1)
or let a bounce get back to the sender (0). (credits: Toms Trankalis)
(cfg.c, cfg.h, example.conf, session.c)

Removed the attachment dumping and analysing code
(parser.c, bayes.c, example.conf)

2007.07.15, SJ

Removed the "too much spam in top15" feature.
(hash.c)

Slightly modified spaminess calculation
(hash.c, hash.h, bayes.c)

*********************************************

0.3.29:

2007.07.16, SJ

Fixed a compilation issue on FreeBSD (credits: Toms Trankalis)
(cgi.c)

Fixed a configure issue.
(configure.*, Makefile.in)

2007.07.11, SJ

Added iterative training to the email forwarder trainer. It trains the
email until it classifies it correctly (but max. 5 times, see the
MAX_ITERATIVE_TRAIN_LOOPS parameter in config.h)
(misc.c, misc.h, spam-fwd-train.c, config.h)
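
The loop can be pictured as follows. This is a Python sketch, not the
spam-fwd-train.c code: classify and train are hypothetical callbacks,
and skipping the training when the message is already classified
correctly is an assumption.

```python
MAX_ITERATIVE_TRAIN_LOOPS = 5

def train_until_correct(message, is_spam, classify, train,
                        max_loops=MAX_ITERATIVE_TRAIN_LOOPS):
    """Train until classify() agrees with is_spam; return the number of rounds used."""
    rounds = 0
    while rounds < max_loops and classify(message) != is_spam:
        train(message, is_spam)
        rounds += 1
    return rounds

# Toy classifier: it calls a message spam after three spam trainings.
seen = {"spam": 0}

def toy_classify(message):
    return seen["spam"] >= 3

def toy_train(message, is_spam):
    if is_spam:
        seen["spam"] += 1

rounds = train_until_correct("msg", True, toy_classify, toy_train)
```

The upper limit keeps a message that the classifier never learns from
looping forever.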

Minor fix in the Qcache daemon.
(qs.c)

2007.07.10, SJ

The parser discards any header line except a few such as Received, From,
To, Subject, Content-type, Content-Transfer-Encoding
(parser.c)

2007.07.09, SJ

Added a query cache. If you want to retrieve lots of tokens from a MySQL
table, it might take x * 100 ms. Though it's usually acceptable for a
small network, a query cache might improve the performance. MySQL has an
internal query cache. Unfortunately MySQL discards it if the t_token table
is written (update/delete/insert/...) which occurs frequently if you choose
the Train Until Mature (TUM) training mode.

So I decided to write my own version of query cache. The "qs" utility is able
to listen on both TCP and Unix domain socket. It handles all the select and
update requests aimed at the t_token table. It's important that it supports
only the new hashed format tokens.

If you want to use it, adjust the following parameters:

qcache_addr=127.0.0.1
qcache_port=48791

or

qcache_socket=/tmp/qcache

Qcache syslog()'s the achieved cache hit rate so you can track if it's really
useful for you.

Please note that this feature is still experimental. All comments are welcome.

(qcache.c, qcache.h, qs.c, bayes.c, mysql.c, cfg.c, cfg.h, example.conf)

Added an option for chroot() environments and clamd. If you run clapf in a
chroot()'ed environment you might find it painful to run clamd in the chroot
environment too (because clapf passes the filename as it sees, eg. /opt/av/tmp/xxxxxxxxx).

Now you are good to run clamd outside the chroot, just set the chrootdir
parameter in clapf.conf pointing to the chroot path.

If your chroot / is /myjail then set:

chrootdir=/myjail

(clamd.c, clamd.h, session.c, cfg.c, cfg.h, example.conf)

2007.07.06, SJ

Eliminated multiple mysql_real_connect() calls in the same session.
(session.c, bayes.c, mysql.c)

Modified the spamdrop utility to use only the getuid() function
to get the user id of the recipient.
(spamdrop.c)

Fixed a bug in splitmbox.
(splitmbox.c)

2007.07.03, SJ

Removed hash_db support
(cfg.h, cfg.c, clapf.c, bayes.c, example.conf)

2007.07.02, SJ

Added an option to instantiate a spammy token for octet-stream attachments.
It may help you to fight PDF spam.
(example.conf, cfg.h, cfg.c, parser.h, parser.c, bayes.c)

2007.06.28, SJ

Adding \r\n at the end of every line. Qmail really needs this.
http://pobox.com/~djb/docs/smtplf.html
(splitmbox.c)
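
Normalising a line ending is a one-liner. The Python sketch below is
only an illustration (the function name is made up) of turning a bare
LF, a CRLF or a missing terminator into exactly one CRLF.

```python
def to_crlf(line):
    """Return the line (bytes) terminated by exactly one CRLF."""
    return line.rstrip(b"\r\n") + b"\r\n"
```

So to_crlf(b"Subject: hi\n") gives b"Subject: hi\r\n", and a line that
already ends with CRLF is left unchanged.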

2007.06.26, SJ

Fixed spamdrop to be able to read message through pipe, thus
can be used in .mailfilter in local delivery mode.
(spamdrop.c)

Messages are saved to a mysql table instead of the queue directory.
Do not forget to periodically purge aged messages from the t_queue
table, eg.
echo "delete from t_queue where UNIX_TIMESTAMP() - ts > 604800" | mysql


(db-new.sql, db-old.sql, mysql.c, spamdrop.c, spam-fwd-train.c, crtab)

2007.06.21, SJ

Added a white list option: if the sender of the email (From: ...)
has sent us at least 10 good emails and 0 spam let his email
come without statistical analysis.
(bayes.c)

2007.06.16, SJ

Fixed a typo in bayes.c
(bayes.c)

Fixed a potential underflow during corrective TUM training.
(mysql.c)

2007.06.15, SJ

Fixed a bug around assembling the extra header for insertion.
(session.c)

2007.06.14, SJ

Added TUM support for the training stuff.
(mysql.c, train.c, traincgi.c, misc.h, bayes.c, session.c)

2007.06.13, SJ

Added merged group support. Now both shared and merged groups are available.
Added TUM (Train Until Mature) training mode. Now both TOE (Train On Error)
and TUM modes are supported.
(mysql.c, bayes.c, cfg.c, cfg.h, example.conf)

2007.06.09, SJ

Updated the parser not to chain between individual header lines.
(parser.c, parser.h)

2007.06.07, SJ

Updated the parser routine to generate token pairs too. Thus
simplified the training utilities.
(parser.c, parser.h, train.c, traincgi.c, spam-fwd-train.c)

Removed the unnecessary ptable.h file.

Added html decoder function.
(decoder.c, decoder.h, parser.c)

2007.06.01, SJ

Fixed a bug around the spamicity calculation
(bayes.c)

The configure script now honors the --sysconfdir option,
and you don't have to specify the -c /path/to/clapf.conf
if its default path is fine for you.
(configure*, clapf-config.h.in, bayes.c)
(credits: Johannes Russek)

Added a command line switch (-d) allowing clapf going to
the background.
(clapf.c, config.h)
(credits: Johannes Russek)

2007.05.30, SJ

Fixed some compile time issues in case of a setup without
the antispam stuff.
(credits: Johannes Russek)

Introduced the APHash() function. It creates a 64 bit hashed version of
the token strings in the mysql table, resulting in a much smaller table.
See http://www.partow.net/programming/hashfunctions/#APHashFunction
for more details on this hashing function.
(misc.c, mysql.c, cdb.c, db.sql, util/kcdb_mysql.sh)
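
For reference, the AP hash mixes each byte into the running value with
two alternating steps. The Python sketch below follows the structure of
the published reference function but widens the value with a bit mask.
The mask parameter, the default width and the function name are
assumptions, so the values it produces are not guaranteed to match what
clapf stores.

```python
MASK64 = (1 << 64) - 1

def ap_hash(token, mask=MASK64):
    """AP hash over the bytes of token, with every step truncated to `mask`."""
    h = 0xAAAAAAAA
    for i, byte in enumerate(token):
        if (i & 1) == 0:
            # even positions: h ^= (h << 7) ^ byte * (h >> 3)
            h ^= ((h << 7) & mask) ^ ((byte * (h >> 3)) & mask)
        else:
            # odd positions: h ^= ~((h << 11) + (byte ^ (h >> 5)))
            h ^= (~(((h << 11) & mask) + (byte ^ (h >> 5)))) & mask
    return h & mask
```

Storing such a number instead of the token text gives a fixed size key
per token. A narrower mask, eg. (1 << 63) - 1, keeps the value inside a
signed 64 bit integer column.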

2007.05.29, SJ

Tokens are mysql escaped.
(mysql.c)

2007.05.27, SJ

Fixed a bug in the spam-fwd-utility, and added some logging too.
(spam-fwd-train.c)

2007.05.24, SJ

Fixed the same bug in parsembox as in splitmbox
(parsembox.c)

2007.05.23, SJ

"make install" copies the splitmbox utility to $(root)$(DESTDIR)$(bindir)

Fixed a bug in splitmbox.
(splitmbox.c)

2007.05.22, SJ

Fixed the kcdb_mysql.sh script. It did not import the tokens into
the mysql database without the presence of the cdb utility.
(util/kcdb_mysql.sh)

Better quoted-printable decoding
(parser.c)

2007.05.21, SJ

Fixed logging to mysql table using traincgi.
(traincgi.c)

The "make install" command does not overwrite the existing clapf.conf.
You should review what's new and update your existing configuration file.
(Makefile.in)

Fixed a bug in the training utilities
(mysql.c, train.c, traincgi.c, spam-fwd-train.c)

Cut the number of database lookups in half - if possible. It's no use to query
the single tokens from the database if we can classify the message according
to the token phrases. This is (ok, should be) the case after proper training
most of the time.
(bayes.c)

2007.05.19, SJ

Updating documentation
(doc/html/*)

2007.05.18, SJ

Added personal statistics
(statcgi.c, trainlogcgi.c, messages.h*)

2007.05.17, SJ

Added a new training method by forwarding the message to spam+user@domain
or to ham+user@domain.
(spam-fwd-train.c, mysql.c, Makefile.in)

2007.05.15, SJ

Added mysql support as a backend for tokens. It can be used instead
of the cdb backend. Use the --with-tokendb=... configuration option.
(bayes.c, cdb.c, mysql.c)

Do not chain with URLs
(bayes.c)

2007.05.14, SJ

Modified the kcdb.sh and kcdb_mysql.sh scripts to handle Maildir format mboxes too.
Removed the maildir.sh script.
(util/kcdb.sh, util/kcdb_mysql.sh)

Removed the perl/mbox.pl script. Not used.

2007.05.04, SJ

Degenerate tokens if they end with punctuation
(misc.c, misc.h, parser.c)

0.3.29-rc2:

2007.05.02, SJ

Created a shell script to add users to the spam quarantine.
(util/user2sq.sh) 

2007.04.27, SJ

Use the single token hash (shash) if we have very few phrases
(bayes.c)

2007.04.24, SJ

Removed trailing space from skipped header definitions.
(shdr.h)

2007.04.19, SJ

Fixed a 'Content-Type' issue.
(parser.c)

TRAINING/tokens.sql and user.sql merged into db.sql
(db.sql)

2007.04.18, SJ

Removed the unnecessary <openssl/evp.h> header file inclusion
(decoder.c)

Fixed compile time warnings about signed-unsigned differences
(misc.h, misc.c, parser.c)

2007.04.05, SJ

Added an extra log entry to collect per user/email address ham/spam statistics.
(misc.c, misc.h, session.c)

2007.04.03, SJ

Do not count invalid junk characters in the "To:" lines. I got
some emails containing Czech and Romanian names with unusual
letters - found in the invalid_junk_characters[] array - causing
false positive problems.
(parser.c)

2007.03.20, SJ

Minor fix in the graph shell script.
(stat/clapf-rrd-graph.sh)

2007.03.19, SJ

A little bit nicer spam quarantine. A cosmetic only change.
(spamcgi.c, trainlogcgi.c, usercgi.c, doc/html/style.css)


0.3.29-rc1:

2007.03.07, SJ

The spamcgi utility limits the length of the subject to MAX_CGI_SUBJECT_LEN (see config.h)
(spamcgi.c, config.h)

2007.03.06, SJ

The splitmbox utility terminates line with CR-LF (\r\n).
(splitmbox.c)

Check for CR-LF at the end of the mail header in the black.pl script
and fixed the RFC-822 date string.
(blackhole/black.pl)

2007.03.03, SJ

Added the "X-Keywords" header field to the exclusion list.
(shdr.h)

The skip_headers[] list is case insensitive now.
(parser.c)

2007.02.27, SJ

The exclude_unknown_tokens configuration parameter has been removed and
introduced the exclusion_radius parameter that can be used to eliminate
neutral tokens.
(cfg.h, cfg.c, example.conf, bayes.c)

Modified the parser routine to include DSN report attachments.
(parser.c)

More header stuff has been discarded. You may customise the excluded header
lines in shdr.h.
(shdr.h)

2007.02.26, SJ

Modified the chi2inv() function according to http://garyrob.blogs.com/chi2p.py
(chi.h, chi.c, hash.c)
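
For readers who do not know that script: the inverse chi-square function
sums a short series for an even number of degrees of freedom. The Python
sketch below shows that series together with the usual Robinson/Fisher
way of combining token probabilities. Whether hash.c combines the values
in exactly this way is an assumption, and the function names are chosen
for illustration. All probabilities must be strictly between 0 and 1.

```python
import math

def chi2p(chi, df):
    """Probability that a chi-square variable with even df is at least chi."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even number")
    m = chi / 2.0
    term = total = math.exp(-m)
    for i in range(1, df // 2):
        term *= m / i
        total += term
    return min(total, 1.0)  # guard against rounding slightly above 1

def combined_spamicity(probs):
    """Combine per-token spam probabilities into one value between 0 and 1."""
    n = len(probs)
    if n == 0:
        return 0.5
    s = 1.0 - chi2p(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    h = 1.0 - chi2p(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s - h + 1.0) / 2.0
```

Strongly spammy tokens push the result towards 1, strongly hammy ones
towards 0, and a set of neutral tokens leaves it at 0.5.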

2007.02.17, SJ

Gcc optimisation level changed from -O3 to -O2.
(configure.in)

2007.02.07, SJ

Added an option allowing you to reject spam. If you want this to
happen set the spaminess_oblivion_limit variable to a value above
your spam_overall_limit (eg. 0.99) and clapf will drop all the
spam if its spamicity value is >= 0.99. I think it may be useful
on some outgoing smtp servers eliminating most of the outgoing
spam from your own network. Be careful with this setting.
(cfg.h, cfg.c, session.c, example.conf)
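
The resulting decision has three outcomes. The Python sketch below only
illustrates the order of the checks: the function name, the returned
labels and the sample limits are assumptions and not values taken from
session.c.

```python
def spam_action(spamicity, spam_overall_limit, spaminess_oblivion_limit=None):
    """Return 'drop', 'spam' or 'ham' for a message with the given spamicity."""
    if spaminess_oblivion_limit is not None and spamicity >= spaminess_oblivion_limit:
        return "drop"   # sure enough to throw the message away
    if spamicity >= spam_overall_limit:
        return "spam"   # marked as spam but still passed on
    return "ham"
```

With spam_overall_limit=0.92 and spaminess_oblivion_limit=0.99, a
message scoring 0.95 is only marked, while one scoring 0.995 is dropped.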

2007.02.05, SJ

Modified the syslog messages slightly to become more unified.
(avast.c, avg.c, clamd.c, drweb.c, kav.c, session.c, bayes.c)

2007.02.02, SJ

Added a configuration directive (keep_queue_files) to determine whether to
keep the queue files after processing. It's useful only for debugging. If
you want to keep those files set keep_queue_files=1 otherwise they are
removed (this is the default behaviour).

(cfg.c, cfg.h, session.c, example.conf)

2007.01.16, SJ

Added a date field to the spam quarantine.
(spamcgi.c)

2007.01.10, SJ

Added a new feature to fight embedded image spam. If the message is not good
enough (=its spamicity value is greater than max_embed_image_spamicity) the
message is marked as spam. This way messages from strangers with images may
land in your spam folder.
(cfg.c, cfg.h, bayes.c, example.conf)


0.3.28:

2007.01.08, SJ

Modified the shell scripts (util/*) and set the PATH variable.

Removed the clamav database upgrade stuff from the util/ directory.
(util/)

2007.01.07, SJ

The CDB Perl module is not needed any longer since the cdb utility (part of
the tinycdb package) is able to create cdb files, eg. "cdb -c -m -w tokens.cdb tokens.raw"
(see the cdb (1) man page)

2007.01.04, SJ

Extended the max. length of the tokens (18->19) thus the Content-Disposition
header line also fits.
(config.h, perl/shrink.pl)

2007.01.03, SJ

Added a new variable (penalize_embed_images) to fight image spam aggressively.
If set (penalize_embed_images=1) it creates a special spammy token (EMBED*) thus
increasing the spam probability. Please note that this variable has no effect
on attached or linked (<img src="http://....">) images.

If you really hate all the embedded images you may add the following line to
/etc/postfix/body_checks:

/src\s*\=(3D){0,}\s*["']?cid:/
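
To see what this pattern catches before putting it into production, the
same expression can be tried out in Python. The helper name below is
made up. IGNORECASE is set because Postfix regexp tables match case
insensitively by default.

```python
import re

# Same expression as the body_checks pattern, without the surrounding slashes.
EMBEDDED_IMAGE = re.compile(r"""src\s*\=(3D){0,}\s*["']?cid:""", re.IGNORECASE)

def has_embedded_image(line):
    """Return True if the line references an embedded (cid:) image."""
    return EMBEDDED_IMAGE.search(line) is not None
```

It matches both the plain form (src="cid:...") and the quoted-printable
form (src=3D"cid:..."), but not linked images such as src="http://...".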

2007.01.01, SJ

Count invalid junk characters only if it's not a UTF-8 encoded part.
The FSM newsletters tend to contain such a strange character sequence (=E2=80=94).
I don't get why the word Magazines should be written as "Magazine=E2=80=99s"...

2006.12.22, SJ

Fixed a bug preventing 'make install' from copying the default configuration file.
(Makefile.in)

Added an option (useful for package builders) to install clapf under a different root
directory, eg. 'make install root=/tmp/aaa' installs clapf under /tmp/aaa/usr/local/...
(Makefile.in, configure.in)

2006.12.15, SJ

Fixed the fix_url() function to handle complex URLs properly:
http://www.ajandekkaracsonyra.hu/email.php?page=email&cmd=unsubscribe&email=yy@xxxx.kom
(misc.c)

2006.12.11, SJ

Modified the translate() and fix_url() functions to leave the URLs intact
and to allow us to capture url redirects such as
http://www.google.com/url?q=http://1234567.026.annasiksboredok.com/
(parser.c, misc.c)


2006.12.06, SJ

Skip the "User-Agent:" header field and the 3 letter long names of the months.
(shdr.h)

Clean up the extracted attachments while parsing a message.
(parser.c)

**********************************

0.3.28-rc3:

2006.11.27, SJ

Added Word support via the catdoc package (http://www.45.free.net/~vitus/software/catdoc/).
Tested with 0.94.2.

(bayes.c, cfg.h, cfg.c, example.conf)

2006.11.18, SJ

The "X-MimeOLE:" header line is skipped. Remove it from shdr.h if you want to look for it.
(shdr.h)

2006.11.16, SJ

Added OCR support via the gocr and netpbm packages. You need gocr (http://jocr.sourceforge.net/),
0.41 is known to work, and netpbm (http://netpbm.sourceforge.net/), 10.18.12 should be fine.

2006.11.13, SJ

Modified inject_mail() function to insert a spam prefix to the Subject: line
if it is spam.
(smtp.c, config.h, cfg.c)

Skip the names of the months. Using any date would bias the affected tokens.
(parser.c, shdr.h)

2006.11.09, SJ

Modified the parser routine to recognise the oriental spam better. To eliminate
unnecessary DNS lookups, the SURBL checks are performed only if we are still
unsure and there are not enough strange/junk characters.
(parser.c, misc.c, misc.h, bayes.c)

2006.10.28, SJ

Modified the parser routine to include the Content-Type and Content-Transfer-Encoding
element to fight image spam better. Added a new variable (penalize_images). Set it to
1 to enable or 0 (or commented out) to disable.
(parser.c, bayes.c, cfg.c, cfg.h, example.conf)

2006.10.27, SJ

Removed the "#include <ldap.h>" reference from the 12th line of user.c.
It's unnecessary if you want to hold user preferences in a MySQL table.
(user.c)

2006.10.25, SJ

Modified the TRAINING/tokens.sql and util/kcdb_mysql.sh scripts for easier
Maildir format training.
(TRAINING/tokens.sql, util/kcdb_mysql.sh)


**********************************

0.3.28-rc2:

2006.10.13, SJ

Log token database training action into mysql table. Look at the
mysqltraininglogtable parameter in example.conf.
(cfg.c, cfg.h, cgi.c, cgi.h, traincgi.c, trainlogcgi.c, user.sql, example.conf)

The "Next" and "Last" messages can be seen only if you have more than 'page_len'
messages in your spam quarantine.
(spamcgi.c)

2006.10.10, SJ

Spamdrop works from CDB database even if we have compiled with --enable-hash-db
(bayes.c)

2006.10.09, SJ

Changed the default queue directory to /opt/av/tmp
(example.conf)

Added a check to prevent creating an IP-address entry in the blackhole directory
if there's no valid Received: line.
(blackhole/black.pl)

Added a Hungarian messages.h file. If you want to localise your own language
create a messages.h.xx (where 'xx' is your language) file, then translate and
send it to me - please.
(messages.h.hu)

2006.10.06, SJ

The traincgi utility is able to save the trained messages.
(traincgi.c, cgi.c, cgi.h, cfg.c, cfg.h, example.conf)

2006.10.05, SJ

Added an extra header to specify the reason why clapf decided that a certain
message is spam. Expect more definitions in the future.
(session.c, messages.h)

Fixed the configure script to let the --disable- options work.
(credits: Anatoly Shipitsin)
(configure.*, Makefile.in)

2006.10.04, SJ

Added syslog support to the black.pl script.
(blackhole/black.pl)

2006.10.03, SJ

Fixed a bug in the parse() routine. Clapf now checks the second Received line,
ie. the machine that contacted you. Of course you may adjust clapf to check the first
machine (ie. in the last Received: line), but please note that the Received
lines can be spoofed, that's why I decided to check the last computer in the
Received: chain (not counting ourself). If you want to hack this feature, look
at parser.c around the 125th line and adjust (or remove it for the last Received
line) the "state.ipcnt < 2" snippet.
(parser.c)

2006.10.02, SJ

Modified the blackhole feature to use a special directory holding
IP-address entries instead of a mysql table.
(black.c, black.h, bayes.c, cfg.c, cfg.h, example.conf, blackhole/black.pl, blackhole/black.crtab.sh)

2006.09.29, SJ

Changed the default action to "junk" in passmail.
(passmail.c)

2006.09.28, SJ

Added a paging option to the spam quarantine.
(spamcgi.c, cfg.h, cfg.c)

2006.09.20, SJ

Commented out a disturbing print line in the maillog processing Perl script
(stat/process_syslog.pl)


************************

0.3.28-rc1:

2006.09.06, SJ

The passmail utility uses read() system calls instead of fgets().
(passmail.c)

2006.09.04, SJ

Added a paging option to the spam quarantine listing. For correct
paging the spam files are now named with a "timestamp." prefix
(YYYYMMDDhhmmss.) before their original name. Please recompile the
spamcgi and traincgi utilities. They cannot see the older spam
messages, so I recommend that you either remove them or back them
up/deliver them if you still need them.
(spamcgi.c)
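
A fixed width timestamp prefix makes a plain alphabetical sort of the
file names chronological, which is what paging needs. The Python sketch
below builds such a name. The function name and the use of UTC are
assumptions.

```python
import time

def quarantine_name(original_name, when=None):
    """Prefix a file name with a YYYYMMDDhhmmss. timestamp (UTC in this sketch)."""
    t = time.gmtime() if when is None else time.gmtime(when)
    return time.strftime("%Y%m%d%H%M%S", t) + "." + original_name

names = [quarantine_name("b", when=86400), quarantine_name("a", when=0)]
```

Sorting names puts "19700101000000.a" before "19700102000000.b",
ie. the older message comes first regardless of its original name.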

Added common header links.
(spamcgi.c, usercgi.c, cfg.h, cfg.c, example.conf)

2006.09.01, SJ

Modified the spamcgi cgi utility to print html links instead
of form submit buttons. I found this a more elegant and nicer
solution.
(spamcgi.c)

The usercgi cgi utility adds a few html tags at the bottom of
the generated html page. It's a cosmetic only fix.
(usercgi.c)

Fixed a looping problem while trying to deliver a message from
the spam quarantine.
(smtp.c)

2006.08.25, SJ

Modified the hash table creation in the hash_db memory table
for a smaller memory footprint.
(hash_db.h, hash_db.c)

2006.08.21, SJ

Binary messages (particularly those containing NUL characters) are
handled properly.
(config.h, smtp.c)

************************

0.3.27:

2006.08.18, SJ

Added a fix in 'make install' to copy the default config file in case
of a fresh install with no previous config file.
(Makefile.in)

2006.08.15, SJ

Modified the mail injection to allow multiple headers to be inserted
in case of spam. I try it with fgets() again with a dirty hack to
handle binary emails.
(config.h, cfg.h, cfg.c, session.c, smtp.c)

(credits: Mariano Reingart)

0.3.27-rc3:

2006.07.17, SJ

Modified the inject_mail() function to be able to handle binary messages correctly.
(smtp.c)

Fixed a typo in mysql.c. The word 'token' was misspelled as 'tokem' causing an error
in database training.
(mysql.c)

2006.07.11, SJ

Modified the configure script to build the cgi utilities if we
have an LDAP user database.
(configure*, Makefile.in)

2006.07.06, SJ

Tested the configuration with Active Directory and added some docs
on this issue. The names of the LDAP entries can be changed, they are defined
in user.h.
(user.c, user.h)

2006.06.30, SJ

Modified clapf to be able to send spam to an smtp server other than our
postfix on 127.0.0.1:10026. It's useful if you want to create a distributed
setup, eg. if you want to implement a spam quarantine on a dedicated
machine other than the computer where your users have their own mailboxes.
In this case set spam_smtp_addr and spam_smtp_port to point to this box.

Otherwise set spam_smtp_addr and spam_smtp_port to the same values
as postfix_addr and postfix_port respectively (this is the default).
(session.c, cfg.c)


************************

0.3.27-rc2:

2006.06.29, SJ

Created the spamdrop utility to use it with an LDA such as maildrop.

You may use the following code snippet with maildrop to collect spam in the
junk folder:

`/usr/local/bin/spamdrop /usr/local/etc/clapf.conf `
if($RETURNCODE == 1)
{
       to "mail/junk"
}

(Makefile.in, spamdrop.c, doc/man/spamdrop.1)

Removed the hash database conditional stuff from spamtest. It's not useful
to read the tokens into memory while testing a single message.
(test.c)

Modified a description slightly in clapf.schema
(ldap/clapf.schema)

2006.06.27, SJ

Added an option to let clapf connect to the MySQL server through a unix domain socket
(cfg.h, cfg.c, black.c, user.c, user.h, passmail.c, smtp.c, train.c, traincgi.c, example.conf)

2006.06.23, SJ

Modified the surbl code to check the URLs in the message against the defined
surbl domain(s) only if we are unsure whether the email is ham or spam,
thus eliminating unnecessary (and costly) SURBL lookups.
(bayes.c)

2006.06.19, SJ

Modified the configure script to have a better (and gentoo compatible) mysql check.
You should add the --enable-mysql option if you need mysql support.
(configure*, Makefile.in)
(credits: Anatoly Shipitsin)


************************

0.3.27-rc1:


2006.06.12, SJ

Fixed small typo in doc/html/config.html in the esf_h section.
(doc/html/config.html)

2006.06.06, SJ

The findnode() function returns the pointer of the node if possible.
(hash.h, hash.c)

Added a small piece of code to the surbl support to be able to mark the message
as spam if it has too many SURBL caught URLs even though the spamicity
does not reach the spam limit. See the following parameters:
rude_surbl and spaminess_of_caught_by_surbl
(cfg.h, cfg.c, bayes.c, example.conf)

2006.06.05, SJ

Added a spec file suitable to create an RPM package from clapf.
(clapf.spec)
(credits: Tim Philips)

2006.06.02, SJ

Changed the defaults of both esf_h and esf_s to 1. It needs
more investigation when using non integer degrees of freedom values.
(cfg.c, example.conf, doc/html/config.html)

Added support to use the GNU GSL library if available. It has a better
chi square function implementation as it allows the degrees of freedom
to be a double type. Our internal implementation allows only integer as
the degrees of freedom though current tests show that it may be just fine.
And the chi square function went to a separate file.
(chi.c, chi.h, configure*, hash.c, hash.h)

2006.06.01, SJ

Added three more characters to the invalid_junk_characters array
(ijc.h)

************************


0.3.26:

2006.05.23, SJ

Checking for content-type-encoding was case sensitive. Fixed.
(parser.c)

2006.05.18, SJ

Added the 'X-Mailer' stuff to the skip_headers[] list.
Remove it if it bothers you.
(shdr.h)

Fixed 'Subject' handling.
(parser.c)

2006.05.16, SJ

Fixed a minor bug in perl/shrink.pl to handle URLs properly
(perl/shrink.pl)

*******************************

0.3.26-rc3:

2006.05.10, SJ

Fixed the install entry in Makefile: do not try to install the passmail
utility if it is not compiled.
(credits: Pintr Tams)
(configure*, Makefile.in)

Fixed a minor bug in the configure script. When selecting clamd
the configure script said "You have not selected any antivirus support.
clapf will not protect you from hostile code coming in e-mail", though
it did configure clamd antivirus support.

(credits: Pintr Tams)
(configure*)

2006.05.09, SJ

Modified the count_invalid_junk() function to replace junk characters
with JUNK_REPLACEMENT_CHAR
(misc.c)

2006.05.04, SJ

The virus quarantine directory is undefined by default from now on, so clapf
does not collect infected files by default. Specify the quarantine_dir
variable in clapf.conf if you want to maintain a virus quarantine
directory.
(cfg.c, config.h)

Moved passmail in Makefile.in
(Makefile.in, configure*)

Antivirus documentation merged into av.html
(doc/html/av.html)

2006.05.03, SJ

Lower case is the default from now on. If you want clapf to be case sensitive,
please use the --enable-case configure option. The --enable-lower option
is withdrawn. I recommend that you recreate the token database (CDB file).
(configure.*, misc.c, bayes.c)

If a token in the Subject: line is not found try it without the 'Subject*' prefix.
(bayes.c)
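
The Subject: fallback above can be illustrated with a short Python sketch. It
is not clapf's C code; the dictionary standing in for the token database and
the function name lookup_spamicity are assumptions made for this example.

```python
SUBJECT_PREFIX = "Subject*"  # prefix named in the entry above


def lookup_spamicity(token, db):
    """Return the spamicity of token, or None if it is unknown.

    db is a plain dict standing in for the token database (assumption).
    """
    if token in db:
        return db[token]
    # Fallback: retry a Subject: token without its 'Subject*' prefix.
    if token.startswith(SUBJECT_PREFIX):
        return db.get(token[len(SUBJECT_PREFIX):])
    return None


db = {"Subject*viagra": 0.99, "meeting": 0.05}
print(lookup_spamicity("Subject*meeting", db))
```

Here "Subject*meeting" is not in the database, so the lookup falls back to the
plain token "meeting".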

The SURBL* special token is only included in the token list if the domain is
found in the SURBL database.
(bayes.c)

2006.04.27, SJ

Fixed the max_junk_spamicity declaration in cfg.h
(cfg.h)

2006.04.26, SJ

Passmail is able to move the spam into the quarantine or silently discard.
The action is stored either in mysql database or ldap directory.
(passmail.c, user.c, user.sql, ldap/clapf.schema)

Created the usercgi cgi utility to let users set their own preference.
(usercgi.c, user.c, user.h)

2006.04.25, SJ

Modified the traincgi utility a little.
(traincgi.c)

Tried to lowercase everything. Use the --enable-lower configure option.
I recommend you to recreate your token database (tokens.cdb)
(misc.c)

******************


0.3.26-rc2:

2006.04.19, SJ

Changed MAX_RCPT_TO 16->128. Adjust this variable to your need.
(config.h)

2006.04.11, SJ

Spam quarantine has been redesigned. Now each user has his own spam
quarantine and can manage it. You may store the users' email addresses
either in LDAP or Mysql database.
(passmail.c, session.c, spamcgi.c, smtp.c, smtp.h, cfg.c, cfg.h)

******************

0.3.26-rc1:

2006.04.07, SJ

Clapf now honors user preferences. For now the idea is very simple.
We have a CDB file containing emails and username mapping as well
as username and preferences mapping.
(cfg.c, cfg.h, perl/userpref.pl, session.c, user.c, misc.c)

2006.04.06, SJ

Modified the SURBL code to allow querying multiple URI blacklist domains.
To do so, build a comma separated list of URI blacklist domains.
(bayes.c)

Fixed a buffer overflow bug in the parser routine.
(parser.c)

2006.04.05, SJ

Added SURBL support. It is a kind of RBL list. It can be used to check domains in the message
against a central database. If the domain exists in the database a special token is created
SURBL*domainname and assigned 0.9999 as its spamicity.
(bayes.c, cfg.c, cfg.h, configure*, example.conf)
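
A minimal Python sketch of the SURBL token idea, not the actual bayes.c code.
The DNS query itself is left to a caller-supplied function (is_listed, a
hypothetical name), and the zone multi.surbl.org is used only as an example.

```python
SURBL_SPAMICITY = 0.9999  # spamicity named in the entry above


def surbl_tokens(domains, zone, is_listed):
    """Return {token: spamicity} for every domain listed in the zone.

    is_listed is a hypothetical callback that performs the real lookup
    for a name such as "spamdomain.example.multi.surbl.org".
    """
    tokens = {}
    for domain in domains:
        if is_listed(f"{domain}.{zone}"):
            tokens[f"SURBL*{domain}"] = SURBL_SPAMICITY
    return tokens


# A set of "listed" names replaces the DNS query in this example.
fake_listed = {"spamdomain.example.multi.surbl.org"}
result = surbl_tokens(["spamdomain.example", "good.example"],
                      "multi.surbl.org", fake_listed.__contains__)
print(result)
```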

Fixed the fix_url() function :-)
(misc.c)

0.3.25:

2006.04.03, SJ

Cleaned under perl/ and util/ directories

**************************

0.3.25-rc4:

2006.03.31, SJ

Fixed minor typo in stat/clapf-rrd-graph-oneline.sh
(stat/clapf-rrd-graph-oneline.sh)

2006.03.29, SJ

Revised the HTML documentation.

2006.03.28, SJ

MIN_WORD_LEN raised 2->3 and MAX_WORD_LEN changed 24->18
If you get worse results feel free to change these values.
(config.h, perl/shrink.pl)

2006.03.22, SJ

Try to auto utf8- and quoted-printable decode lines. It should not hurt.
(parser.c)

Removed the question mark(?) from the translation table.
(trans.h)

2006.03.21, SJ

Modified the bayes_file() routine to let a junk email come in if its
spamicity score is under a certain limit (=max_junk_spamicity).
(bayes.c, cfg.c, cfg.h, example.conf)

Removed the '' sign from the invalid_junk_characters list.
(ijc.h)

2006.03.20, SJ

Created ptable.h to hold a list of words you don't want to train the database
with. Also modified the perl/shrink.pl file accordingly.
(train.c, traincgi.c, perl/shrink.pl)

2006.03.17, SJ

Modified the configure.* scripts to build the splitmbox utility if the
antispam support is disabled.
(configure.in)

2006.03.16, SJ

Skip lines having an RFC822 formatted date (eg. "Tue, 14 Mar 2006 15:46:03 ")
to eliminate the impact of dates.
(parser.c)
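
One way to picture the date skipping is a regular expression filter. This is a
Python sketch under the assumption that a line is dropped whenever it contains
a "Tue, 14 Mar 2006 15:46:03"-like date; the real check in parser.c may differ.

```python
import re

# Day name, day, month name, 4-digit year, hh:mm:ss (assumed pattern).
RFC822_DATE = re.compile(
    r"(Mon|Tue|Wed|Thu|Fri|Sat|Sun), \d{1,2} "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"\d{4} \d{2}:\d{2}:\d{2}"
)


def drop_date_lines(lines):
    """Return the lines that do not contain an RFC822 formatted date."""
    return [line for line in lines if not RFC822_DATE.search(line)]


print(drop_date_lines(["Tue, 14 Mar 2006 15:46:03 +0100", "buy now"]))
```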

Decremented the min_phrase_number from 40 to 30. (config.h, example.conf)

2006.03.14, SJ

Tried the following: create 2 hash tables: one holding only the single
tokens and the other only the phrases (token pairs). URLs are in both
tables. Now let's calculate the spamicity from both tables then choose
the one with greater deviation. Gave up using triplets any longer.
Testing in progress.
(perl/shrink.pl, bayes.c)
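
The "choose the one with greater deviation" step can be sketched in Python as
below. The function name and the tie-breaking rule (prefer the single token
result) are assumptions, not taken from bayes.c.

```python
def pick_spamicity(single_score, phrase_score):
    """Return the score that lies farther from the neutral 0.5."""
    if abs(phrase_score - 0.5) > abs(single_score - 0.5):
        return phrase_score
    return single_score  # ties go to the single token table (assumption)


print(pick_spamicity(0.6, 0.95))
print(pick_spamicity(0.1, 0.7))
```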

********************

0.3.25-rc3:

2006.03.13, SJ

Added a utility called splitmbox to split an mbox file to separate messages.
(splitmbox.c, Makefile.in, doc/man/splitmbox.1)

Ignore the "Date: " lines when parsing a message.
(perl/shrink.pl, shdr.h)

Changed the number of most interesting words from 20 to 15, because I use
token pairs only.
(config.h)

Got a mail having a boundary definition in the mail header but containing no
boundary line in the body. So I included a check for this.
(parser.c)

Store the URLs in the form of URL*domain.com, where domain.com is shortened
to fit in MAX_WORD_LEN size.
(misc.c, misc.h, bayes.c)

********************


0.3.25-rc2:

2006.03.09, SJ

Revised reloading clamav antivirus database. Some error happened
after the reload. Revised reload_clamav_db() function and removed
clam.c and clam.h. And added clamd support.
(clamd.c, clamd.h, cfg.c, cfg.h, example.conf, doc/html/config.html)

2006.03.07, SJ

Changed the time values from [usec] to [ms].
(bayes.c, avast.c, avg.c, drweb.c, kav.c, session.c)

Introduced a new storage device called hash_db. It reads the raw
token file into a hash (called t_hash). I expect a better speed.
Please note that it takes a few MBs extra memory per process.
You may enable this feature by using a configure option called
--enable-hash-db
(configure.in, Makefile.in, hash_db.c, hash_db.h, cfg.h, cfg.c, example.conf)

Modified the mysql2cdb.pl script to create the raw token data file as well.
(perl/mysql2cdb.pl)

Changed the default concurrency from postfix to clapf from 20 to 10.
(doc/html/install.html)

Introduced the kill_child() function to prevent a child from running for a long time.
You may use the session_timeout variable to set this value.
(cfg.c, cfg.h, session.c, example.conf, doc/html/config.html)

2006.03.06, SJ

Tried a new method: use only the token pairs and single URIs. This
results in a much smaller CDB database (approx. 1/3)
(perl/shrink.pl, util/kcdb_mysql.sh, util/mysql2cdb.sh, bayes.c)

Added a new variable (tv_sent) to track how much time clapf needs
to inject a single message back to clapf. Time values are printed
in msec instead of usec.
(session.c)

Fixed a bug around message injection if the spam_quarantine variable is set.
(session.c)

2006.03.02, SJ

Introduced two new hashes to hold exclusively token pairs and triplets
respectively. If we have enough known triplets, use only them in the
spamicity calculation. If not, try the same with the
pairs and then with the phrase hash. We may use the hash holding all
tokens as a last resort. This is because overlapping tokens contain
redundancy that is difficult to handle.
(bayes.c)

Simplified the boundary handling
(parser.c, parser.h)

Try to utf-8 decode everything in the buffer. It should not hurt.
(parser.c)

2006.03.01, SJ

Revised quoted-printable soft break handling
(parser.c)

Improved utf-8 decoding
(decoder.c)

Removed the paragraph sign from ijc.h
(ijc.h)

2006.02.28, SJ

Added Dr.Web antivirus support.
(cfg.c, cfg.h, drweb.c, drweb.h, example.conf, doc/html/config.html)

2006.02.27, SJ

A nasty bug was found. MUAs put a few lines at the beginning of the message.
When you test a single message with spamtest or you make the spamicity
database these lines are used. But clapf cannot see these lines when it
gets the message, thus the calculated result might be different. I modified
the parser routine to discard these lines:
"Return-Path: ", "X-Original-To: "
and the first line if it starts with "From "
(parser.c, parser.h)

Fixed a bug in util/kcdb_mysql.sh
(util/kcdb_mysql.sh)

*********

2006.02.24, SJ

Introduced a new hash containing only phrases (token pairs and triplets).
If we have more than min_phrase_number phrases, do not care about single
tokens.
(cfg.h, cfg.c, bayes.c, example.conf)

Raised MAX_SAMPLES_TO_CHOOSE from 15 to 20 because we have more tokens
by using phrases too.
(config.h)

Introduced the 'effective size factor' (ESF) variable to help against the
annoying redundancy. ys and yh are both set to 0.2 for now to test it.
If you want to disable the ESF set both esf_h and esf_s to 1.
(hash.c)

Released 0.3.24.

2006.02.23, SJ

Added the inverse chi-square algorithm
(hash.c)
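
For reference, the inverse chi-square function for an even number of degrees
of freedom is commonly written as shown below. This is a Python sketch of the
textbook form, not the code in hash.c; the combine() helper follows the usual
Fisher/Robinson combining scheme and is an assumption about how the result is
used.

```python
import math


def chi2q(x2, v):
    """P(chi-square with v degrees of freedom >= x2); v must be even."""
    if v <= 0 or v % 2:
        raise ValueError("v must be a positive even integer")
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine(probs):
    """Combine token spamicities (each strictly between 0 and 1)."""
    n = len(probs)
    q_spam = chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    q_ham = chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    return (1.0 + q_spam - q_ham) / 2.0


print(combine([0.99] * 5), combine([0.01] * 5))
```

Note that this form only accepts integer (even) degrees of freedom, which
matches the limitation of the internal implementation mentioned in the
2006.06.02 entry.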

Robinson probability is the default again
(util/kcdb.sh, util/kcdb_mysql.sh, util/mysql2cdb.sh)

2006.02.22, SJ

Removed left hard coded WORK_DIR from session.c
(session.c)
(credits: Anatoly Shipitsin)

Removed test (and annoying) stuff from the hash function
(hash.c)

2006.02.21, SJ

Thunderbird starts with a "From - Thu Sep 15 11:27:01 2005"-like line
(parsembox.c)

If clapf cannot read the specified file it returns ERR_BAYES_* and
clapf will syslog the event with its 'error code'.
(bayes.c, config.h)

Added Kaspersky (kav) support
(session.c, kav.c, kav.h, cfg.c, cfg.h, config.h, example.conf, doc/html/kav.html)

2006.02.20, SJ

Introduced the clamav html documentation file
(doc/html/clamav.html)

Replaced the configuration file options with a reference link in the
manual page.
(doc/man/clapf.8)

Replaced the relocate_timeout with relocate_delay
(cfg.h, cfg.c, traincgi.c, example.conf)

Created a configuration file directive list
(doc/html/config.html)

Modified the parser routine to skip specific message headers listed
in the shdr.h. Unfortunately my mail already contains some headers
I don't trust. This of course distorted the result clapf calculated.
(parser.c, shdr.h)

2006.02.17, SJ

Modified the parser routine to include all the header information
and to skip all the numeric only tokens. This increased the size
of my spamicity database by 3%.
(parser.c, bayes.c)

Minor fix in the util/kcdb_mysql.sh script
(util/kcdb_mysql.sh)

2006.02.16, SJ

Removed the use_quarantine configuration file option.
It is enough to check the quarantine_dir variable
(cfg.c, cfg.h, session.c, example.conf)

2006.02.15, SJ

Fixed bug in the translate function.
(misc.c)

Added a new configuration file option (exclude_unknown_tokens) to
exclude unknown tokens from the final calculation. If you have only
a few known tokens in the database and a lot of unknown tokens, you
may experience higher false positive as well as false negative
rates, but you may catch spam that includes a single url or
image with some unknown tokens.
I recommend that you turn off this feature if it causes more trouble
than good.
(cfg.c, cfg.h, hash.c, bayes.c, example.conf)

2006.02.11, SJ

Reimplemented connection handling according to D.J. Bernstein's
tcpserver. Postfix to clapf SMTP communication code went to session.*
(clapf.c, session.c, session.h, sig.c, sig.h)

2006.02.10, SJ

Moved cgi.c functionality to traincgi.c which compiles into traincgi
(cfg.c, cfg.h, traincgi.c, spamcgi.c, example.conf, doc/man/clapf.8)

Simplified the 'train' utility
(train.c)

Redesigned configure script. It prints a summary of selected features.
--enable-clamav is renamed to --enable-libclamav
(configure*)

A very strange and nasty thing has happened. Suddenly all the children
became zombies one after another. I decided to disable the connection limiting
code until this issue is resolved and let postfix decide how many
connections to open for us.
(clapf.c)

2006.02.09, SJ

Modified debugging not to show the spamicity of all the tokens, just
the most interesting ones, and to print the lines of the message being parsed
(hash.c, parser.c)

Added an option to determine what tokens to include in the spamicity
calculation. The more tokens you use, the more accurate the result,
at the cost of more queries against the database. By default all
tokens are used.
(cfg.c, cfg.h, bayes.c, example.conf)

No antivirus product selected by default, you should choose one (or more)
with the appropriate configure option.
(configure*)

Removed the spam training recommendation feature. I'm not sure that
it's useful.
(cfg.c, cfg.h, clapf.c)

MIME decoder was changed to understand one level nested attachments
(mime.c)

The spam quarantine cgi utility was changed to use spam_quarantine_dir
configuration file variable and it is able to deliver the message to
the recipient if the system administrator wants to do so.
(spamcgi.c)


2006.02.08, SJ

Cleaned up some code around antivirus handling
(clapf.c)

Cosmetic modification with AVG
(avg.c)

Modified the parser routine to handle base64 or quoted-printable encoded
Subject lines. Note I don't care about multiple encoded stuff eg.
Subject: =?iso-8859-2?Q?T?= =?iso-8859-2?Q?E?= =?iso-8859-2?Q?S?= =?iso-8859-2?Q?T?=
(parser.c)
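
To illustrate what decoding such a Subject line means, here is a Python sketch
built on the standard email.header module. It is not the parser.c routine;
unlike the change described above, the standard library also joins multiple
encoded words, so the example line decodes to a single word.

```python
from email.header import decode_header


def decode_subject(raw):
    """Decode base64 (B) or quoted-printable (Q) encoded words in a header."""
    parts = []
    for data, charset in decode_header(raw):
        if isinstance(data, bytes):
            parts.append(data.decode(charset or "ascii", errors="replace"))
        else:
            parts.append(data)
    return "".join(parts)


raw = ("=?iso-8859-2?Q?T?= =?iso-8859-2?Q?E?= "
       "=?iso-8859-2?Q?S?= =?iso-8859-2?Q?T?=")
print(decode_subject(raw))
print(decode_subject("=?utf-8?B?SGVsbG8=?="))
```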

2006.02.07, SJ

Modified the boundary handling to accept nested boundaries (one level)
(parser.c)

Added avast! support
(cfg.c, cfg.h, example.conf, clapf.c, avast.c, avast.h)

Modified the syslog processing script to handle the days 1-9 correctly
(stat/process_stat.pl)

Released 0.3.24-rc1

2006.02.06, SJ

Added smtp response if the message was moved into the spam quarantine directory
(clapf.c)

2006.02.02, SJ

Developed a small code to catch even more Chinese, Japanese, Korean, ... spam
(cfg.c, cfg.h, bayes.c, parser.c, parser.h, example.conf)

2006.01.30, SJ

The invalid_junk_limit and invalid_hex_junk_limit features can be disabled
by setting them to 0.
(cfg.c, bayes.c)

Connection handling was slightly modified
(clapf.c)

2006.01.27, SJ

Removed the openssl version of base64 decoder implementation
(decoder.c, configure*)

Fixed base64 decoder to handle binary files properly
(decoder.c, decoder.h)

Implemented AVG anti virus support
(mime.c, mime.h, avg.c, avg.h, cfg.c, cfg.h)

Released 0.3.23-rc3

2006.01.26, SJ

Minor glitch was fixed when moving a message to quarantine has failed
(clapf.c)

Removed hapaxes (tokens occurring only once) from the token dictionary
(perl/create*.pl, perl/mysql2*pl)
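
Removing hapaxes amounts to a simple filter on the occurrence counts. A Python
sketch, assuming the dictionary is a mapping from token to its number of
occurrences (the perl scripts may store more fields per token):

```python
def remove_hapaxes(counts):
    """Keep only the tokens that occur more than once."""
    return {token: n for token, n in counts.items() if n > 1}


print(remove_hapaxes({"viagra": 12, "xqzt": 1, "meeting": 3}))
```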

If a token is not found, convert it to upper then lower case and try again
(bayes.c)

Released 0.3.23-rc2.

2006.01.25, SJ

Removed the spam_quarantine variable, the spam_quarantine_dir variable
is sufficient alone.
(example.conf, clapf.c, doc/man/clapf.8)

2006.01.24, SJ

Spam quarantine cgi utility has been created
(Makefile.in, spamcgi.c, cfg.c, cfg.h, example.conf)

2006.01.23, SJ

Spamicity value is printed with %.4f
(clapf.c)

CDB handling was rewritten to use only a single cdb database.
Forced database creation from the mysql database is possible:
just add "x" as a 3rd argument to util/mysql2cdb.sh.

eg: sh util/mysql2cdb.sh tokens my.cnf x

(bayes.c, cfg.c, cfg.h, perl/mysql2markov.pl, util/mysql2cdb.sh)

2006.01.19, SJ

Delivery information is written to the quarantine
directory when a file is put there. It may be used
to determine whose message it is or to have it
delivered back to postfix.
(misc.c, misc.h, clapf.c)

2006.01.18, SJ

Man pages went to doc/man
(clapf.8)

1 was renamed to spamtest
(Makefile.in)

More man pages are created
(train.1, spamtest.1, parsembox.1)

Much of the documentation went to the manual pages and to the html (online) pages
(doc/).

2006.01.16, SJ

The backlog parameter in the listen() system call can be set
in the configuration file
(cfg.c, cfg.h, clapf.c, example.conf)

Released 0.3.22

2006.01.13, SJ

Modified the new connection handling code
(clapf.c, errmsg.h, example.conf)

Terminate connection immediately if the client has issued the QUIT
command and we have answered with 221.
(clapf.c)

2006.01.12, SJ

max. number of connections (eg. the listen() system call)
can be set in the configuration file

2006.01.11, SJ

clamav archive variables can be set in the config file
(cfg.c, cfg.h, clapf.c, example.conf)

Created an initial manual page
(clapf.8, Makefile.in)

Introduced move_message_to_quarantine() function to move files
to a quarantine directory
(cfg.c, cfg.h, clapf.c, misc.c, misc.h, example.conf)

Update the timestamp of existing records in the blackhole table
(blackhole/black.pl)

2006.01.10, SJ

Fixed minor bug in translate() function
(misc.c)

Fixed minor configure error
(configure*)

2006.01.09, SJ

Removed 1.conf from the source tree.

Introduced a command line training utility
(train.c, config.h, errmsg.h, Makefile.in, doc/TRAINING)

Introduced logging levels
(config.h, clapf.c, bayes.c, smtp.c)

2006.01.05, SJ

Added a fix to handle base64 decoded parts better. If a base64 encoded line
does not end with a line break (in the decoded text), the last, incomplete
part is saved.
(parser.c, parser.h)

Added a similar fix to handle soft breaks in quoted-printable parts for better
parsing.
(parser.c, parser.h, misc.c)
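
A soft break in quoted-printable text is a "=" at the very end of a line; the
decoder must join that line with the next one so a word split across lines
becomes one token again. The Python standard library shows the expected
behaviour (a sketch for illustration, not the parser.c code):

```python
import quopri


def decode_qp(data):
    """Decode quoted-printable bytes, joining soft line breaks."""
    return quopri.decodestring(data)


# "unsubscr=" + newline + "ibe" is one word split by a soft break.
print(decode_qp(b"unsubscr=\nibe =3D now"))
```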

2006.01.04, SJ

Introduced an alternate base64 decoder function without the openssl crypto library
and this is the default from now. Use the "--enable-openssl" configure option to
stick with the openssl implementation.
(decoder.c, configure*)

2006.01.03, SJ

Added more detailed debug
(clapf.c, smtp.c, doc/SYSLOG)

Changed MAXCONN: 15 -> 20
(config.h)

Added a check for every free() call.
(bayes.c, hash.c)

Released 0.3.22-rc2

Fixed license / copyright docs and removed doc/COPYRIGHT
(doc/LICENSE)

2006.01.02, SJ

Spam statistics creation is possible with an external tool (rrdtool)
(stat/clapf-rrd-*sh, doc/STATISTICS)

Added an option to mark base64 encoded textual messages as spam
(parser.c, parser.h, bayes.c, 1.conf, example.conf)

2005.12.30, SJ

Added dist_clean option to Makefile(.am) which removes
Makefile, config.status and config.log

2005.12.29, SJ

Fixed smtp stuff around the RSET command
(clapf.c)

Fixed documentation
(doc/README)

2005.12.25, SJ

buffer overflow bug was fixed
(parser.c)

Code cleanup
(bayes.c, config.h, hash.c, hash.h, misc.c, misc.h, 1.conf, example.conf)

Released 0.3.21

2005.12.23, SJ

You may choose to save the trapped emails to a file
(blackhole/black.pl)

Fixed parser routine. It will check only IPv4 addresses in the blackhole code
(parser.c)

2005.12.22, SJ

Fixed IP-address extraction in blackhole/black.pl
(blackhole/black.pl)

2005.12.21, SJ

Few config.h parameters can be set from configuration file
(cfg.c, cfg.h, 1.conf, example.conf)

Released 0.3.21

2005.12.19, SJ

Added a check for free() in free_and_print_list()
(parser.c)

Introduced an alternative spamicity calculation method.
To prevent a gigantic database, skip tokens occurring only once
in either the ham or the spam corpus.

(perl/create_markov_cdb.pl, util/kcdb.sh)

2005.12.13, SJ

Modified boundary search algorithm
(parser.c)

2005.12.12, SJ

Merged the inject_mail() and send_notify() functions
(smtp.c, smtp.h, clapf.c)

Released 0.3.21-rc3
(config.h)

Changed blackhole handling
(blackhole/black.pl)

2005.12.10, SJ

Cleaning code in inject_mail()
(smtp.c)

Adding debugging
(clapf.c, smtp.c)

2005.12.09, SJ

Removed heuristic tests from clapf. Perhaps I will include
some of them if they are needed by the clapf users.
(cfg.c, cfg.h, 1.conf, example.conf)

Parsing the "Received: from" lines went to parser.c
(parser.c, parser.h)

Renamed the inoculation feature to blackhole
(bayes.c, configure, configure.in)

Redesigned the blackhole feature with MySQL instead of SQLite3
(bayes.c, black.c, black.h)

Cleaned some smtp injecting code (smtp.c)

2005.12.08, SJ

Introduced a new configuration file parameter (max_message_size_to_filter)
to skip spam test if the size of the message is greater than this value.
0 is the default and it means no such limit
(clapf.c, cfg.c, cfg.h, 1.conf, example.conf)

Fixed a buffer size problem in the parser module
(parser.c, parser.h, config.h)

The token database can be trained via a cgi program called "cgi".
It needs a mysql database.
(cgi.c)

2005.12.07, SJ

Introduced two new characters into the invalid_junk_characters[] buffer
(ijc.h)

Introduced a new configuration option (use_all_the_most_interesting_tokens)
to enable to use all the most interesting tokens in the spamicity calculation
(cfg.c, cfg.h, hash.c, 1.conf, example.conf)

Removed message parsing from clapf.c and use bayes_file() as in
parsembox.c
(clapf.c)

Fixed a minor bug in the chained list handling
(parser.c)

2005.12.06, SJ

Integrated the new parser routine
(clapf.c, bayes.c, parsembox.c)

Removed heuristic support for now except one: if you have
too many spammy tokens in the top15 and have_too_spammy_top10 is
enabled it returns 0.998877 and marks the message as spam.
Use the spam_ratio_in_top10 variable. It is calculated as spammy tokens / all tokens
(cfg.c, cfg.h, bayes.c, hash.c)

2005.12.05, SJ

Introduced a message parser routine to avoid implementing it multiple times
(parser.c, parser.h)

2005.12.01, SJ

Introduced new function split() to parse message lines
(misc.c, misc.h, bayes.c)

2005.11.30, SJ

Fixed token pair and triplet creation
(perl/shrink.pl)

2005.11.29, SJ

Some of the clamav stuff went to clam.[ch]
(clapf.c, clam.c, clam.h)

2005.11.28, SJ

perl/askdbm.pl was removed

Split the token database into 3 files (tokens*.cdb)
(cfg.c, cfg.h, config.h, bayes.c, util/kcdb.sh, perl/createcdb.pl, Makefile.in)

clamav antivirus support is optional from now. Use the
--disable-clamav configure option to disable it
(configure, configure.in, clapf.c)


2005.11.24, SJ

Introduced the findnode() function to prevent querying the token database
multiple times for a single token
(hash.c, hash.h, bayes.c)

2005.11.23, SJ

Modified to handle text-only message parts where there is no Content-Type field
after the boundary
(bayes.c, parsembox.c)

2005.11.22, SJ

pcre stuff was rewritten, so you do not need pcre any more
(parsembox.c, bayes.c, config.h)

2005.11.21, SJ

The configure option "--enable-blackhole" was renamed to "--enable-inoculation"
(configure.in, configure, doc/README)

2005.11.18, SJ

Added autoconf style configuration
(Makefile.in, aclocal.m4, install-sh, configure, configure.in)

Released new version: 0.3.20

2005.11.17, SJ

unescape(), base64_decode(), utf8_decode and qp_decode went to decoder.c
and unescape was renamed to url_decode()
(misc.c, misc.h, decoder.c, decoder.h, bayes.c, parsembox.c, Makefile, Makefile.blackhole)

2005.11.16, SJ

Fixed calculating the overall spamicity if we have
tokens with 0.4 spamicity in the top15
(hash.c)

Introducing new heuristic check: counting the length of too long
tokens compared to the length of the body
(bayes.c, cfg.c, cfg.h, 1.conf, example.conf)

Mailbox parsing revised
(parsembox.c)

Adding blackhole functionality
(black.c, black.h, bayes.c, Makefile.blackhole, ino/*)

2005.11.15, SJ

Fixed Content-Type handling
(bayes.c)

If there's no valid token return DEFAULT_SPAMICITY
(hash.c)

Simplified the cdb interface
(bayes.c, clapf.c)

2005.11.14, SJ

Fixed Content-Type handling
(bayes.c)

2005.11.11, SJ

Count invalid junk stuff only where we analyze data
(bayes.c)

2005.11.10, SJ

Renamed bayes_init(), bayes_close() and ERR_BAYES_INIT
(bayes.c, bayes.h, errmsg.h, clapf.c, test.c)

Removed parsembox from the "make av_only" section
(Makefile)

Documentation stuff, copyright and license info went to doc/
(README, TRAINING, COPYRIGHT, LICENSE)

Introduced new header field to keep track of message IDs
(clapf.c)

Minor bug fixed in parsing mbox files
(parsembox.c)
 
2005.11.09, SJ

Give token pairs a greater distance from the neutral 0.5
and do not include all the interesting tokens, only the 15
most interesting ones.
(perl/createcdb.pl, hash.c, clapf.c, test.c)
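
"Most interesting" here means farthest from the neutral 0.5. A Python sketch
of picking the 15 most interesting spamicities; the function name is an
assumption and hash.c may order ties differently.

```python
def most_interesting(spamicities, limit=15):
    """Return up to limit values with the greatest distance from 0.5."""
    ranked = sorted(spamicities, key=lambda p: abs(p - 0.5), reverse=True)
    return ranked[:limit]


print(most_interesting([0.5, 0.99, 0.4, 0.01], limit=2))
```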

Introduced token triplets for even greater accuracy
(bayes.c, perl/shrink.pl, perl/createcdb.pl)

Introduced a new heuristic check; ratio of spaces to the whole body
(cfg.c, cfg.h, bayes.c, 1.conf, example.conf)

Simplified and sped up the creation of the CDB database, DB part is removed
(util/kcdb.sh, perl/createcdb.pl)

2005.11.08, SJ

Introduced qp_decode() function
(misc.c, misc.h, bayes.c)

2005.11.05, SJ

Introduced utf8_decode() function
(misc.c, misc.h, bayes.c)

Min. token length changed from 3 to 2
(config.h, perl/shrink.pl)

,  added to translation table
(trans.h)

2005.11.04, SJ

Introduced new function count_invalid_hexa_stuff to count
the invalid hexa encoded junk characters
(misc.c, misc.h, bayes.c)

2005.11.02, SJ

If the message is surely spam/ham according to the Bayesian
calculation, return with the result immediately.
(bayes.c)

Skip tokens in the top15 if they are unknown in the spamicity database.
(hash.c)

2005.10.28, SJ

Fixed URL counting and true ham/spam ratio
(bayes.c)

REAL_SPAM_TOKEN_PROBABILITY variable restored to 0.9999
and introduced new variable: MOST_INTERESTING_DEVIATION 0.4998
(config.h)

Post sorting has been removed, since I think false positives are worse
than false negatives and I include all the most interesting tokens
calculating the overall spamicity so I do NOT give precedence for
the real-spam tokens over the real-ham tokens.
(hash.c)

2005.10.27, SJ

REAL_SPAM_TOKEN_PROBABILITY variable changed
(config.h)

Do not count spam-only tokens to the suspicious tokens.
They have precedence over the ham-only tokens.
(bayes.c)

2005.10.26, SJ

Minor code cleanup
(bayes.c)

2005.10.21, SJ

Few cleanups and a minor fix in counting short tokens
(clapf.c, bayes.c)

2005.10.17, SJ

Fixed context creation
(bayes.c)

Real spam tokens have precedence over real ham tokens in the top15
Do not care about ham/spam token ratio in the top15
(hash.c, cfg.h, cfg.c, 1.conf, example.conf)

2005.10.13, SJ

Few cleanups
(bayes.c, config.h)

2005.10.11, SJ

New option added to configuration file called trueham_truespam_ratio
to handle such a situation where we have only 0.0001 values in the
top 15 but the whole message has a lot of spammy tokens, much more
than ham only tokens.

(cfg.c, cfg.c, bayes.c, example.conf, 1.conf)

2005.10.10, SJ

Created contexts (consecutive tokens) in the spam database
(bayes.c, perl/shrink.pl)

Removed spam-only tokens occurring only once as they are probably noise.
(perl/createcdb.pl)

2005.10.06, SJ

too_many_html_tags was changed from the very permissive 0.4 to 0.22
(cfg.c, example.conf)

minor changes in bayes.c

2005.10.04, SJ

SMTP codes went to smtpcodes.h

Postmaster notification is possible now when a virus infected email arrives.
It is disabled by default; to enable it, set localpostmaster and clapfemail in the
configuration file.


(smtp.c, smtp.h, clapf.c)

2005.10.03, SJ

Removed the -DBE_RUDE option from Makefile
(Makefile)

2005.09.30, SJ

Released new version: 0.3.19

Variables can be set in a configuration file.
Constants went to several .h files from config.h.
heuristic.h was removed; default values are now in cfg.c


2005.09.27, SJ

Introduced a new variable SPAM_OVERALL_LIMIT to determine
whether the whole message is spam or not.

I got some emails containing such a content that the bayesian
module returned a spamicity value of 0.94xx which was treated
as not spam though it was. Since this value was really near
the spam limit I decided to lower the overall spam limit
a little to handle these cases too. Of course you may set
SPAM_LIMIT and SPAM_OVERALL_LIMIT to the same value.

(config.h, clapf.c)

Introduced a new heuristic check: if the From: and To: address
is the same (bayes.c, heuristic.h, HEURISTIC)

Introduced another heuristic check testing the ratio of HTML
tokens to all the tokens (misc.h, misc.c, bayes.c, heuristic.h)

2005.09.12, SJ

Added more heuristic check on the From: header line, testing
if the address starts/ends with a number and the line contains
a real name between ""
(bayes.c, heuristic.h)

2005.09.08, SJ

Added a new heuristic check: testing the subject for all capitals
(bayes.c)

2005.09.05, SJ

More characters added to the invalid_junk_characters[] buffer
(ijc.h)

2005.08.29, SJ

If the Content-type is text/plain and the body is base64 encoded
give some extra score
(heuristic.h, bayes.c)

2005.08.23, SJ

Created header file for heuristic related definitions.
(heuristic.h)

Modified spamicity calculation.
(hash.h, hash.c, bayes.c, config.h)

Released 0.3.18
(config.h)

2005.08.19, SJ

Extending and fixing the documentation 
(README)
(credits: Dieter Rethmeyer)

2005.08.18, SJ

Adding an additional heuristic check to see the ratio of the length of
the urls in the body to the length of the whole body. This is added
because spammers send a short letter including a long url pointing
to the spam message including some innocent looking text
(bayes.c, config.h, misc.c, misc.h)

2005.07.27, SJ

Additional fix handling Content-Type and Content-Transfer-Encoding
(bayes.c)

2005.07.26, SJ

Fixed handling Content-Type and Content-Transfer-Encoding parts
(bayes.c)

2005.05.18, SJ

make_md5_stuff() function replaced with make_rnd_string(), which copies the random
bytes directly into the buffer, avoiding the use of the MD5 hash algorithm.
I made this change because the make_md5_stuff() function started to produce only
one kind of queue ID: 00000000000000000000000000000000
(misc.c, misc.h, clapf.c, config.h)
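
A rough sketch of the idea, not the actual make_rnd_string(). It assumes
the random bytes come from /dev/urandom and are written out as hex
digits; the function name make_rnd_string_sketch() and its signature are
made up for this example.

```c
#include <stdio.h>

/* Fill buf with 2*nbytes hex digits built from random bytes.
   Returns 0 on success, -1 on error (buffer too small, read failure). */
int make_rnd_string_sketch(char *buf, size_t buflen, size_t nbytes)
{
   static const char hex[] = "0123456789abcdef";
   unsigned char rnd[64];
   size_t i;
   FILE *f;

   if(nbytes > sizeof(rnd) || buflen < 2*nbytes + 1) return -1;

   f = fopen("/dev/urandom", "rb");
   if(f == NULL) return -1;

   if(fread(rnd, 1, nbytes, f) != nbytes){
      fclose(f);
      return -1;
   }
   fclose(f);

   /* no hashing step: each random byte becomes two hex digits */
   for(i = 0; i < nbytes; i++){
      buf[2*i]     = hex[rnd[i] >> 4];
      buf[2*i + 1] = hex[rnd[i] & 0x0f];
   }
   buf[2*nbytes] = '\0';

   return 0;
}
```

Calling it with nbytes=16 and a 33 byte buffer gives a 32 character
queue ID.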

2005.04.21, SJ

modified the assign_spaminess() function to mark tokens as spammy if their spaminess
is above SPAM_LIMIT instead of SPAMINESS_OF_TRUE_SPAM
(bayes.c)

2005.04.19, SJ

modified the unescape function to skip strings that are not escaped
(misc.c)

2005.04.01, SJ

strcasestr() was renamed to str_case_str() since FreeBSD 5.3 has this function
(misc.c, misc.h)
(credits: Joakim Ryden)
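
For illustration, a case-insensitive substring search under the new name
could look like this. The signature is an assumption modelled on
strstr(); the code in misc.c may differ.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive strstr(): return a pointer to the first match
   of needle in haystack, or NULL if there is none. */
char *str_case_str(const char *haystack, const char *needle)
{
   const char *h, *n;

   if(*needle == '\0') return (char *)haystack;

   for(; *haystack; haystack++){
      h = haystack;
      n = needle;

      while(*h && *n &&
            tolower((unsigned char)*h) == tolower((unsigned char)*n)){
         h++;
         n++;
      }

      if(*n == '\0') return (char *)haystack;
   }

   return NULL;
}
```

Using a private name avoids a clash with the strcasestr() that the
system libc already provides.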

modified Makefile to make antivirus-only install easier
(README, Makefile)

modified CLAPFUSAGE definition to reflect that one may not want to use the antispam module
(config.h)

fixed a minor FreeBSD 5.x incompatibility
(misc.h)

2005.03.25, SJ

Do the following check only on text parts
(bayes.c, parsembox.c)

2005.03.23, SJ

Base64 decode tricky attachments
(misc.c, misc.h, bayes.c, parsembox.c)

2005.03.21, SJ

An extra check is performed to count the number of spam-only tokens.
If their number exceeds a certain limit, mark the message as spam.
(bayes.c, config.h)

2005.02.28, SJ

Added an option (-Q) to move infected files to quarantine
(clapf.c, config.h, README)

2005.01.31, SJ

Added a new function unescape() to decode %xx or =xx sequences in buffers
(misc.c, misc.h, parsembox.c, bayes.c)
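
A minimal sketch of such a decoder, assuming in-place decoding and that
anything not followed by two hex digits is copied unchanged. The names
unescape_sketch() and hexval() are made up for this example.

```c
#include <ctype.h>

/* Return the value of a hex digit, or -1 if c is not a hex digit. */
static int hexval(int c)
{
   if(c >= '0' && c <= '9') return c - '0';
   c = tolower(c);
   if(c >= 'a' && c <= 'f') return c - 'a' + 10;
   return -1;
}

/* Decode %xx and =xx sequences in place. */
void unescape_sketch(char *s)
{
   char *out = s;
   int hi = -1, lo = -1;

   while(*s){
      if((*s == '%' || *s == '=') &&
         (hi = hexval((unsigned char)s[1])) >= 0 &&
         (lo = hexval((unsigned char)s[2])) >= 0){
         *out++ = (char)(hi * 16 + lo);
         s += 3;
      }
      else *out++ = *s++;
   }

   *out = '\0';
}
```

The output is never longer than the input, so no extra buffer is needed.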

Added the pre_translate() function before translate()
(bayes.c)

2005.01.25, SJ

Added a new switch to clapf (-V) to check its version
(clapf.c, config.h)

2005.01.19, SJ

introduced the strcasestr() function
(misc.c, misc.h, bayes.c, parsembox.c)

2005.01.18, SJ

Modified email/mbox parsing with a naive check for the "Content-Type:" and better url handling
(parsembox.c, bayes.c, misc.c, misc.h)

Fixed overflow possibility in sorthash()
(hash.c)

2005.01.17, SJ

Increased MAX_HASH_STR_LEN to be able to hold URLs
(hash.h)

Fixed a bug that could cause collisions
(bayes.c)

Fixed line parsing
(parsembox.c, bayes.c)

Modified documentation
(README)

2005.01.11, SJ

Added a few syslog() calls to inject_mail() for debug
(misc.c)

Added a new SMTP 421 answer
(config.h)

Some initialisation moved after fork() to give a chance to close
SMTP communication in a correct manner
(clapf.c)

2005.01.06, SJ

Changed TMPDIR
(README, util/check_clamav.sh)

cdbq.c has been removed from the source tree
(Makefile)

Removed the findnode function. Unused.
(hash.c, hash.h)

Using inithash
(bayes.c)

Modified error messages for easier tracking
(config.h)

2005.01.05, SJ

Halt if the cl_statchkdir() function fails.
(clapf.c)

Send ALRM signal only if the database has been successfully updated
(util/upgrade_clamav.sh)

2004.12.22, SJ

Added a new variable for invalid characters in hex form
(bayes.c, config.h)

Added a socket option and a check around the cl_statinidir() function
(clapf.c)

2004.12.21, SJ

Modified the parsing of buffers containing a url. The invalid_junk_characters[]
array was moved to ijc.h
(misc.c, bayes.h, ijc.h)

Added a check to see whether the invalid characters appear in hex form (eg. =FF)
(bayes.c)

2004.12.17, SJ

Allow urls as tokens
(bayes.c, parsembox.c)

2004.12.13, SJ

Added bayes_close() to the clean_exit() function
(clapf.c)

2004.12.10, SJ

Handle automatic av database updates without a restart
(clapf.c, util/upgrade_clamav.sh)
(credits: Tomasz Kojm)

Handle cdb modifications without a restart
(bayes.c, misc.h)

2004.12.01, SJ

Added a custom array - called invalid_junk_characters - to list
some characters that are invalid in your codepage. If their number
exceeds a certain limit, the message is marked as spam with 0.9876
(bayes.h, misc.h, misc.c, bayes.c)
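
A sketch of the counting step. The array contents, the limit
MAX_JUNK_CHARS and both function names are placeholders chosen for this
example; set the array to whatever is invalid in your own codepage.

```c
#include <string.h>

/* Placeholder list, NOT the shipped one. */
static const unsigned char invalid_junk_characters[] = { 0xc3, 0xd5, 0xf0 };

#define MAX_JUNK_CHARS 2        /* hypothetical limit */
#define JUNK_SPAMICITY 0.9876   /* value quoted in the entry above */

/* Count the bytes of buf that appear in invalid_junk_characters[]. */
int count_junk_characters(const unsigned char *buf, size_t len)
{
   size_t i;
   int n = 0;

   for(i = 0; i < len; i++){
      if(memchr(invalid_junk_characters, buf[i],
                sizeof(invalid_junk_characters))) n++;
   }

   return n;
}

/* Return 1 if the message has more junk characters than the limit. */
int too_many_junk_characters(const unsigned char *buf, size_t len)
{
   return count_junk_characters(buf, len) > MAX_JUNK_CHARS ? 1 : 0;
}
```

When too_many_junk_characters() returns 1 the caller would set the
spamicity of the message to JUNK_SPAMICITY.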

2004.11.30, SJ

Give special words (occurring only in ham or only in spam) a
different probability: 0.0001 and 0.9999, respectively
(perl/createcdb.pl)

preserve the hostname part of urls (misc.c)

print single words being alone in a line (parsembox.c)

2004.11.26, SJ

Modified helper perl scripts (shrink.pl, createdbm.pl, createcdb.pl)
Modified the email parsing (misc.c, parsembox.c, bayes.c)
Added hash table support to avoid dups (hash.c, hash.h)

2004.11.25, SJ

Moved the bayesian decision implementation to a separate file (bayes.c)
Modified Makefile

2004.11.16, SJ

Avoid base64 encoded stuff where length is 77 and we have no space inside
(parsembox.c, clapf.c)

Clasp.c was merged into clapf.c

2004.11.12, SJ

Added clasp and other perl utilities to do bayesian filtering.

2004.10.14, SJ

Installation section of the documentation was modified regarding configuring postfix
to enable virtual aliasing (README)
(credits: Ralf Hildebrandt and Magnus Back)

main.cf:

content_filter = smtp:[127.0.0.1]:10025

master.cf:

127.0.0.1:10026     inet  n      -      n      -      10      smtpd -o content_filter=
                                        -o receive_override_options=no_address_mappings

2004.09.27, SJ

Remove the tmpfile if we run into a 421 error (clapf.c)

2004.09.23, SJ

SMTP_CMD_PERIOD command is checked by a binary safe function
which tolerates \x00 in the last2buf buffer (clapf.c, misc.c, misc.h)
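
To show why a binary safe search is needed: strstr() stops at the first
\x00, so a period command that follows a NUL byte would be missed. The
sketch below compares raw bytes instead. The function name
search_in_buf() and its signature are assumptions, not the actual
helper in misc.c.

```c
#include <stddef.h>
#include <string.h>

#define SMTP_CMD_PERIOD "\x0d\x0a\x2e\x0d\x0a"

/* Return the offset of the first occurrence of needle in buf,
   or -1 if it is not found. NUL bytes in buf are ordinary data. */
long search_in_buf(const char *buf, size_t buflen,
                   const char *needle, size_t needlelen)
{
   size_t i;

   if(needlelen == 0 || buflen < needlelen) return -1;

   for(i = 0; i + needlelen <= buflen; i++){
      if(memcmp(buf + i, needle, needlelen) == 0) return (long)i;
   }

   return -1;
}
```

The caller passes the number of bytes actually received, not
strlen(buf).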

2004.09.08, SJ

Modified clapf so that it no longer handles defer(red) queue management
itself. If it cannot inject the scanned email back, clapf returns a
'451 temporary error', or '550 rejected' if postfix responded with 550 (clapf.c).

2004.09.03, SJ

Added some checks to the SMTP communication that injects email back (misc.c)

2004.08.30, SJ

Cleaned up around the PERIOD command handling (clapf.c)

2004.08.24, SJ

Added a new buffer (last2buf) to hold the last two packets after the DATA command (clapf.c)

Send a 421 error message back if clapf runs into timeout (clapf.c)

2004.08.23, SJ

Modified Makefile and changed the order of the #include directives (config.h)

2004.08.02, SJ

Changed the SMTP_CMD_PERIOD definition (config.h)

#define SMTP_CMD_PERIOD "\x0d\x0a\x2e\x0d\x0a"
