We store all Bayesian and whitelist data for SpamAssassin in a PostgreSQL database. Keeping it in a shared database allows every member of our email cluster to access the same data.

Bayesian processing works by noticing lots of little unique snippets (tokens) and storing them for future reference. If many tokens were found in messages flagged as spam, future messages containing those tokens are more likely to be spam as well. The same weighting applies in reverse for non-spam messages. Over time, tracking these tokens improves the quality of your spam detection.

We have been running a SQL-backed Bayesian instance for several months, with remarkable results.
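
To make the token idea concrete, here is a minimal sketch of token-based scoring in Python. It is a simplified naive Bayes illustration, not SpamAssassin's actual algorithm or schema: the `TokenStore` class, its method names, and the in-memory counters (standing in for the shared database table) are all hypothetical.

```python
import math
import re
from collections import Counter


class TokenStore:
    """Hypothetical in-memory stand-in for a shared token table."""

    def __init__(self):
        self.spam_counts = Counter()  # token -> number of spam messages containing it
        self.ham_counts = Counter()   # token -> number of non-spam messages containing it
        self.spam_messages = 0
        self.ham_messages = 0

    @staticmethod
    def tokenize(text):
        # Deliberately crude tokenizer: lowercase words, each counted once per message.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def learn(self, text, is_spam):
        """Record the tokens of a message that was flagged as spam or non-spam."""
        tokens = self.tokenize(text)
        if is_spam:
            self.spam_counts.update(tokens)
            self.spam_messages += 1
        else:
            self.ham_counts.update(tokens)
            self.ham_messages += 1

    def spam_probability(self, text):
        """Combine per-token evidence in log space, with add-one smoothing."""
        log_odds = math.log((self.spam_messages + 1) / (self.ham_messages + 1))
        for token in self.tokenize(text):
            p_spam = (self.spam_counts[token] + 1) / (self.spam_messages + 2)
            p_ham = (self.ham_counts[token] + 1) / (self.ham_messages + 2)
            log_odds += math.log(p_spam / p_ham)
        # Numerically stable logistic function.
        if log_odds >= 0:
            return 1.0 / (1.0 + math.exp(-log_odds))
        e = math.exp(log_odds)
        return e / (1.0 + e)


store = TokenStore()
store.learn("cheap pills buy now", is_spam=True)
store.learn("buy cheap watches now", is_spam=True)
store.learn("meeting notes attached", is_spam=False)
store.learn("lunch meeting tomorrow", is_spam=False)

print(store.spam_probability("cheap pills offer"))
print(store.spam_probability("meeting tomorrow"))
```

Tokens seen mostly in spam push the score toward 1, tokens seen mostly in non-spam push it toward 0, and unseen tokens leave it unchanged. Because the counts live in a single store, every machine that reads from it scores messages against the same learned data, which is the point of keeping them in a shared database.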