|
SA Bugzilla – Full Text Bug Listing |
Summary: | Use unsigned database fields for int | ||
---|---|---|---|
Product: | Spamassassin | Reporter: | Michiel Hazelhof <michiel> |
Component: | Documentation | Assignee: | SpamAssassin Developer Mailing List <dev> |
Status: | NEW --- | ||
Severity: | enhancement | CC: | michiel |
Priority: | P5 | ||
Version: | unspecified | ||
Target Milestone: | Undefined | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Attachments: |
Test for the sql tables
New revision #1 bayes_mysql #1 userpref_mysql #1 awl_mysql #1 |
Description
Michiel Hazelhof
2011-09-21 19:43:39 UTC
Not sure wether `last_atime_delta` and `last_expire_reduce` could be unsigned, as I haven't got enough test data. Latin saves about 1.2% (database copy) but I can't verify whether the database is still intact or not, Correction, didn't take the correct measurement regarding the utf8 vs latin, latin costs about 1/3 compared to utf8. > My patch includes awl_mysql.sql, bayes_mysql.sql and userpref_mysql.sql Where is the patch? Should probably be attached here. > As a secondary question should UTF-8 or another? Which fields? A fix for a MySQL data type of bayes_token.token is already in trunk, it should be binary (just as it already is for PostgreSQL) as it contains octets which are not associated with any character set. See Bug 6625. As for other fields, like username or IP address or preference or a value of a preference, I don't know - these should probably also not be associated with any particular character set so a varbinary would probably suit them best, but ascii (or Latin-1) is just as good as they are not supposed to contain any characters outside of the printable ascii set. Created attachment 4965 [details]
Test for the sql tables
Sorry I forgot to add the sql file. Setting fields like bayes_token.id to unsigned , allows them to contain higher numbers and potentially avoid some problems. All fields which never contain negative integers should be unsigned (unsigned: 0 to 4,294,967,295 and signed: -2,147,438,648 to 2,147,438,647). I am not sure about the following fields regarding to unsigned: bayes_expire.runtime bayes_vars.last_atime_delta bayes_vars.last_expire_reduce As for UTF8 it reduces memory usage a bit (I guess, just my measurements) and is capable of holding nearly all possible characters (that should be used anyway). Also defining a default set should help with debugging purposes as well since everybody should be having the same set. Created attachment 4966 [details]
New revision #1
Created attachment 4967 [details]
bayes_mysql #1
Created attachment 4968 [details]
userpref_mysql #1
Created attachment 4969 [details]
awl_mysql #1
Todo: Determine whether the following fields can be negative: bayes_expire.runtime, bayes_vars.last_atime_delta, bayes_vars.last_expire_reduce Add Upgrade notice for table changes (UTF-8 + unsigned), also for Bug 6625 I guess (needed to empty my table for this one). |