SA Bugzilla – Bug 5299
BayesStore/PgSQL: nonstandard use of \\ in a string literal warning
Last modified: 2007-11-24 15:26:29 UTC
When using PostgreSQL (8.2.1) as the Bayes back end, the following warning is issued:

    $ sa-learn --ham 0.msg
    WARNING:  nonstandard use of \\ in a string literal
    LINE 1: select put_tokens(1, '{"\\\\000\\\\172\\\\121\\\\370\\\\065"...
                                 ^
    HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.

Translation: a string containing backslash escapes should use the extended escape string syntax E'...' instead of a regular '...' string, which under the SQL standard is not supposed to contain backslash escapes.

The attached patch fixes the problem (and for good measure also streamlines sub _quote_bytea).

Here is a reference to the PostgreSQL documentation:

    http://www.postgresql.org/docs/8.2/interactive/sql-syntax-lexical.html

and a relevant quotation:

    PostgreSQL also accepts "escape" string constants, which are an extension to the SQL standard. An escape string constant is specified by writing the letter E (upper or lower case) just before the opening single quote, e.g. E'foo'. [...]

    Caution: If the configuration parameter standard_conforming_strings is off, then PostgreSQL recognizes backslash escapes in both regular and escape string constants. This is for backward compatibility with the historical behavior, in which backslash escapes were always recognized. Although standard_conforming_strings currently defaults to off, the default will change to on in a future release for improved standards compliance. Applications are therefore encouraged to migrate away from using backslash escapes. If you need to use a backslash escape to represent a special character, write the constant with an E to be sure it will be handled the same way in future releases.
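For illustration only, here is a minimal Python sketch of quoting a bytea value as an escape string constant. It is not the attached Perl patch, whose contents are not shown in this report. The helper names are hypothetical, and escaping every octet in octal is a simplifying assumption that mirrors the octal escapes visible in the warning.

```python
def bytea_escape(data: bytes) -> str:
    """Render every octet in bytea escape format as backslash + 3 octal digits."""
    return "".join("\\%03o" % octet for octet in data)


def escape_string_constant(text: str) -> str:
    """Wrap text in E'...', doubling backslashes and single quotes."""
    body = text.replace("\\", "\\\\").replace("'", "''")
    return "E'" + body + "'"


def quote_bytea(data: bytes) -> str:
    """Hypothetical helper: a bytea value as an E'...' literal."""
    return escape_string_constant(bytea_escape(data))


if __name__ == "__main__":
    # Two octets: NUL and 'z' (octal 000 and 172).
    print(quote_bytea(b"\x00z"))
```

Because the E prefix states explicitly that backslash escapes are intended, a literal built this way should not trigger the "nonstandard use of \\" warning.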
Created attachment 3830: promised patch to avoid the Pg warning
What version of DBD::Pg are you using? Can you confirm if this works on earlier versions of PostgreSQL? You're using BYTEA as the type for the token column?
> What version of DBD::Pg are you using?

    perl -MDBD::Pg -le 'print DBD::Pg->VERSION'
    1.49
    perl -MDBI -le 'print DBI->VERSION'
    1.53

> You're using BYTEA as the type for the token column?

Yes. The schema is as per the SA documentation in the sql subdirectory.

    mail_bayes=# \d bayes_token
              Table "public.bayes_token"
       Column   |  Type   |         Modifiers
    ------------+---------+----------------------------
     id         | integer | not null default 0
     token      | bytea   | not null default ''::bytea
     spam_count | integer | not null default 0
     ham_count  | integer | not null default 0
     atime      | integer | not null default 0
    Indexes:
        "bayes_token_pkey" PRIMARY KEY, btree (id, token)
        "bayes_token_idx1" btree (token)

> Can you confirm if this works on earlier versions of PostgreSQL?

I'll see if I can put PostgreSQL 7.4.15 on some host somewhere. It probably suffices to check whether it accepts the E'...' syntax for strings. Perhaps a look into the docs would suffice.

Btw, these three levels of quoting ('\\\\\\\\') look kinda clumsy; I wonder if something cleaner could be devised.

Mark
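To make the three levels of quoting concrete, here is a hedged Python sketch that builds an array-of-bytea literal one layer at a time: bytea escape format, then array element quoting, then the SQL string constant. All function names are hypothetical, and the octal-only bytea escaping is an assumption chosen to match the literal shown in the warning. This is not the SpamAssassin code.

```python
def bytea_escape(data: bytes) -> str:
    """Level 1: bytea escape format, one backslash per octet."""
    return "".join("\\%03o" % octet for octet in data)


def array_element(text: str) -> str:
    """Level 2: array element in double quotes; escape backslash and quote."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def escape_string_constant(text: str) -> str:
    """Level 3: SQL escape string constant E'...'."""
    return "E'" + text.replace("\\", "\\\\").replace("'", "''") + "'"


def quote_token_array(tokens) -> str:
    """Hypothetical helper: a list of byte strings as an E'{...}' literal."""
    elements = ",".join(array_element(bytea_escape(t)) for t in tokens)
    return escape_string_constant("{" + elements + "}")


if __name__ == "__main__":
    # The five octets that appear in the warning's LINE 1 excerpt.
    token = bytes([0o000, 0o172, 0o121, 0o370, 0o065])
    print(quote_token_array([token]))
```

Each layer doubles the backslashes, so one backslash per octet at the bytea level becomes four in the final literal. By the same arithmetic, a backslash written as a doubled backslash at the bytea level would end up as eight.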
> Can you confirm if this works on earlier versions of PostgreSQL?

The E'...' syntax was introduced with PostgreSQL 8.1 (2005-11-08):

    http://www.postgresql.org/docs/8.1/interactive/release-8-1.html

    * Add E'' syntax so eventually ordinary strings can treat backslashes literally (Bruce)

    Currently PostgreSQL processes a backslash in a string literal as introducing a special escape sequence, e.g. \n or \010. While this allows easy entry of special values, it is nonstandard and makes porting of applications from other databases more difficult. For this reason, the PostgreSQL project is planning to remove the special meaning of backslashes in strings. [...]

    Note: While ordinary strings now support C-style backslash escapes, future versions will generate warnings for such usage and eventually treat backslashes as literal characters to be standard-conforming. The proper way to specify escape processing is to use the escape string syntax to indicate that escape processing is desired. Escape string syntax is specified by writing the letter E (upper or lower case) just before the string, e.g. E'\041'. This method will work in all future versions of PostgreSQL.
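Since E'' only exists from 8.1 onward, code that must also talk to older servers could pick the prefix from the server version. The following Python sketch shows one way to do that. The function names are hypothetical, and it assumes a plain dotted numeric version string such as the value reported by SHOW server_version (suffixes like "beta1" are not handled). How the string is fetched is left out.

```python
def supports_escape_syntax(server_version: str) -> bool:
    """True if the server version is 8.1 or later (assumes 'major.minor[.patch]')."""
    parts = server_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    return (major, minor) >= (8, 1)


def string_prefix(server_version: str) -> str:
    """Hypothetical helper: 'E' where escape string syntax exists, else ''."""
    return "E" if supports_escape_syntax(server_version) else ""


if __name__ == "__main__":
    for version in ("7.4.15", "8.1.0", "8.2.1"):
        print(version, repr(string_prefix(version)))
```

On a pre-8.1 server the empty prefix falls back to an ordinary '...' string, which those versions still process with backslash escapes.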
pushing out to 3.3.0, since I don't think it's a 3.2.0 blocker. shout (or change the milestone) if you disagree....
*** This bug has been marked as a duplicate of 5730 ***