SA Bugzilla – Bug 6171
Drugs rules don't detect runtogether words
Last modified: 2009-08-04 19:12:22 UTC
Created attachment 4502
Anonymised sample from 090730 showing full body not hitting drug rules

Rules in 20_drugs.cf try to detect both the straightforward names of particular drugs and obfuscated forms of those names, in each case bounded with \b. An obfuscation seen recently runs the drug name straight into the next word, as in "CialisSuper" and "ViagraAs low as $1.85". Neither rule hits these, because there is no \b boundary after the name, so the message scores nothing on content. Simply dropping the boundary is not safe: care is particularly needed with 'cialis', which occurs within words such as 'specialism'.
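The boundary problem can be reproduced outside SpamAssassin. The following is a minimal sketch in Python rather than the rule language; the two patterns are simplified stand-ins and are not the actual expressions in 20_drugs.cf, and the function names are made up for illustration.

```python
import re

# Simplified stand-ins for the rules discussed above (assumption: the real
# 20_drugs.cf patterns are more elaborate than these).
BOUNDED = re.compile(r"\b(?:cialis|viagra)\b", re.IGNORECASE)
UNBOUNDED = re.compile(r"cialis|viagra", re.IGNORECASE)


def bounded_hit(text: str) -> bool:
    """True if a drug name appears with a word boundary on both sides."""
    return BOUNDED.search(text) is not None


def unbounded_hit(text: str) -> bool:
    """True if a drug name appears anywhere, even inside another word."""
    return UNBOUNDED.search(text) is not None


# Print how each sample fares under both patterns.
for sample in ["Buy Cialis now", "CialisSuper", "ViagraAs low as $1.85", "specialism"]:
    print(f"{sample!r}: bounded={bounded_hit(sample)} unbounded={unbounded_hit(sample)}")
```

The bounded pattern misses the runtogether samples, while the unbounded one also fires on 'specialism', which is why neither extreme is a usable fix on its own.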
Created attachment 4503
Provisional patch to detect runtogether erectile drug names; fix rule description

The patch detects a change of case immediately after 'levitra' and 'cialis', so 'Ci@lisAs', for example, scores as an obfuscated drug name. For 'viagra' a different approach is used, so that 'viagralike', for example, scores as obfuscated even though it is in consistent case. The patch also simplifies the definition of a word boundary to (\b|_).

This patch is more of an example. Similar changes may be needed for other drug word endings, and a similar check could be applied at the first boundary to catch, e.g., 'BuyCIALIS!'. The patch also provides a more accurate description of SUBJECT_FUZZY_MEDS, which the sample *did* hit, although the word in the title was not actually obfuscated.
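As a rough illustration of the two approaches described in this comment, here is a sketch in Python. It is not the content of attachment 4503: the patterns, the small set of obfuscation characters, and the name is_runtogether are all assumptions chosen to keep the example short.

```python
import re

# Leading boundary as described in the comment: a word boundary or an underscore.
START = r"(?:\b|_)"

# Hypothetical case-change check for 'cialis' and 'levitra': the name must be
# followed directly by a letter of the opposite case. Only a few obfuscation
# characters (1, !, @) are allowed for here, purely for illustration.
CASE_CHANGE = re.compile(
    START
    + r"(?:[Cc][i1!][a@]l[i1!]s(?=[A-Z])"
    + r"|C[I1!][A@]L[I1!]S(?=[a-z])"
    + r"|[Ll]evitra(?=[A-Z])"
    + r"|LEVITRA(?=[a-z]))"
)

# Hypothetical check for 'viagra': any letter directly after the name counts,
# regardless of case.
VIAGRA_RUN = re.compile(START + r"viagra(?=[a-z])", re.IGNORECASE)


def is_runtogether(text: str) -> bool:
    """True if a drug name runs straight into a following word."""
    return bool(CASE_CHANGE.search(text) or VIAGRA_RUN.search(text))
```

Under these assumptions 'CialisSuper', 'Ci@lisAs' and 'viagralike' are flagged, while 'specialism' is not. 'BuyCIALIS!' is also not flagged, since this sketch, like the patch as described, leaves the leading boundary check unchanged.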