SA Bugzilla – Bug 7676
mkupdates produces different files with the same filenames causing issues with caching proxies
Last modified: 2022-04-19 13:27:39 UTC
https://wiki.apache.org/spamassassin/InfraNotes2017 documents when mkupdates is run:

At 02:25 it "creates ${REVISION}.tar.gz ${REVISION}.tar.gz.sha1 and ${REVISION}.tar.gz.asc in /var/www/automc.spamassassin.org/updates for mirrors to pull"

At 08:30 it "creates ${REVISION}.tar.gz ${REVISION}.tar.gz.sha1 and ${REVISION}.tar.gz.asc in /var/www/automc.spamassassin.org/updates for mirrors to pull"

Several mirrors are not serving content directly; they're behind Cloudflare's caching proxy.

User A downloads "${REVISION}.tar.gz" and "${REVISION}.tar.gz.sha1" at 03:00. They have GPG checking disabled, probably because of this issue. They get the content of the 02:25 files and everything works.

User B downloads "${REVISION}.tar.gz", "${REVISION}.tar.gz.sha1" and "${REVISION}.tar.gz.asc" at 09:00. Cloudflare has cached "${REVISION}.tar.gz" and "${REVISION}.tar.gz.sha1", so for those two they get the content of the 02:25 files. No one has requested "${REVISION}.tar.gz.asc", so Cloudflare does not have it cached, and for that file they get the content of the 08:30 run. The signature therefore does not match the tarball.

The content is different between the two runs of mkupdates but has the same filenames because the revision hasn't changed. I don't know why that is necessary or desirable, but it does not interact well with caching proxies.

The workaround is to block access to Cloudflare, but then bug 7662 happens.
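To illustrate the kind of mismatch described above, here is a minimal sketch of a client-side check that a downloaded tarball still matches its .sha1 file. The function name verify_sha1 is hypothetical, and the sketch assumes the digest is the first whitespace-separated field on the first line of the .sha1 file; it is not taken from sa-update or mkupdates.

```shell
# Hypothetical helper, not part of SpamAssassin.
# Succeeds only if the SHA-1 of the tarball equals the digest in the .sha1 file.
# Assumption: the digest is the first field on the first line of the .sha1 file.
verify_sha1() (
  tarball="$1"
  sha1_file="$2"
  expected=$(awk 'NR == 1 { print $1 }' "$sha1_file")
  actual=$(sha1sum "$tarball" | awk '{ print $1 }')
  # An empty expected digest is treated as a failure, not a match.
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
)
```

A tarball from one run paired with a checksum from another run would make this return non-zero. The same reasoning applies to the .asc signature, which is the file that actually differed for User B.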
Actually this is not true.

2:25 mkupdate-with-scores creates and publishes the tar.gz
8:30 run_nightly does NOT publish it again if mkupdate already did it, as we can see from this code:

  # Integrate with masscheck ruleset updates to prevent duplicates
  RECENT=`find $HOME/tmp/mkupdate-with-scores -name \*.tar.gz -mmin -480`
  if [[ -z "$RECENT" ]]; then
    echo "Recent ruleset from mkupdate-with-scores (massheck) NOT found."
    echo "Proceeding with a ruleset publish..."
    ....
  else
    echo "Recent ruleset from mkupdate-with-scores (massheck) found:"
    ls -l $RECENT
    echo ""
  fi

The question is if the run_nightly code should ever do it (unnecessary duplicate code etc), but that's out of the scope of this bug.

Closing.
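For anyone who wants to try the recency test in isolation, here is a small sketch of the same find-based check wrapped in a function. The name recent_rulesets and the optional age argument are hypothetical; only the -name and -mmin usage mirrors the quoted code. Note that -mmin is supported by GNU and BSD find but is not in POSIX.

```shell
# Hypothetical helper mirroring the quoted check.
# Prints every *.tar.gz under the given directory that was modified
# less than max_age_min minutes ago (default 480, i.e. 8 hours).
recent_rulesets() (
  dir="$1"
  max_age_min="${2:-480}"
  find "$dir" -name '*.tar.gz' -mmin "-$max_age_min"
)
```

An empty result corresponds to the "NOT found" branch, where run_nightly goes ahead and publishes; any output corresponds to the branch where it skips publishing.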
It looks like this was changed after I first raised it:

------------------------------------------------------------------------
r1828938 | davej | 2018-04-11 22:44:28 +0100 (Wed, 11 Apr 2018) | 1 line

Add check to the rules promotion to prevent duplicate rulesets 6 hours apart with the same name.
------------------------------------------------------------------------
r1829141 | davej | 2018-04-14 15:07:01 +0100 (Sat, 14 Apr 2018) | 1 line

Had logic backwards for recent ruleset test from masscheck processing.
------------------------------------------------------------------------

The documentation on https://cwiki.apache.org/confluence/display/SPAMASSASSIN/InfraNotes2020 still claims that it gets regenerated twice.
I'll fix this on the wiki. I'll also change the status of this issue from worksforme to fixed in r1828938, since it actually was fixed there.
And back to worksforme, since this bug was opened after the fix was done, as I now see after looking at the dates.