Issue 117252

Summary: Data race due to non-atomic half-word accesses in sal/osl/unx/thread.c
Product: udk Reporter: olistraub <openoffice>
Component: codeAssignee: AOO issues mailing list <issues>
Status: CONFIRMED --- QA Contact:
Severity: Normal    
Priority: P3 CC: issues, stephan.bergmann.secondary
Version: DEV300m99   
Target Milestone: ---   
Hardware: PC   
OS: Linux, all   
Issue Type: DEFECT Latest Confirmation in: ---
Developer Difficulty: ---
Attachments:
Description Flags
Adds padding members none

Description olistraub 2011-03-07 09:18:20 UTC
thread.c defines Thread_Impl as:

typedef struct osl_thread_impl_st
{
        pthread_t                       m_hThread;
        sal_uInt16              m_Ident; /* @@@ see TODO @@@ */
        short               m_Flags;
    oslWorkerFunction   m_WorkerFunction;
        void*                           m_pData;
        pthread_mutex_t         m_Lock;
        pthread_cond_t          m_Cond;
} Thread_Impl;

m_Ident and m_Flags are half-words mapping onto the same 32-bit word. Therefore writes to one of these half-words could conflict with writes to the other half word.

Suggestion: add padding members to have each member in its own word.
Comment 1 olistraub 2011-03-07 09:19:02 UTC
Created attachment 76040 [details]
Adds padding members
Comment 2 kay.ramme 2011-03-07 09:56:00 UTC
Hi Oli, as far as I can see, all accesses to m_Ident respectively m_flags are either protected by acquiring m_Lock or be ensuring that only one reference exists.

Am I missing anything?
Comment 3 olistraub 2011-03-08 08:10:15 UTC
I'm currently evaluating Intel Inspector on our code using UNO, and it's telling me the following:

Data race between the following lines (and in this execution order):

osl_joinWithThread: pImpl->m_Flags &= ~THREADIMPL_FLAGS_ATTACHED;

and 

osl_getThreadIdentifier: Ident = pImpl->m_Ident;

Sadly the tool didn't report complete stack traces, so I don't know who's calling joinWithThread and getThreadIdentifier.

So I assume the following:
thread B wants to join thread A and write the m_Flags member (half-word write)
thread A is still executing and reads its thread id -> data race

However, the thread id should still be valid, even without adding padding. The half-word write can't temporarily destroy the other half-word, can it?
Comment 4 Stephan Bergmann 2011-03-08 08:19:52 UTC
What is unclean in the code (and apparently triggers the complaint) is that osl_getThreadIdentifier accesses m_Ident with m_Lock not locked (while osl_thread_start_Impl accesses it with m_Lock locked).

One could argue that the complaint is false, though:  For one, proper visibility by all threads of m_Ident written by osl_thread_start_Impl is guaranteed at read of m_Ident in osl_getThreadIdentifier, as there is synchronization on m_Cond/m_Lock in between.  For another, modifications of m_Flags should not clobber m_Ident (it would be a problem if m_Ident written to instead of only read from in osl_getThreadIdentifier).

Anyway, changing osl_getThreadIdentifier to lock m_Lock before accessing m_Ident would make the code cleaner.
Comment 5 kay.ramme 2011-03-08 08:42:48 UTC
Writes to m_Flags may not change m_Ident, even not temporarily because out-of-order-execution or instruction reordering.

Locking the structure likely fixes the complaint, another fix is adding some padding as suggested by Oli.

->Oli: Could you verify that acquiring the lock before accessing m_Ident fixes the complaint?

->SB: As you said, the systematical right fix is acquiring the lock. Though for performance reasons I suggest to go with the padding approach. Comments?
Comment 6 Stephan Bergmann 2011-03-08 08:59:42 UTC
"Writes to m_Flags may not change m_Ident, even not temporarily [due to] out-of-order-execution or instruction reordering."  At least theoretically this is wrong.  There is nothing in C++ nor Pthreads that would prohibit this.  Practically though, this is more than unlikely (hence I used "should not").

"[The] systematical right fix is acquiring the lock.  Though for performance reasons I suggest to go with the padding approach."  I would opt for locking.  For one, stick to "first make it right, then make it fast" (what makes you assume that locking here will degrade overall performance?).  For another, padding would not make it that much better: as we argued, for practical purposes padding should not be needed; but for theoretical purposes, padding 16 bits between m_Ident and m_Flags would not change anything; the only thing that would be improved is the interaction with tools like Intel Inspector (which is improved by locking just as well).
Comment 7 kay.ramme 2011-03-08 09:09:04 UTC
Padding does address the complaints, practically solving this issue without the need to worry anymore, changing performance characteristics may lead to rework ;-)

Anyway, please take over and change according to your preferences.
Comment 8 olistraub 2011-03-08 10:49:33 UTC
I checked the padding approach, this seems to satisfy the Inspector.

I will not try the locking approach, since this will work anyway to get rid of the error.
Comment 9 Oliver-Rainer Wittmann 2012-06-13 12:23:41 UTC
getting rid of value "enhancement" for field "severity".
For enhancement the field "issue type" shall be used.