Issue 53682 - sal / nfs locking bug ...
Summary: sal / nfs locking bug ...
Status: CLOSED DUPLICATE of issue 54586
Alias: None
Product: porting
Classification: Code
Component: code (show other issues)
Version: 680m123
Hardware: All All
: P3 Trivial with 2 votes (vote)
Target Milestone: OOo 2.0.1
Assignee: Stephan Bergmann
QA Contact: issues@porting
URL:
Keywords:
: 36634 (view as issue list)
Depends on:
Blocks:
 
Reported: 2005-08-23 17:35 UTC by mmeeks
Modified: 2005-09-14 12:48 UTC (History)
5 users (show)

See Also:
Issue Type: PATCH
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
patch (1.50 KB, patch)
2005-08-23 17:36 UTC, mmeeks
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this issue.
Description mmeeks 2005-08-23 17:35:55 UTC
With newer kernels we get:
   4678 fcntl64(31, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = -1
ENOLCK (No locks available)
instead of:
   23655 fcntl64(34, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = -1
EACCES (Permission denied)

our flock code is not defensive & doesn't cover this case. This patch at least
handles ENOLCK - you can argue about what we should do in this case - obviously
;-) - I accept the lock.
Comment 1 mmeeks 2005-08-23 17:36:42 UTC
Created attachment 28992 [details]
patch
Comment 2 mmeeks 2005-08-23 17:38:41 UTC
For the uninitiated this means that with OO.o on SUSE 10.0 - when you try to
open a file on an NFS mount you get a dialog saying:
     "General input output error while accessing: ..."
Comment 3 caolanm 2005-08-23 19:21:33 UTC
*** Issue 36634 has been marked as a duplicate of this issue. ***
Comment 4 Martin Hollmichel 2005-09-08 08:15:18 UTC
mh->sb: please review.
Comment 5 Stephan Bergmann 2005-09-12 12:31:11 UTC
@mmeeks:  Two issues:

1  Your patch, unlike the original code, calls fcntl even when
aflock.l_type==F_WRLCK.  I assume that is not intended.

2  According to SUSV3, fcntl/F_SETLK may return ENOLCK when "satisfying the lock
or unlock request would result in the number of locked regions in the system
exceeding a system-imposed limit", and what the comment in your patch considers
"locking not supported" is, again according to SUSV3, indicated by EINVAL
instead of ENOLCK.  Following SUSV3, I would thus assume that ENOLCK should
cause osl_openFile to fail (and that instead EINVAL might be handled in a
special way, ignoring the lock request).
Under what conditions do you experience ENOLCK on SuSE 10.0?  Always?
Comment 6 Stephan Bergmann 2005-09-12 12:39:02 UTC
erm, that should instead read

"1  Your patch, unlike the original code, calls fcntl even when
aflock.l_type!=F_WRLCK.  I assume that is not intended."
Comment 7 mmeeks 2005-09-12 15:27:46 UTC
So - 1. yes you assume right - I guess we need some more flow furkling there to
not call the fcntl in that case ' I guess an 'else if ( -1 == ...) instead of an
'if' in that chain would do it. Luckily the problem is not that important for
SL10.0 where it shipped ;-)

Wrt. 2 - I assume you're happy that this happens very commonly on more recent
kernels ? ;-) clearly we need to handle that error case.
Comment 8 Stephan Bergmann 2005-09-12 16:25:59 UTC
But you don't know *why* it commonly happens, right?  My dilemma is as follows:
 Without further information under which conditions exactly this happens, it
sounds to me like a bug in the kernel (or at least a deviation from Posix), and
the patch looks like a workaround for that bug.  However, on other platforms,
where ENOLCK indicates the (unlikely) event of resource exhaustion, instead of
"locking not supported" (which your comments imply that it does mean on recent
Linux kernels), the workaround makes the code worse (silently ignoring the
request to lock a file), not better.  Hence, my reluctance to include it as-is.
Comment 9 Stephan Bergmann 2005-09-14 10:36:23 UTC
I presume that the experienced problems of ENOLCK described in this issue are
due to mis-configured Linux machines where no NFS lock daemon is running.

*** This issue has been marked as a duplicate of 54586 ***
Comment 10 Stephan Bergmann 2005-09-14 10:40:34 UTC
.
Comment 11 mmeeks 2005-09-14 11:21:04 UTC
Stefan - you're thinking 'correctness' - I'm thinking "user experience". What
the user wants is to load their document :-)

Wrt. ENOLCK - AFAICS this is a -completely- valid thing to return if the file
can in fact not be locked. It's prolly a bug that this wasn't returned
previously - and we just got a different error code that the higher-level code
ignored in a more sensible way ;-) I'd put good money on that.

The scenario is simple to reproduce, install OpenSUSE 10.0 - or any recent Linux
distro (I believe RH have the same problem) - and try to open a file on an NFS
share [ by default it seems you can't reliably lock files via NFS ]. I'm sure
there are prolly other file systems that you can't lock with - disconnected /
synching / stacked pieces / etc. 

At the end of the day - "I can't open my document" is pretty painful ;-)
potentially of course, we need to teach the upper level code about this and then
get it to ignore that error in the same way [ or - to vandalise the user
experience we could throw up a dialog saying "this document cannot be locked -
can I annoy you again ? <sometimes> <always>" ;->  ... up to you ;-]

propagating the error up reliably & handling it in every case in this way would
of course be a more elegant solution I guess.
Comment 12 Stephan Bergmann 2005-09-14 12:48:08 UTC
@mmeeks:  The mention of recent Linux kernels was a red herring that got me
confused.  The problem is not related to the vintage of the kernel, but rather
to a broken system setup (which may well be a default system setup of some
distro or other).  Issue 54586 is all about addressing this problem in a
user-friendly way.

Again: the patch you supplied IMO does not address the problem appropriately, as
it silently alters critical behaviour (when ENOLCK is returned because some
system limit has been reached, and a file would be opened unlocked, but the user
would expect it to be locked, which can lead to data loss if two people edit the
same file at the same time).  If you have further concerns, please continue on
issue 54586.