Issue 57961

Summary: osl_getAbsoluteFileURL_impl_ calls lstat far too often
Product: udk Reporter: mikeleib <michael.leibowitz>
Component: code Assignee: AOO issues mailing list <issues>
Status: UNCONFIRMED --- QA Contact:
Severity: Trivial    
Priority: P3 CC: hennes.rohling, issues, kay.ramme, kpalagin
Version: 680m137   
Target Milestone: ---   
Hardware: All   
OS: Unix, all   
Issue Type: DEFECT Latest Confirmation in: ---
Developer Difficulty: ---
Attachments:
Description Flags
nuke calls to _osl_resolvepath and rely on higher level file operations generating exceptions on nonexistent paths none

Description mikeleib 2005-11-15 22:22:39 UTC
On startup, OpenOffice.org runs lstat on /home more than 400 times.
Comment 1 mikeleib 2005-11-15 22:25:06 UTC
Created attachment 31529 [details]
nuke calls to _osl_resolvepath and rely on higher level file operations generating exceptions on nonexistent paths
Comment 2 kay.ramme 2005-11-16 10:10:17 UTC
I agree that so many "lstat" calls during startup are irritating, especially
(as you already said) since they hit the exact same files over and over again.

If I understand correctly, most of these "lstat"s are triggered by calling
"realpath", which basically iterates over a file path and flattens it by
replacing symbolic links etc., returning a resolved path. As most client code
does not actually care whether a path is resolved, it is absolutely viable
to deal with unresolved paths.

Unfortunately, the OSL file API documentation for osl_getAbsoluteFileURL states
that the returned path has been resolved. So changing the implementation
according to your patch is "somewhat" incompatible. Therefore I suggest we
-A- either create a new function that does not resolve the path, and change all
code that does not rely on resolved paths to use this new function,
-B- or change the behavior of osl_getAbsoluteFileURL as suggested in your
patch, change the documentation accordingly, and provide a new function for
actually resolving file URLs (which may be needed and would otherwise not
be available).

-A- is obviously the safe but more work-intensive solution, whereas -B- is more
straightforward and simple, but also carries more risk of breaking things.
Depending on the time frame in which you would like to see your patches go in,
I suggest going with -A- for 2.0.x releases or with -B- for 2.x.
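
To make option -A- more concrete, here is a minimal sketch of what a
non-resolving function could do: it cleans up an absolute POSIX path purely by
string manipulation, so no lstat is ever issued. The name normalize_lexically
and its signature are hypothetical; this is not part of the OSL API and not
the code from the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the OSL API: normalizes an absolute
 * POSIX path by string manipulation only, so it never touches the file
 * system. Returns 0 on success, -1 if the input is not absolute or the
 * output buffer is too small. */
int normalize_lexically(const char *path, char *out, size_t outsz)
{
    assert(path != NULL && out != NULL);
    if (path[0] != '/' || outsz < 2)
        return -1;

    size_t len = 0;            /* length of the result built so far */
    const char *p = path;

    while (*p != '\0') {
        while (*p == '/')      /* skip one or more separators */
            p++;
        const char *seg = p;   /* start of the next component */
        while (*p != '\0' && *p != '/')
            p++;
        size_t n = (size_t)(p - seg);

        if (n == 0 || (n == 1 && seg[0] == '.'))
            continue;          /* empty or "." component: drop it */

        if (n == 2 && seg[0] == '.' && seg[1] == '.') {
            /* "..": remove the last component, never climbing above "/" */
            while (len > 0 && out[--len] != '/')
                ;
            continue;
        }

        if (len + 1 + n + 1 > outsz)
            return -1;         /* result would not fit */
        out[len++] = '/';
        memcpy(out + len, seg, n);
        len += n;
    }

    if (len == 0)
        out[len++] = '/';      /* everything collapsed to the root */
    out[len] = '\0';
    return 0;
}
```

Note that this is not equivalent to realpath: symbolic links are left in
place, and a ".." that follows a symbolic link is removed textually, which can
name a different file than the resolved path would. That difference is
exactly the incompatibility with the documented behavior mentioned above.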
Comment 3 kay.ramme 2005-11-22 09:21:16 UTC
Michael, any comments on my comments?
Comment 4 mikeleib 2005-11-29 19:26:36 UTC
I will pursue option A, but I have yet to schedule time to do so.  Is there a
preferred name for such a function?  osl_no_resolve_path doesn't have a nice
ring to it.

Additionally, I noticed that the comments tell me:  "In rtl/uri there is already
an URI parser etc. so this code should be consolidated."  Are there any plans to
do this?
Comment 5 kay.ramme 2005-12-12 10:28:34 UTC
What about "osl_getFileUrl"? At least it is simple :-)

The comment in "sal/osl/unx/file_url.cxx" is likely from Stephan Bergmann.
Please contact him for further info.

Michael, I am reassigning this issue to you, but I will stay on CC. If I
understood correctly, you volunteered to change the code.
Comment 6 kpalagin 2007-05-25 04:56:35 UTC
Any news?
Comment 7 mmeeks 2007-05-31 10:29:52 UTC
one data-point from our build where this was enabled initially:

# Don't stat /home a zillion times -- needs some love see iz comments
# noelp disabling this because it causes funny problem where 
# File::getAbsoluteFileURL returns incorrect 
# result ( which in turn causes java bootstraping problems for regcomp )

so, it causes at least some problems.
Comment 8 hennes.rohling 2007-05-31 14:50:16 UTC
IMHO the problem is not that osl_getAbsoluteFileURL resolves all path
components. The real question is: "Why is osl_getAbsoluteFileURL called so often
during startup?"

If solution -A- is selected, I bet most of the call sites that would have to be
switched to a newly created function, e.g. osl_getFileURLWithoutEllipse, do not
need to call such a function at all.

Often it's not the implementation of a single function that causes performance
problems, but the fact that an expensive function is used needlessly.

It's the same with osl_getDirectoryItem, which is quite expensive. I have often
removed code that used the OSL file API to list the contents of a directory
with osl_openDirectory and osl_getNextDirectoryItem, stored just the URL of
each file, and then called osl_getDirectoryItem for each URL.
Comment 9 Rob Weir 2013-07-30 02:20:26 UTC
Reset assignee on issues not touched by assignee in more than 2000 days.