Bug 37356

Summary:	Tomcat does not invalidate sessions after session-timeout period has passed.
Product:	Tomcat 5	Reporter:	Eddie Wynn <eddiew>
Component:	Catalina	Assignee:	Tomcat Developers Mailing List <dev>
Status:	RESOLVED FIXED
Severity:	normal	CC:	amasson, kldavis, suzuki.yuichiro
Priority:	P2
Version:	Nightly Build
Target Milestone:	---
Hardware:	Other
OS:	other
Attachments:	Stress Test Framework A JSP That can be used to see the accessCount of a session

Description Eddie Wynn 2005-11-04 15:38:08 UTC

I am encountering a problem with Tomcat 5.0.28. I have an application which 
times users sessions out after 5 minutes of inactivity. 

I have written some extensions to the manager app that allow me to list 
sessions for a given context and also to force an invalidation of sessions 
that have been idle for over a specified period of time.

Using these tools I can see that I have a lot of sessions with an idle_time 
far in excess of 5 minutes - values of over 24 hours are not uncommon.

Using my manager extensions I am then able to force these sessions to 
invalidate, at which point my HttpSessionBindingListener class (valueUnbound 
method) is invoked and logs the user connected to that session out, before 
invalidating the session and it being removed from the list of sessions. 

The fact that the user log out can take place without any problems on 
these 'stale' sessions seems to indicate that the problem is that the session 
has not been invalidated - as in Tomcat has made no attempt to invalidate the 
session after the idle expiry time has passed, there are no error messages 
posted in any logs and try as I might I am unable to reproduce this behaviour 
on 2 other test systems even when completing the same actions as the 
production system users do that cause this problem.

Clearly I have checked that the configuration is correct - indeed some 
sessions will be timed out as expected so there cannot be a configuration 
problem.

Comment 1 Yoav Shapira 2005-11-28 21:52:01 UTC

Is something holding a reference to those sessions?

And does this happen with 5.0.30 or 5.5.12?

Comment 2 Eddie Wynn 2005-12-14 13:30:56 UTC

(In reply to comment #1)
> Is something holding a reference to those sessions?
> And does this happen with 5.0.30 or 5.5.12?

Thanks for the interest.

This happens with 5.0.28 in production - I cannot reproduce it anywhere else 
(using 5.0.28) and I currently cannot migrate the production system to 5.0.30 
either. So, in short I cannot say if it happens on anything other than 5.0.28 -
 if someone could say that this behaviour was different in 5.0.30 then I would 
be able to move forwards but without something more definitive I can't.

I am not sure what you are getting at with the reference holding - if you mean 
is something keeping the session alive i.e. polling or such like then the 
answer is no. If you mean is a Java object keeping a reference to it - then 
aside from what Tomcat does internally (where I expect it must do...) then the 
answer is also no.

Hope this helps. Eddie

Comment 3 Tobias Meyer 2006-01-12 16:58:10 UTC

We have a similar probelm - but only when tomcat (5.0.28 as well) is connected
through ajp13.

As a difference to the original poster we do keep a reference to the Session
Objects from within the servlet. This list is updated when we see a new session
or when the valueUnbound() of an object (which is placed in every session) is
called.

This problem occurs on Windows (with IIS as frontend) as well as on Linux with
Apache-httpd, but not if Tomcat is beeing run stand-alone (Windows or Linux).

Comment 4 Torr Liu 2006-01-24 12:59:34 UTC

I have found one problem about the accessCount. 

In some times, it will not decrease to 0 after one request. So the session 
manger will never invalidate it.

The Tomcat version is 5.0.19.

Comment 5 Eddie Wynn 2006-01-26 17:16:01 UTC

(In reply to comment #4)
> I have found one problem about the accessCount. 
> In some times, it will not decrease to 0 after one request. So the session 
> manger will never invalidate it.
> The Tomcat version is 5.0.19.

This sounds like you might be onto something - how did you find that? Can you 
reproduce this at will?

Comment 6 Torr Liu 2006-02-07 06:10:38 UTC

It's very difficult to recure this problem. I have write a very complex web 
application. and mostly it will work fine. but some time, it will occur this 
problem.

This problem mostly ocurred when two browser access the same page.

And the web application will do following things:
1. APP A will include some pages in other applications.
2. There is a request wrapper. when include the page in other applications and 
try to get the session object using request.getSession(), it will get the 
session which is generated in App A.
3. The session objects generated in A, will be managed by our system. (of 
cuase , there is one listener to remove the session when it timeout)
Sorry, I can only provide these info for you.

Comment 7 Tobias Meyer 2006-02-07 09:26:02 UTC

(In reply to comment #6)
> 
> This problem mostly ocurred when two browser access the same page.
>
Do you mean that wo browsers use the same session to access a page, or just the
( quite usual ) case that two people use the web application?
 
> And the web application will do following things:
> 1. APP A will include some pages in other applications.
> 2. There is a request wrapper. when include the page in other applications and 
> try to get the session object using request.getSession(), it will get the 
> session which is generated in App A.
That means in the included application B and C you get the session from
application A?

Application != context for you?

> 3. The session objects generated in A, will be managed by our system. (of 
> cuase , there is one listener to remove the session when it timeout)
> Sorry, I can only provide these info for you.

So it seems that the common ground for all of us here is, that we have some
objects in the session that implement the HttpSessionBindingListener or
HttpSessionActivationListener.

Maybe the cause lies within these?

Comment 8 Torr Liu 2006-02-07 15:40:53 UTC

Do you mean that wo browsers use the same session to access a page, or just the
( quite usual ) case that two people use the web application?

A: not the same session. Different IE browser will generate differnt session 
ID.

That means in the included application B and C you get the session from
application A?

Application != context for you?

A: yes. 


So it seems that the common ground for all of us here is, that we have some
objects in the session that implement the HttpSessionBindingListener or
HttpSessionActivationListener.

A: I did not put any object implement the HttpSessionBindingLisener or 
HttpSessionActivationListener in the session.


I think with different thread ( when include the page in other application), 
when it will operate the session object, it will not synchronize the session 
object (or its filed accessCount).

Comment 9 Torr Liu 2006-02-07 15:52:04 UTC

Please check this defect: 28846

I think it should be the same problem. :-)

Comment 10 Tobias Meyer 2006-02-07 16:03:55 UTC

(In reply to comment #8)
> Do you mean that wo browsers use the same session to access a page, or just the
> ( quite usual ) case that two people use the web application?
> 
> A: not the same session. Different IE browser will generate differnt session 
> ID.
> 
> That means in the included application B and C you get the session from
> application A?
> 
> Application != context for you?
> 
> A: yes.

Ok, so you can replicate the error if:
1. you have two or more simultaneous users accessing a page in your web application.
2. you have more than one application (servlet / jsp) within the same context,
so the session stays the same through all requests.
3. one of your servlets includes the output from other servlets/jsps

Did I get that right?
 
> 
> 
> So it seems that the common ground for all of us here is, that we have some
> objects in the session that implement the HttpSessionBindingListener or
> HttpSessionActivationListener.
> 
> A: I did not put any object implement the HttpSessionBindingLisener or 
> HttpSessionActivationListener in the session.
> 
Ah - I had thought so, because you mentioned you remove the objects from the
session when it expires. The only way known to me to do this is to implement one
of those Interfaces (can't remember which one currently)

> 
> I think with different thread ( when include the page in other application), 
> when it will operate the session object, it will not synchronize the session 
> object (or its filed accessCount).
>
That seems reasonable. If the access count is not properly synchronized it will
indeed become screwed eventually.
Yet it is strange that it will only occur in our application if we use the ajp13
connector with SSL requests. 
SSL on tomcat standalone - everyting OK.
Plain http through Apache httpd (Linux) with ajp13 - everything OK
SSL through Apache httpd (Linux) with ajp13 - No Session timeouts
SSL through IIS with ajp13 - No Session timeouts
Basically the same application on all machines...

I'll try and stress-test my testserver a little and see if I can reproduce it
with enough simultaneous requests

Comment 11 Tobias Meyer 2006-02-07 16:15:09 UTC

(In reply to comment #9)
> Please check this defect: 28846
> 
> I think it should be the same problem. :-)

But that bug applies to the tapestry project.
Is the engine mentioned in that bug the tapestry engine or the tomcat engine?

Comment 12 Lothsahn 2006-02-07 18:46:11 UTC

(In reply to comment #11)
> (In reply to comment #9)
> > Please check this defect: 28846
> > 
> > I think it should be the same problem. :-)
> 
> But that bug applies to the tapestry project.
> Is the engine mentioned in that bug the tapestry engine or the tomcat engine?


We're seeing this same issue, as well.  We're using tomcat 5.5.4, and we appear
to only see it in environments under heavy load when using the ajp13 connector.
 We have another customer redirecting from IIS and they're not experiencing this
issue.

I spent a couple hours looking at the code and I didn't see anything weird. 
What I can tell you is that for the sessions this occurs on, their inactivity
time IS properly set, but they still don't get invalidated.  When I call
invalidate on those sessions, they ARE invalidated.

We're not using the tapestry project at all--just Tomcat and Apache (or Tomcat
and IIS).

Those sessions which fail to invalidate stay around forever--or seemingly so. 
I've seen inactivity times in the 2500+ minutes range.

Comment 13 Lothsahn 2006-02-07 18:49:01 UTC

(In reply to comment #10)
> (In reply to comment #8)
> > Do you mean that wo browsers use the same session to access a page, or just the
> > ( quite usual ) case that two people use the web application?
> > 
> > A: not the same session. Different IE browser will generate differnt session 
> > ID.
> > 
> > That means in the included application B and C you get the session from
> > application A?
> > 
> > Application != context for you?
> > 
> > A: yes.
> 
> Ok, so you can replicate the error if:
> 1. you have two or more simultaneous users accessing a page in your web
application.
> 2. you have more than one application (servlet / jsp) within the same context,
> so the session stays the same through all requests.
> 3. one of your servlets includes the output from other servlets/jsps
> 
> Did I get that right?
>  
> > 
> > 
> > So it seems that the common ground for all of us here is, that we have some
> > objects in the session that implement the HttpSessionBindingListener or
> > HttpSessionActivationListener.
> > 
> > A: I did not put any object implement the HttpSessionBindingLisener or 
> > HttpSessionActivationListener in the session.
> > 
> Ah - I had thought so, because you mentioned you remove the objects from the
> session when it expires. The only way known to me to do this is to implement one
> of those Interfaces (can't remember which one currently)
> 
> > 
> > I think with different thread ( when include the page in other application), 
> > when it will operate the session object, it will not synchronize the session 
> > object (or its filed accessCount).
> >
> That seems reasonable. If the access count is not properly synchronized it will
> indeed become screwed eventually.
> Yet it is strange that it will only occur in our application if we use the ajp13
> connector with SSL requests. 
> SSL on tomcat standalone - everyting OK.
> Plain http through Apache httpd (Linux) with ajp13 - everything OK
> SSL through Apache httpd (Linux) with ajp13 - No Session timeouts
> SSL through IIS with ajp13 - No Session timeouts
> Basically the same application on all machines...
> 
> I'll try and stress-test my testserver a little and see if I can reproduce it
> with enough simultaneous requests

We're seeing this issue on our NON-SSL Apache ajp13 system...  so I don't think
SSL is a requirement for this issue to occur.

Comment 14 Eddie Wynn 2006-02-08 10:13:28 UTC

> We're seeing this issue on our NON-SSL Apache ajp13 system...  so I don't 
think
> SSL is a requirement for this issue to occur.

All sessions to my instance of Tomcat are HTTP - they all come from a Proxy 
(Apache) which translates HTTPS to HTTP - so I produce this bug with no SSL 
and no ajp13.

Comment 15 Tobias Meyer 2006-02-08 10:21:07 UTC

> We're seeing this same issue, as well.  We're using tomcat 5.5.4, and we appear
> to only see it in environments under heavy load when using the ajp13 connector.
>  We have another customer redirecting from IIS and they're not experiencing this
> issue.
> 
> I spent a couple hours looking at the code and I didn't see anything weird. 
> What I can tell you is that for the sessions this occurs on, their inactivity
> time IS properly set, but they still don't get invalidated.  When I call
> invalidate on those sessions, they ARE invalidated.
> [...] 
> Those sessions which fail to invalidate stay around forever--or seemingly so. 
> I've seen inactivity times in the 2500+ minutes range.
> 
This all fits into the threads issue theory.
Tomcat maintians a reference count for each session which is increased at the
beginning of a request and decreased at he end of the request.

If - for whatever reason - this reference count does not reach zero after all
requests have finished (e.g. because one thread was increasing while another was
decreasing, or an exception occurred which is not propably handled or... )
tomcat will not invalidate the session because it thinks there is still some
request currently running (which basically is a good thing. I do generally NOT
want my session to expire while a request is still running: e.g. a large download )

But this is still only a theory - I need to find a few free minutes to implement
a test case...

Comment 16 Lothsahn 2006-02-08 18:11:42 UTC

(In reply to comment #15)
> > We're seeing this same issue, as well.  We're using tomcat 5.5.4, and we appear
> > to only see it in environments under heavy load when using the ajp13 connector.
> >  We have another customer redirecting from IIS and they're not experiencing this
> > issue.
> > 
> > I spent a couple hours looking at the code and I didn't see anything weird. 
> > What I can tell you is that for the sessions this occurs on, their inactivity
> > time IS properly set, but they still don't get invalidated.  When I call
> > invalidate on those sessions, they ARE invalidated.
> > [...] 
> > Those sessions which fail to invalidate stay around forever--or seemingly so. 
> > I've seen inactivity times in the 2500+ minutes range.
> > 
> This all fits into the threads issue theory.
> Tomcat maintians a reference count for each session which is increased at the
> beginning of a request and decreased at he end of the request.
> 
> If - for whatever reason - this reference count does not reach zero after all
> requests have finished (e.g. because one thread was increasing while another was
> decreasing, or an exception occurred which is not propably handled or... )
> tomcat will not invalidate the session because it thinks there is still some
> request currently running (which basically is a good thing. I do generally NOT
> want my session to expire while a request is still running: e.g. a large
download )
> 
> But this is still only a theory - I need to find a few free minutes to implement
> a test case...

I don't mean to throw a wrench in the works, but if that's true, wouldn't that
mean calling invalidate on the session later would NOT invalidate the session,
or does invalidate ignore the session access count?

Also, I just wanted to confirm that there are no large files to download in our
system, so all of these users definately have an inactive connection--we're not
seeing normal system behavior waiting on files.

Comment 17 Tobias Meyer 2006-02-09 09:57:26 UTC

(In reply to comment #16)
> I don't mean to throw a wrench in the works, but if that's true, wouldn't that
> mean calling invalidate on the session later would NOT invalidate the session,
> or does invalidate ignore the session access count?
> 
> Also, I just wanted to confirm that there are no large files to download in our
> system, so all of these users definately have an inactive connection--we're not
> seeing normal system behavior waiting on files.
> 
Most of all this happens in org.apache.catalina.session.StandardSession and
org.apache.catalina.session.StandardManager.

Calling invalidate() on the session more or less directly maps through to
expire(), which will simply set the accessCount to 0 and remove the session from
the sessionManager.

After spending the whole of yesterday afternoon debugging I can also confirm,
that it is indeed the accesssCount running out of sync. Alas, I still can not
reproduce the problem. ( we found the wrong accessCount by running the failing
production machine in debug mode and connecting to it with jdb)

On my local (Windows XP) development machine with Sun-jdk 1.5.0_06, tomcat
5.0.28 and apache 2.0.55 connected through ajp13 (same setup as production,
except for the OS) this simply does not occur.

I also have second thoughts about the race condition theory, because one would
expect, that such a thing would happen only once in a while - but on the servers
we are seeing the issue on, it happens all the time (if not every time)...

Comment 18 Lothsahn 2006-03-01 15:51:25 UTC

Has there been any update on this bug?

Comment 19 Arnaud Masson 2006-03-09 18:22:47 UTC

I have the same problem, some sessions never expire (tomcat 5.5.12, jdk 1.5.0).
It seems that the session accessCount is not correctly decremented.

In my application, a single web browser can send several asynchronous
XmlHttpRequests at the same time, so there are concurrent accesses on the server
sesssion.

I have a look at tomcat source code and it seems that the session validy
management is not always 'synchronized', so I agree with the "race condition
theory" ...

Comment 20 Tobias Meyer 2006-03-10 08:03:34 UTC

(In reply to comment #19)
> 
> I have a look at tomcat source code and it seems that the session validy
> management is not always 'synchronized', so I agree with the "race condition
> theory" ...
> 
Yes, that were my thoughts as well, but the ++ and -- operations (as in
sessionCount++) are atomic because sessionCount is an int. They should not need
to be synchronized according to the virtual machine specifications.

On another note: We also recently moved one perfectly well working installation
to a new server, and suddntly we encounter this issue there as well.
Previous installation:
Single-Intel-CPU 
SuSE 9.0 Linux with kernel 2.6.11
Tomcat 5.0.29 - standalone
jdk1.5.0_02

current installation:
Dual AMD-x64 CPUs 
SuSE 9.3 (mostly out-of the box) with kernel 2.6.14
Tomcat 5.0.30
Apache httpd
mod_jk
jdk1.5.0_06 (x64)

So the trigger has to be either the 64bit CPU / Java or Tomcat and mod_jk, or
something in my configuration (the application is the same .war file)

I do not have local access to any x64 machine, so I cannot debug that here,
maybe someone else can jump in?

Comment 21 Arnaud Masson 2006-03-10 08:40:13 UTC

(In reply to comment #20)

> Yes, that were my thoughts as well, but the ++ and -- operations (as in
> sessionCount++) are atomic because sessionCount is an int. They should not need
> to be synchronized according to the virtual machine specifications.

Generally this is not enough to have thread-safe code. For instance this kind of
code is not thread-safe:
  this.x--;
  if (this.x == 0) ...
In this case AtomicInteger.decrementAndGet can be used.

> 
> So the trigger has to be either the 64bit CPU / Java or Tomcat and mod_jk, or
> something in my configuration (the application is the same .war file)

Threading problems (ie lack of synchronization) are more visible on multi-CPU or
multi-core configurations.

Comment 22 Tobias Meyer 2006-03-10 08:54:36 UTC

> 
> Generally this is not enough to have thread-safe code. For instance this kind of
> code is not thread-safe:
>   this.x--;
>   if (this.x == 0) ...
> In this case AtomicInteger.decrementAndGet can be used.

Of course, but as far as I could tell, in the tomcat session-management there is
a sessionCount-- for every sessionCount++, so in the end there should be a total
sessionCount of 0. Maybe one of the checks for sessionCount==0 does not always
immediately give the 'correct' result - but in the long run it has always to
return to zero.


> 
> > 
> > So the trigger has to be either the 64bit CPU / Java or Tomcat and mod_jk, or
> > something in my configuration (the application is the same .war file)
> 
> Threading problems (ie lack of synchronization) are more visible on multi-CPU or
> multi-core configurations.
> 
I'll have to check if all machines we see this on are multi-core...

Comment 23 Remy Maucherat 2006-03-10 11:08:43 UTC

(In reply to comment #19)
> I have the same problem, some sessions never expire (tomcat 5.5.12, jdk 1.5.0).
> It seems that the session accessCount is not correctly decremented.
> 
> In my application, a single web browser can send several asynchronous
> XmlHttpRequests at the same time, so there are concurrent accesses on the server
> sesssion.
> 
> I have a look at tomcat source code and it seems that the session validy
> management is not always 'synchronized', so I agree with the "race condition
> theory" ...
> 

It's possible but rare. -1 for adding syncs, though (feel free to use a patched
Tomcat), as accessCount is a gimmick to support stupid usage scenarios
(basically, people who were using really long running requests with really short
expiration times). Either a smarter way of implementing this could be used, or
the session could simply be expired if it becomes inactive for a really long
period (say, one hour, or 5 times the regular timeout, whichever is greater).

Comment 24 Tobias Meyer 2006-03-10 11:15:53 UTC

Proposed patch:

in org.apache.catalina.session.StandardSession
 line 284 (tomcat 5.0.28)

    /**
     * The access count for this session.
     */
    protected transient int accessCount = 0;

has to be replaced by 

    /**
     * The access count for this session.
     */
    protected volatile transient int accessCount = 0;

Explanation:
The expression accessCount++ is not thread-safe because each thread may hold a
local copy of the variable which does not have to be synchronized with main
memory immediately.
The volatile keyword forces this synchronization.

See the discussion on 
http://forum.java.sun.com/thread.jspa?threadID=604831&start=30&tstart=0

[quote]
2) Local vs. main mem: Threads have--or can have--their own local copies of
shared variables. When T1 writes a value to a variable, if the variable is not
volatile and there's no syncing, that new value may live only in T1's local copy
and may never get written to main mem, which means other threads may never see
that new value. Entering and leaving sync blocks forces a reconciliation between
the thread's local memory and main mem. Declaring a variable volatile forces
every read and write of that variable to go to main mem.
[end quote]

Another solution would be to synchronize the code segments.

I have to apologize for my claim that ++ and -- were atomic for int datatypes. 
This seems to be only true for single-CPU systems (without hyperThreading).

Comment 25 Remy Maucherat 2006-03-10 11:28:16 UTC

(In reply to comment #24)
> Proposed patch:

-1. Pretend you read my comment.

Comment 26 Tobias Meyer 2006-03-10 12:04:24 UTC

(In reply to comment #25)
> (In reply to comment #24)
> > Proposed patch:
> 
> -1. Pretend you read my comment.

Well, apart from the fact that I wrote my comment while you posted yours
(mid-air-collision)

I don't see any obvious reason against syncing the accessCount with the volatile
keyword. The accesscount obviously needs to be synchronized in some way (or be
removed, which I don't fancy because of large, time consuming, downloads).
Of course I would be happy to improve my understanding, so please explain. 

For the "rare" issue. We see quite some of these stale sessions. (appx. 2-10 a day)
I'm not saying that this is a major security issue, but it over time it gives an
attacker quite a chance to guess some sessionId.

Many people will not even be aware of this issue, because you can only see it if
you keep track of sessions yourself. All others might take a look into the
manager application and enjoy the number of concurrent users they usually have,
not knowing those sessions should have expired a long time ago.

Comment 27 Remy Maucherat 2006-03-10 12:51:51 UTC

(In reply to comment #26)
> Well, apart from the fact that I wrote my comment while you posted yours
> (mid-air-collision)
> 
> I don't see any obvious reason against syncing the accessCount with the volatile
> keyword.

And you verified it actually fixed this "issue" ?

> For the "rare" issue. We see quite some of these stale sessions. (appx. 2-10 a
day)
> I'm not saying that this is a major security issue, but it over time it gives an
> attacker quite a chance to guess some sessionId.
> 
> Many people will not even be aware of this issue, because you can only see it if
> you keep track of sessions yourself. All others might take a look into the
> manager application and enjoy the number of concurrent users they usually have,
> not knowing those sessions should have expired a long time ago.

As usual, 2-3 useless paragraphs as soon as there's a disagreement with a user.
Special bonus for the security fud ;)

Comment 28 Lothsahn 2006-03-10 16:17:31 UTC

(In reply to comment #24)
> Proposed patch:
> 
> in org.apache.catalina.session.StandardSession
>  line 284 (tomcat 5.0.28)
> 
>     /**
>      * The access count for this session.
>      */
>     protected transient int accessCount = 0;
> 
> has to be replaced by 
> 
>     /**
>      * The access count for this session.
>      */
>     protected volatile transient int accessCount = 0;
> 
> Explanation:
> The expression accessCount++ is not thread-safe because each thread may hold a
> local copy of the variable which does not have to be synchronized with main
> memory immediately.
> The volatile keyword forces this synchronization.
> 
> See the discussion on 
> http://forum.java.sun.com/thread.jspa?threadID=604831&start=30&tstart=0
> 
> [quote]
> 2) Local vs. main mem: Threads have--or can have--their own local copies of
> shared variables. When T1 writes a value to a variable, if the variable is not
> volatile and there's no syncing, that new value may live only in T1's local copy
> and may never get written to main mem, which means other threads may never see
> that new value. Entering and leaving sync blocks forces a reconciliation between
> the thread's local memory and main mem. Declaring a variable volatile forces
> every read and write of that variable to go to main mem.
> [end quote]
> 
> Another solution would be to synchronize the code segments.
> 
> I have to apologize for my claim that ++ and -- were atomic for int datatypes. 
> This seems to be only true for single-CPU systems (without hyperThreading).

Volatile may not be the best idea.  From the article:
http://cephas.net/blog/2003/02/17/using_the_volatile_keyword_in_java.html

"Careful, volatile is ignored or at least not implemented properly on many
common JVM's, including (last time I checked) Sun's JVM 1.3.1 for Windows. There
was an article on DDJ that could demonstrate this even on single-processor
machines..."

And from the article: 
http://www.javaperformancetuning.com/tips/volatile.shtml

"Note however that volatile has been incompletely implemented in most JVMs.
Using volatile may not help to achieve the results you desire (yes this is a JVM
bug, but its been low priority until recently). "


Our environment:
We are only seeing this issue for sure on one machine, which is a 4 way (with
HT--8 virtual CPU's) box.  We see the issue pretty frequently occur there... 
maybe 20-100 times a day for a total of 2,000-4,000 total sessions.

Also, this environment is NOT 64-bit.  These are 32-bit HT Xeons.  They are
running the Sun 1.4.2_06 or the 1.4.2_08 JDK...  So we know this issue occurs on
32 bit systems as well.

Comment 29 Remy Maucherat 2006-03-10 16:23:16 UTC

(In reply to comment #28)
> Volatile may not be the best idea.  From the article:
> http://cephas.net/blog/2003/02/17/using_the_volatile_keyword_in_java.html

-1 for sync, and feel free to test volatile (I'm only -0 for it). So the only
option you have left is simple check to force expiration of sessions which are
way past the normal timeout (like, ten times).

Comment 30 Matthew Sgarlata 2006-03-10 16:45:57 UTC

Per the previous comment: "-1 for sync, and feel free to test volatile (I'm only
-0 for it). So the only option you have left is simple check to force expiration
of sessions which are way past the normal timeout (like, ten times).", isn't
this a violation of the Servlet specification?   The specification states "The
session-timeout element defines the default session timeout interval for all
sessions created in this web application. The specified timeout must be
expressed in a whole number of minutes.  If the timeout is 0 or less, the
container ensures the default behaviour of sessions is never to time out. If
this element is not specified, the container must set its default timeout
period."  So in other words, the session-timeout says how many minutes before
the session is expired.  To expire in more or less minute than that is a
violation of the specification, IMO.

Also, as a Tomcat user, let me tell you where our users are noticing and
complaining about this bug.  We have a Current User Activity hyperlink in the
application that lets the system administrator view which users are online and
how long they have been idle.  This lets them see if, for example, see if any
users are online so that they know whether they can safely bounce Tomcat.

Comment 31 Remy Maucherat 2006-03-10 17:06:27 UTC

(In reply to comment #30)

You know, I do not care about what you think. I am proposing this, and that's
the way it will be, I will never accept a fix based on syncs (of course, it's
OSS, so you are more than welcome to use a custom patch). BTW, this may not have
anything to do with accessCount thread safety anyway.

Comment 32 Matthew Sgarlata 2006-03-10 17:15:41 UTC

I'm not trying to say this should be fixed with syncs.  I'm just trying to let
you know that there are users out there using this feature and that we would
like a fix if possible :)  It doesn't matter to me how the fix is implemented.

Comment 33 Filip Hanik 2006-03-10 17:21:03 UTC

and it will be fixed if you can provide us with a test case to reproduce the
problem. That way we can fix the real problem, whether it requires sync or not
sync is not relevant until we have identified the scenario where it happens.

without a test case we could patch something different, leading to other
problems, and that is what Remy not so eloquently might be trying to explain to you.

Comment 34 Lothsahn 2006-03-10 17:36:55 UTC

(In reply to comment #33)
> and it will be fixed if you can provide us with a test case to reproduce the
> problem. That way we can fix the real problem, whether it requires sync or not
> sync is not relevant until we have identified the scenario where it happens.
> 
> without a test case we could patch something different, leading to other
> problems, and that is what Remy not so eloquently might be trying to explain
to you.
> 
> 

If we can get a reproducable test case, I will provide one to you.  As is, we
are running the exact same software in hundreds of locations and we have only
seen this at one site.  Attempts to reproduce it on nearly identical hardware
with a nearly identical configuration have been unsuccessful.  Our ability to
reproduce the exact 

Thanks for your help so far.  If possible, I will look into patching the system
with a volatile accessCount, but I don't think this is going to be possible.

Comment 35 Matthias Ernst 2006-03-11 12:10:18 UTC

Spare your time. Volatile is not going to save you. While "x++" looks somehow
atomic it simply isn't. Read as "x = x+1". Whether x is volatile or not simply
doesn't matter.

In order to do correct access counting, you need to synchronize or use the
atomics provided in Java 5. And anyone who opposes that should remove the
variable altogether as it doesn't do any good then. And should get some
consulting about his fear of synchronization.

Comment 36 Arnaud Masson 2006-03-12 14:24:28 UTC

Since this a severe bug for our customers (because multiple concurrent sessions
are not allowed for a single user ID), I use the following simple workaround: 

1) A servlet Filter mapped to "/*" manages 2 session attributes (*):
- the end date of the last service() call, which is more interesting than the
access date
- the count of running service() calls

2) An HttpSessionListener maintains a list of all valid sessions.
It has a timer that checks at a fixed rate if any session in this list has expired.
The inactivity duration is computed from the end date of the last service(), so
long running requests are correctly handled.
Sessions that have at least one running service() are not invalidated, they are
checked later.

(*) the order of attribute updates is important to avoid a race condition that
can occur if the timer checks the session list between the two calls to
session.setAttribute().

Comment 37 Paul Goldsmith 2006-04-06 17:42:56 UTC

As an aside it is possible to use atomics from Java 5 in lower versions over Java through the use of this open source library http://dcl.mathcs.emory.edu/util/backport-util-concurrent/

Comment 38 Yoav Shapira 2006-04-13 19:53:51 UTC

Alright, since a workaround has been suggested and the original poster has not
come up with a reproducible test case, I'm closing this item for now.  If the
original poster or anyone else comes up with a way to reproduce this, please
feel free to reopen this item, attach your new test case, and we will be glad to
look at it.

Comment 39 Lothsahn 2006-04-13 20:02:50 UTC

(In reply to comment #38)
> Alright, since a workaround has been suggested and the original poster has not
> come up with a reproducible test case, I'm closing this item for now.  If the
> original poster or anyone else comes up with a way to reproduce this, please
> feel free to reopen this item, attach your new test case, and we will be glad to
> look at it.

I respectfully disagree.  A number of people (myself included) have reproduced
this bug, quite repeatedly.  I have a very reproducable testcase onsite, I am
just unable to provide our whole application to a third party.  What we have
been unable to provide is a self-contained testcase to reproduce this issue.

I don't feel this is an issue that only one person has been dealing with--that
would justify closing it.  This issue has been seen on a number of environments,
using entirely different applications.

The workaround is to:
1) Make a container object to hold the tomcat session and track the inactivity
manually
2) Write a background thread to invalidate the session once it times out.

I view this as a very serious bug in tomcat, even if it's difficult to
reproduce.  I'm not blaming anyone or saying that it needs to get fixed--you
can't very well fix an issue you can't reproduce, but closing it when it still
exists doesn't seem like a good idea.

Comment 40 Eddie Wynn 2006-05-03 11:30:22 UTC

I am re-opening this bug as I do not believe that marking it as 'works for me -
 fixed' is satisfactory to any of the people who are suffering this bug.

If it was just one person suffering from the bug then this might be an 
acceptable reply but given that there are multiple people who have reported 
it - and some people who have managed to debug Tomcat and suggest what the 
problem might be - to just close it without proper investigation is 
unacceptable.

I am aware that I have not provided a reproducable test case but like many 
other contributors I only see this issue in a live well loaded environment - 
in my case in a clients production environment - I simply cannot send you my 
clients entire application to test with and cannot reprodcue it here on our 
development machines - but this does not mean that there is not a problem.

I look forward to some more in depth investigation and proposals for solving 
this problem. Thanks, Eddie

Comment 41 Kelly Davis 2006-06-07 18:08:39 UTC

We are seeing this issue in our production environment as well. We are running
5.5.15 on a dual xeon processor ibm blade. We have seen it for atleast the last
year and found the same problem occurring on 5.0.28.  We are running an
identical staging configuration which we are unable to reproduce, presumably
because it is only occurring under significant load. We discovered the problem
when we implemented a session monitoring utility for our app.

Comment 42 Kelly Davis 2006-06-08 21:16:14 UTC

(In reply to comment #38)
> Alright, since a workaround has been suggested and the original poster has not
> come up with a reproducible test case, I'm closing this item for now.  If the
> original poster or anyone else comes up with a way to reproduce this, please
> feel free to reopen this item, attach your new test case, and we will be glad to
> look at it.

It sounds to me like the reproducible test case is the problem. It is pretty
near impossible to provide the application, hardware configuration AND load
necessary to reproduce this on a running instance of tomcat. In lieu of this, I
propose a simple test case which reproduces in separate test code which closely
mimics the behavior of tomcat. This behavior is essentially multiple requests in
multiple threads accessing the int accessCount field in a session object. Each
thread will increment the accessCount and then decrement it when the request
completes. Here is some sample code to which basically does that:

import java.util.concurrent.atomic.AtomicInteger;

public class SharedIntTest
{
  public AtomicInteger execCount = new AtomicInteger();
  public AtomicInteger countAtomic = new AtomicInteger();
  public int count;
  
  public static void main( String[] args )
  {
    final SharedIntTest shared = new SharedIntTest();
    int maxThreads = 1000;
    
    while ( true )
    {
      //Initialize counters
      shared.countAtomic.set( 0 );
      shared.execCount.set( 0 );
      shared.count = 0;
      
      for ( int i = 0; i < maxThreads; i++ )
      {
        new Thread( new Runnable()
         {
              public void run()
              {  
                //increment counters
                shared.countAtomic.incrementAndGet();
                shared.count++;

                //decrement counters
                shared.countAtomic.decrementAndGet();
                --shared.count;
                
                //increment execution counter
                shared.execCount.incrementAndGet();
              }
            }
        ).start();
      }
      
      while ( shared.execCount.get() < maxThreads )
      {
        try
        {
          Thread.sleep( 10 );
        }
        catch( InterruptedException e )
        {
        }
      }
      
      if ( shared.count != 0 ) 
        System.out.println( "Count Error: " + shared.count + "/" +
shared.countAtomic.get() + " " + shared.execCount.get() );
    }
  }

}

The variable count basically represents the accessCount field in the session. In
the body of the main method there is a while loop which repeatedly runs the test
code. The test code initalizes all of the variables, and then launches a number
of threads, which represent the requests in tomcat which are all incrementing
and decrementing accessCount on a given session. All that the threads do is to
increment the count variable and then immediately decrement it, basically what
the request threads are doing in tomcat. The threads also increment a variable
called execCount which is used to determine the number of completed threads. In
the main thread, there is a loop which monitors the execCount to determine when
all of the threads are completed. To make things interesting I have put an
AtomicInteger counter in, countAtomic, to determine whether or not the Atomic
solution mentioned in comment #21 would resolve the issue.

When run on a single processor, this could runs continuously and never prints
any errors, as we would hope. When I run this code on a dual processor, within a
couple of minutes a test will run where the final result of the count variable
is not zero. The countAtomic variable always has a final result of zero.

In conclusion, I believe that Matthias Ernst in comment #35 is exactly right,
either a sync must be done or Atomics must be used. Otherwise the session
management code as it exists now will be very broken on multiprocessor machines.

Comment 43 Darryl Miles 2006-06-09 04:48:59 UTC

Created attachment 18432 [details]
Stress Test Framework

Here is complete working WAR application with source included that might be
useful to build a real testcase from (or any test cases that only appear under
stress).

There is a web-app side, which you can either modify my DefaultServlet or add a
new servlet to provide additional code to help trigger the problem.  It would
be worth file you building a simple interface into your Session Management
classes



There is a HTTP traffic generator based around commons-httpclient,
multi-threaded.  You can theoretically extract the WAR onto a number of client
machines and/or use the same machine thats running Tomcat.

Invokation suggestion:

$ jar -xvf TCBug37356.war
$ cd WEB-INF 
$ java -cp
classes:lib/commons-httpclient-3.0.1.jar:lib/commons-codec-1.3.jar:lib/commons-logging-1.0.4.jar:lib/log4j-1.2.13.jar
testpackage/TestClientTomcat


Places to edit:

testpackage/TestClientTomcat: Edit the defaults at the top of the class, for
URL, requests, simultaneous thread count.

TestClientTomcat#startNextThread(): just before the call to #sweepOnce() you
might need to modify requests, like every 1000 inject a HTTP request to poll
your Session manager (through the servlet) to look for problems and report.

TestClientRunnable#run(): ~around line 177~ Modify this to suit your HTTP
request setup and response parsing.


Maybe someone else could attach a cut down version of their session management
code for analying the state of TC from within.

Feel free to email me privately for any assistance with it.

Comment 44 Remy Maucherat 2006-06-09 06:32:40 UTC

(In reply to comment #42)
> In conclusion, I believe that Matthias Ernst in comment #35 is exactly right,
> either a sync must be done or Atomics must be used. Otherwise the session
> management code as it exists now will be very broken on multiprocessor machines.

Very funny. Adding two syncs part of every request just to provide support for a
really stupid feature is not acceptable. The actual options are:
- remove the "no expire while requests are in progress" feature by default, with
an option to use the feature with full syncs for the two people that actually
need it
- force expire of abnormally long sessions regardless of the accessCount value
(many times over the session expiration time, with a minimum amount of time of a
few hours)

Comment 45 Kelly Davis 2006-06-09 13:50:55 UTC

There seems to be an argument about two separate things. The bug is that session
management is broken. The poster wants it fixed and I think all the respondents
want it fixed. Plain and simple.

The second thing is that there is this feature to allow long requests with short
session expirations, and it happens to be the thing that doesn't work and is
breaking the session management. You have made it abundantly clear that you
thing the feature is utterly idiotic. That's nice, I am glad you can share your
emotions so freely, but as far as I am concerned, the feature is just breaking
something which should fundamentally just work and it needs to be fixed. 

The bottom line is, I don't think myself or anyone else on this thread wants to
add unnecessary syncs just to fix this problem. You keep responding as if that
is what everyone wants. What people want is a fix. If the "no expiration while
request is in progress" feature is an absolute must have, then you must sync. If
it is not, then kill the feature or make it optional. I am not fighting for this
feature. I just want session management to work!

(In reply to comment #44)
> (In reply to comment #42)
> > In conclusion, I believe that Matthias Ernst in comment #35 is exactly right,
> > either a sync must be done or Atomics must be used. Otherwise the session
> > management code as it exists now will be very broken on multiprocessor machines.
> 
> Very funny. Adding two syncs part of every request just to provide support for a
> really stupid feature is not acceptable. The actual options are:
> - remove the "no expire while requests are in progress" feature by default, with
> an option to use the feature with full syncs for the two people that actually
> need it
> - force expire of abnormally long sessions regardless of the accessCount value
> (many times over the session expiration time, with a minimum amount of time of a
> few hours)
>

Comment 46 James Cook 2006-06-09 14:24:27 UTC

(In reply to comment #44)

> Very funny. Adding two syncs part of every request just to provide support for a
> really stupid feature is not acceptable.

I don't know about calling it "really stupid".  Non-standard and not well
tested, certainly.  Plus, since Sun points at Tomcat as the reference
implementation of the Servlet spec, extensions like this should be eliminated as
a matter of course.  In my humble opinion, of course.  =)

> The actual options are:
> - remove the "no expire while requests are in progress" feature by default, with
> an option to use the feature with full syncs for the two people that actually
> need it
> - force expire of abnormally long sessions regardless of the accessCount value
> (many times over the session expiration time, with a minimum amount of time of a
> few hours)

Both of those options are also "very funny".  Please don't clutter up the code
with yet another config option.  Kill the feature and make folks who want it
resort to a Filter or a Valve or something.  The thought of adding code to look
at sessions with nonzero accessCounts and arbitrarily killing off sessions after
"many times over the expiration time" has passed ... yuck.  A bug to fix a bug,
now that's non acceptable.  =P

Comment 47 Lothsahn 2006-06-09 15:31:59 UTC

Remy:  

I see two possible options.

1) Remove the code to keep sessions alive if in the middle of downloading and
past their timeout.  While this is not standard--this very well may be needed in
many customer environments, and changing this behavior could break applications
that rely on this functionality.

2) Properly synchronize the broken code.

I realize that you object to the performance overhead of a couple synchronize
blocks.  I'd like to reference the following pages:
http://www-128.ibm.com/developerworks/java/library/j-jtp04223.html
http://www-128.ibm.com/developerworks/java/library/j-threads1.html

Both of these pages state that synchronization overhead is much less than most
developers believe, and that modern JVM's are highly optimized to provide fast
synchronization.  Furthermore, synchronization is a fixed overhead and barely
noticable unless tested with empty or mostly empty methods which are called
repeatedly.  Tomcat's request handling does not resemble anything close to an
empty method, and most likely the overhead of synchronization would be negligable.

The only way to know the overhead of adding two synchronized blocks for sure
would be to add the relevant code and do performance testing.  Perhaps if you
agreed to consider a proposed patch to properly synchronize based on performance
results, someone would be willing to give you the relevant fix.  As things
currently are, session expiration in Tomcat is broken on multi-processor
systems.  This is a SEVERE bug and worth a small amount of performance loss to fix.

Furthermore, the synchronization should be performed on the session rather than
globally.  The vast majority of all requests would never contend for the
synchronization lock in this case--further reducing the overhead even more. 
Only multiple requests on the same session would be serialized, and this would
most likely not limit scalability of Tomcat.

I personally would rather see the appropriate synchronization added and
performance testing done to validate that the overhead is minimal.  I believe
that this solution would be better than possibly creating a regression in
behavior for people who use Tomcat to perform long operations with a "short"
timeout (for instance: hosting very large files to users on slow connections).

Either way, the problem that needs to be fixed is session management.  Having a
broken session management system which does not properly invalidate connections
is unacceptable.  The workaround for terming very long term sessions also seems
like a very bad idea to me.

Comment 48 Remy Maucherat 2006-06-09 18:50:12 UTC

Reading and posting to bugzilla is really a waste of time. Oh well ... So I will
implement option 1.

Comment 49 Peter Fassev 2006-08-27 13:29:29 UTC

To Remy:

Please, don't do this, or provide an option to switch it on! Long running 
requests are common, like:

1) Downloading large files or items
2) Uploading large files

The session MUST be synchronized, please think about multiple processor 
systems, where parallel requests will be executed within different threads!

I can't see, why you are so ignorant, and why you are insisting on a 
real "stupid" (using our words) insynchronzed solution!

Peter

Comment 50 Remy Maucherat 2006-08-27 14:34:42 UTC

*** Bug 40305 has been marked as a duplicate of this bug. ***

Comment 51 Mark Thomas 2006-10-29 18:33:15 UTC

I have a simple test case that demonstrates this issue. I create two threads
that repeatedly call session.access() session.endAccess() on the same session
object. After running them for a five seconds, they are stopped and I manually
check session.accessCount. With 1 thread it is zero. With two, it is typically
in the low 1000s. This demonstrates conclusively that this is a thread-safety
issue with session.accessCount

Using volatile sends it through the roof to 100,000s. I am afraid, therefore,
that this is not a viable option.

Using syncs does fix it but slows the access() and endAccess() calls by several
orders of magnitude (60ns total to 5.6ms total). This would have a serious
performance impact. I am looking at alternatives. All suggestions welcome.

Comment 52 Christopher Schultz 2006-10-30 20:32:42 UTC

(In reply to comment #51)
> I have a simple test case that demonstrates this issue. I create two threads
> that repeatedly call session.access() session.endAccess() on the same session
> object. After running them for a five seconds, they are stopped and I manually
> check session.accessCount. With 1 thread it is zero. With two, it is typically
> in the low 1000s. This demonstrates conclusively that this is a thread-safety
> issue with session.accessCount

I'm unclear why there was a question in the first place. We have unsynchronized
access to a variable that depends on sane values in a multi-threaded environment.

Oddly enough, sessions in Tomcat (in general) appear to be impossible to make
threadsafe. Since StandardSession is wrapped in a StandardSessionFacade before
being exposed to servlet code, and the facade is not guaranteed to have a 1-to-1
mapping with the /real/ session, users cannot even synchronize on the "session"
object they get back from calling HttpServletRequest.getSession().

I realize that I have brought up a different issue entirely, but the entire
discussion above plus this fact lead me to the conclusion that the maintainers
of Tomcat place a higher value on speed than thread safety.

Which is worse... adding a few ns* to calls to a few methods in StandardSession,
or have sessions piling up on top of one another until you run out of memory?

(*) Note that in the test Mark describes, the object locks are under contention.
He doesn't mention if his timings (sync'd vs. unsync'd) were in his
single-thread or dual-thread runs. Uncontended locks are a lot faster than
contended locks, so I would expect many sessions to continue with less overhead
than those that are apparently getting many simultaneous accesses.

Comment 53 Mark Thomas 2006-10-30 21:03:28 UTC

For completeness, with two threads the average difference was 60ns to 50ms.
For a single thread it was 75ns to 225ns.

Bear in mind all of these figures are somewhat artificial. The real figure will
vary with webapp, hardware etc.

Comment 54 Mark Thomas 2006-11-04 17:12:22 UTC

This has been fixed in SVN and will be in 5.5.21 onwards.

In summary:
 - accessCount is disabled by default
 - when enabled, access to accessCount is synchronized

Comment 55 Pascal HERAUD 2007-02-22 07:40:22 UTC

Created attachment 19631 [details]
A JSP That can be used to see the accessCount of a session

For people having doubts and still investigating about this bug, I put in
attachment a small JSP that you can put at the root of your web application to
see if the bug is happening.
You can log into your application and call this JSP (e.g.
http://localhost:8080/myWebApp/testSession.jsp) that will display the
accessCount of your session. Just do some refresh into your application ( on a
real page ) and refresh the testSession.jsp.
If you have no request pending, the displayed accessCount should be (1), if
it's greater, you triggered the bug.
On some of my applications, the accessCount takes 10 per each page, probably
trigggered by numerous GET request made by the browser for getting
CSS/Javascript files (even in pessimistic cache mode).

Comment 56 Torr Liu 2008-01-21 01:51:20 UTC

if (Globals.STRICT_SERVLET_COMPLIANCE) {
            synchronized (lock) {
                accessCount++;
            }
        }

In 5.5.25, it add some synchronized code for the accessCount when ++ and -- 
cases, but did not add synchronized code for reading operation in isValid() 
and recycle() method.

public boolean isValid() {

        if (this.expiring) {
            return true;
        }

        if (!this.isValid ) {
            return false;
        }

        if (accessCount > 0) {
            return true;
        }

        if (maxInactiveInterval >= 0) { 
            long timeNow = System.currentTimeMillis();
            int timeIdle = (int) ((timeNow - thisAccessedTime) / 1000L);
            if (timeIdle >= maxInactiveInterval) {
                expire(true);
            }
        }

        return (this.isValid);
    }

In tomcat 6.x, it use the AutomicInteger to fix this issue. Please check!

Comment 57 Mark Thomas 2008-01-21 12:58:37 UTC

As a minimum, you need to provide an explanation of why this needs further work
and ideally a test case that demonstrates this issue. I don't see any immediate
issues with the reads not being synchronised.