-
Notifications
You must be signed in to change notification settings - Fork 142
FELIX-6778 Reduce number of re-creation(s) of SCR Component Registry thread #419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…l.Timer, with a ScheduledThreadPoolExecutor. This is configured to keep its core Thread(s) around for up to 10 seconds without any tasks. This enables a low 'ds.service.changecount.timeout' to be used, without having to pay the cost of a lot of short-lived threads being created.
Soundy interesting. |
In what area do you see the test execution struggling? I see one test failure (Felix6274Test), but that appears on main branch as well, and looks unrelated to this change. |
Without knowing much about this specific part of Felix and SCR in general, would java 21 virtual threads be of interest here? It would make the code more complex, as we want to support pre java 21 as well, but we have similar code in JettyService for Jetty 12. |
@paulrutter |
With reflection we could support both pre and post java 21 versions. felix-dev/http/jetty12/src/main/java/org/apache/felix/http/jetty/internal/JettyService.java Line 284 in 54fd91e
|
In case of SCR, I don't think its use of threads (SCR Component Registry, or SCR Component Actor) benefit particularly from specific properties of virtual threads, as I interpret Oracle's description of virtual threads:
These properties do really fit with something like Jetty, but for SCR I don't really see what it would bring. |
I was triggered by this part of your description: 'without having to pay the cost of a lot of short-lived threads being created'. |
Wondered how this should work. |
Technically you could even use MR-Jar here, but I really don't think that Virtual Threads give a benefit here and a simple Threadpool would be sufficient. |
In the catch clause, one would just use a regular |
from my perspective, Java 11 is definitely no problem, Java 17 should be ok but might already cause problems for some, Java 21 will cause problems for more. Given the bandwidth I would go for a complicated solution that leverages Java 21, but also works with lower versions |
I just wanted to mention that there is some hard work going on to keep the core OSGi framework at java 8, moving SCR to anything higher would probably render this useless for those consumers. |
I need SCR on Java 8 for the now. Do we have performance metrics that show an improvement here? Can someone explain what |
I don't have a specific performance metric I can refer to, but I did notice that with a tiny Another benefit is that the code becomes a bit simpler, as there is no need to cancel midway (nor the synchronisation that is currently present to ensure this happens consistently).
I presume this is the reason, but I am not familiar with the exact reason(s) for this design. |
I fully agree. I wanted to understand what the reason was behind the setting to help me understand if there is a more simple way. I really don't like the current usage of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks fine and I don't see anything that requires Java > 8 here
return new Thread(r, "SCR Component Registry"); | ||
} | ||
}); | ||
threadPoolExecutor.setKeepAliveTime(10, TimeUnit.SECONDS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
threadPoolExecutor.setKeepAliveTime(10, TimeUnit.SECONDS); | |
threadPoolExecutor.setKeepAliveTime(m_configuration.serviceChangecountTimeout()*2, TimeUnit.MILLISECONDS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you be open to maintaining a minimum keep alive of 500ms? The suggested change would, with a very short timeout of 10ms, cause the thread to be cleaned up almost as fast/often as with the current Timer approach.
Something along the lines of:
threadPoolExecutor.setKeepAliveTime(Math.max(500, m_configuration.serviceChangecountTimeout()*2), TimeUnit.MILLISECONDS);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You maybe want to make it configurable by a system property. Maybe even we do not need to ever timeout the thread anyways?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not having any timeout would further simply things (ie using one of the standard factory methods, over manually initializing one of the ScheduledExecutorService implementations). I opted for this, as wanted to stay as close to the current behaviour, and thus wanted to retain the cleanup of the thread, with one change that the thread is not cleaned-up immediately but after some period of time.
I can make that period Thread timeout configurable, similar to how the changecount timeout is configurable, and even include a way to never have a timeout, but with the latter as well, I also see this adding some complexity as well. For instance, should there be some kind of check on the Thread timeout vs the changecount timeout?
Coming back to your initial suggestion, would something like this work:
threadPoolExecutor.setKeepAliveTime(Math.max(m_configuration.componentRegistryThreadTimeout(), m_configuration.serviceChangecountTimeout()*2), TimeUnit.MILLISECONDS);
So having the thread timeout be 2x the changecount timeout, with a configurable minimum which is by default 500ms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we really want to stop creating new threads I suggest we could combine this scheduled executor with the ComponentActorThread
into a single executor. I'm not sure why we have our own implementation of what looks like a simple executor with ComponentActorThread
. But this thread lives for the life of the active SCR bundle. We could just make this a single scheduled executor that we post work to for both the component actor thread and for scheduling this change count service property update. Then we don't have to worry about managing any keep alive or otherwise since this thread is always there.
Although we could also let this single thread terminate in the pool of one also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tjwatson I like the idea of moving the execution of the service.changecount property to the existing SCR Actor thread. I can update this PR if there are no objections to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds useful here. Especially as we can assume that the scheduled action do not really "block" anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have pushed commit 181336b which moves the service.changecount update towards the SCR Component Actor. The realisation of the SCR Component Actor is now implemented using a ThreadPool (which offers the scheduling of runnables, which is needed to realise the timeout of the service.changecount update).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial peek at the changes looks good. I like the simplification. I still want to take a closer look to do a proper review.
One small concern is if the two task "types" may have some condition that they are waiting for which would be provided by the other task type. For example, the component actor task cannot complete until the change count task completes and the component actor task gets put into the queue of tasks before the change count task can get scheduled.
I'm not sure how realistic that scenario is, but it is something we are going to have to look out for.
Replaced the SCR Component Registry executor with the SCR Component Actor Added integration test that validates service.changecount property updates
@@ -776,9 +758,6 @@ public void run() | |||
} | |||
|
|||
public void shutdown() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this function is now empty, I assume it can be removed (or should other cleanup happen here still)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is my assumption also given the executor will shutdown and cleanup instead.
@@ -560,7 +562,7 @@ public <T> void leaveCreate(final ServiceReference<T> serviceReference) | |||
* @param serviceReference | |||
* @param actor | |||
*/ | |||
public synchronized <T> void missingServicePresent( final ServiceReference<T> serviceReference, ComponentActorThread actor ) | |||
public synchronized <T> void missingServicePresent( final ServiceReference<T> serviceReference, ScheduledExecutorService actor ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should stop passing the actor
here since it is the same object passed to the constructor and stored in m_componentActor
.
static | ||
{ | ||
descriptorFile = "/integration_test_simple_components.xml"; | ||
DS_SERVICE_CHANGECOUNT_TIMEOUT = 1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this change the ComponentTestBase.DS_SERVICE_CHANGECOUNT_TIMEOUT for all other tests run after this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately it does. I replicated the pattern from DS_LOGLEVEL, but that suffers from the same issue; once a test sets it, all subsequent tests (which is not necessarily always consistent across builds) use the same settings, unless it gets explicitly (un)set in one of these subsequent tests.
Without a more significant refactoring effort, I don't really see a way from this specific test to set a specific system property in Pax Exam.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reset it back in an @AfterClass
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reset it back in an
@AfterClass
method?
It looks like the @AfterClass
is called within the forked JVM, and unfortunately does not affect the variable set in the static initialiser block (whom's value resides in the main unit process, and is input for the @Configuration
annotated method).
# Conflicts: # scr/src/test/java/org/apache/felix/scr/integration/ComponentTestBase.java
Removed m_componentActor parameter from ComponentRegistry.missingServicePresent()
You can rebase, than we have the logging fix in |
Replaced the pattern of creating and cancelling a java.util.Timer, with a ScheduledThreadPoolExecutor. This is configured to keep its core Thread(s) around for up to 10 seconds without any tasks. This enables a low 'ds.service.changecount.timeout' to be used, without having to pay the cost of a lot of short-lived threads being created.