Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

If tx_semaphore_get() returns TX_NO_INSTANCE, the immediate call to tx_thread_sleep() also returns TX_NO_INSTANCE after the sleep duration even though the sleep is called from thread context. #433

Open
S-Vishnu-Nair opened this issue Jan 28, 2025 · 4 comments
Labels
bug Something isn't working hardware New hardware or architecture support request

Comments

@S-Vishnu-Nair
Copy link

S-Vishnu-Nair commented Jan 28, 2025

Describe the bug
When we call the tx_semaphore_get() API with wait_option value other than TX_WAIT_FOREVER and TX_NO_WAIT, if no instances of the specified counting semaphore is available, the API will fail with TX_NO_INSTANCE once the wait time has expired.

At this point, if we call the tx_thread_sleep() API it also would fail with the error code - TX_NO_INSTANCE.

Working scenarios:

  1. If we pass TX_WAIT_FOREVER as the wait_option argument to the tx_semaphore_get() API, once an instance of the semaphore is available, the API will return with TX_SUCCESS. Now if we call the tx_thread_sleep() API, the API also will return TX_SUCCESS.
  2. Similarly, if we pass TX_NO_WAIT as the wait_option argument to the tx_semaphore_get() API, it will return immediately with TX_NO_INSTANCE if semaphore instance is not availale. Now if we call the tx_thread_sleep() API, the API will return TX_SUCCESS.

Psuedo code to reproduce the issue

void test_thread()
{
    while (1) {
        /* do something */

        rc_loc = tx_semaphore_get(sem_ptr, TX_WAIT_FOREVER); 
        printf("ret = %x", rc_loc);                             /* Prints - "ret = 0" => TX_SUCCESS - once semaphore is available */
        rc_loc = tx_thread_sleep(100);
        printf("ret = %x", rc_loc);                             /* Prints - "ret = 0" => TX_SUCCESS - after sleeping for 100 ticks*/

        rc_loc = tx_semaphore_get(sem_ptr, TX_NO_WAIT); 
        printf("ret = %x", rc_loc);                             /* Prints - "ret = D" => TX_NO_INSTANCE - immediatedly when there is no semaphore */
        rc_loc = tx_thread_sleep(100);
        printf("ret = %x", rc_loc);                             /* Prints - "ret = 0" => TX_SUCCESS - after sleeping for 100 ticks*/

        rc_loc = tx_semaphore_get(sem_ptr, 10); 
        printf("ret = %x", rc_loc);                             /* Prints - "ret = D" => TX_NO_INSTANCE - after waiting for 10 ticks */
        rc_loc = tx_thread_sleep(100);
        printf("ret = %x", rc_loc);                             /* Prints - "ret = D" => TX_NO_INSTANCE - after sleeping for 100 ticks*/

        /* do something */
    }
}

Environment and tool:

  • Target - MIPS Interaptiv SMP CPU
  • Eclipse ThreadX version - threadx-6.4.1_rel
  • Compiler - GHS Multi compiler

Debugging observation:

  1. Whenever the sleep failed, we observed that the value of thread_ptr->tx_thread_suspend_status was TX_NO_INSTANCE before the _tx_thread_sleep() is called.
  2. In our build, TX_NOT_INTERRUPTABLE is defined. In that case we could see from the function definition of _tx_thread_sleep() that thread_ptr->tx_thread_suspend_status is not being initialized to TX_SUCCESS before calling _tx_thread_system_ni_suspend() but in the case where TX_NOT_INTERRUPTABLE is not defined, thread_ptr->tx_thread_suspend_status is initialized to TX_SUCCESS before calling _tx_thread_system_suspend(). Is there a possibilty that the issue that we are facing is due this?
  3. Please note that we have this issue only when the thread is put to suspended state because of the unavailability of the semaphore (when we specify a valid wait time). If the wait_option is TX_NO_WAIT, even if semaphore is not available, the thread is not put to sleep and the thread_ptr->tx_thread_suspend_status is not updated by the call to tx_semaphore_get().

The relevant porition of the _tx_thread_sleep() function definition is pasted below.

            /* Set the state to suspended.  */
            thread_ptr -> tx_thread_state =    TX_SLEEP;

#ifdef TX_NOT_INTERRUPTABLE

            /* Call actual non-interruptable thread suspension routine.  */
            _tx_thread_system_ni_suspend(thread_ptr, timer_ticks);

            /* Restore interrupts.  */
            TX_RESTORE
#else

            /* Set the suspending flag. */
            thread_ptr -> tx_thread_suspending =  TX_TRUE;

            /* Initialize the status to successful.  */
            thread_ptr -> tx_thread_suspend_status =  TX_SUCCESS;

            /* Setup the timeout period.  */
            thread_ptr -> tx_thread_timer.tx_timer_internal_remaining_ticks =  timer_ticks;

            /* Temporarily disable preemption.  */
            _tx_thread_preempt_disable++;

            /* Restore interrupts.  */
            TX_RESTORE

            /* Call actual thread suspension routine.  */
            _tx_thread_system_suspend(thread_ptr);
#endif

            /* Return status to the caller.  */
            status =  thread_ptr -> tx_thread_suspend_status;
        }
    }

    /* Return completion status.  */
    return(status);
}

Temporary workaround
We have found a temporary workaround for the time being. If we initialize the thread_ptr->tx_thread_suspend_status to TX_SUCCESS and then call _tx_thread_sleep(), the API doesn't fail.

void test_thread()
{
    while (1) {
        /* do something */

        rc_loc = tx_semaphore_get(sem_ptr, 10); 
        printf("ret = %x", rc_loc);                             /* Prints - "ret = D" => TX_NO_INSTANCE - after waiting for 10 ticks */
        thread_ptr->tx_thread_suspend_status = TX_SUCCESS;
        printf("ret = %x", rc_loc);                             /* Prints - "ret = 0" => TX_SUCCESS - */
        rc_loc = tx_thread_sleep(100);
        printf("ret = %x", rc_loc);                             /* Prints - "ret = 0" => TX_SUCCESS - after sleeping for 100 ticks*/

        /* do something */
    }
}

Expected behavior
We expect the sleep to not fail if the prevoius call to tx_semaphore_get() is timed out.

Impact
The issue is a show stopper. We have used the sleep API in many places. Although the issue seems to have stopped with the temporary workaround we can't mark the bug as fixed until we find the root cause and get the actual bugfix.

@S-Vishnu-Nair S-Vishnu-Nair added bug Something isn't working hardware New hardware or architecture support request labels Jan 28, 2025
@fdesbiens
Copy link
Contributor

Thank you for submitting this issue, @S-Vishnu-Nair.

@eclipse-threadx/iot-threadx-committers and @eclipse-threadx/iot-threadx-contributors: Please review.

@billlamiework
Copy link

Yes, this looks like a valid issue, @S-Vishnu-Nair. One short-term option is to not use the TX_NOT_INTERRUPTABLE option. The second option is to initialize the thread_ptr -> tx_thread_suspend_status to TX_SUCCESS before the call to _tx_thread_system_ni_suspend in tx_thread_sleep.c.

@S-Vishnu-Nair
Copy link
Author

Hi @billlamiework,

Thank you. We are planning to proceed with the work around of initializing the thread_ptr -> tx_thread_suspend_status to TX_SUCCESS. Will this be fixed in the upcoming releases? When can we expect the fixed release?

Regards,
Vishnu

@fdesbiens
Copy link
Contributor

@S-Vishnu-Nair We need to take a look at the effort required. I cannot commit to a timeline right now.

We will soon have a public roadmap that will make it easier to track such issues for everyone.

I will get back to you once I have discussed this with Bill and the rest of the team.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working hardware New hardware or architecture support request
Projects
None yet
Development

No branches or pull requests

3 participants