
kernel/mutex: Fix races, make unlock rescheduling #20919

Merged (1 commit) on Dec 3, 2019

Conversation

andyross
Contributor

The k_mutex is a priority-inheriting mutex, so on unlock it's possible
that a thread's priority will be lowered. Make this a reschedule
point so that reasoning about thread priorities is easier (possibly at
the cost of performance): most users are going to expect that the
priority elevation stops at exactly the moment of unlock.

Note that this also reorders the code to fix what appear to be obvious
race conditions. After the call to z_ready_thread(), that thread may
be run (e.g. by an interrupt preemption or on another SMP core), yet
the return value and mutex weren't correctly set yet. The spinlock
was also prematurely released.

Fixes #20802

Signed-off-by: Andy Ross andrew.j.ross@intel.com

@dleach02 dleach02 added this to the v2.2.0 milestone Nov 26, 2019
@dleach02
Member

Need some more discussion and investigation but possible inclusion early into 2.2.

@pabigot
Collaborator

pabigot commented Nov 26, 2019

As mentioned in triage: I'm concerned that this---as a resolution to #20802---implements a change to the documented behavior of cooperative threads, in that it proposes that releasing a mutex can take control away from a cooperative thread and give it to another thread with a higher priority.

A cooperative thread is supposed to release control only when it is made unready. Is releasing a mutex an operation that satisfies this condition? Is that documented anywhere?

I fear that this reduces the value of cooperative threads. In any case I think some documentation updates may be warranted.

(edit: Let me be clear: I understand the value of yielding to a higher priority thread when a mutex blocking it is released. I just don't think that behavior is allowed with the thread model as currently defined.)

@dleach02
Member

@pabigot, I tend to agree with what you are saying. This, to me, falls under the category of "APIs that may yield control" and as you point out, this releasing of a mutex currently is not documented to release control in a cooperative thread environment. I suspect that it should allow a swap but that might reveal different problems when you change the underlying scheduler functionality.

@pabigot
Collaborator

pabigot commented Nov 26, 2019

... falls under the category of "APIs that may yield control" ...

Thanks for using that term, which I hadn't seen before. It seems that the description of cooperative and preemptible threads is not entirely in sync with the description of cooperative and preempting context switches. Neither currently allows a swap on release of a mutex, and the context switching discussion more clearly disallows it:

Control is never taken from cooperative thread when one of them is the running thread.

@pabigot pabigot self-requested a review December 1, 2019 12:30
Collaborator

@pabigot pabigot left a comment


Thread behavior in Zephyr is documented in three locations which in aggregate are not entirely consistent:

The Scheduling page specifies that:

Once a cooperative thread becomes the current thread, it remains the current thread until it performs an action that makes it unready

(It subsequently weakens this description to allow superseding when the current thread invokes k_yield() which doesn't make the thread unready, but does explicitly cause the scheduler to select a new ready thread.)

The architecture page description specifically states:

A cooperative context switch happens when a thread willfully gives the control to another thread.

and

Control is never taken from cooperative thread when one of them is the running thread.

From what I can tell the change in this PR breaks the contract for cooperative threads because the current thread (which invoked the release) will be replaced by the new thread without having been made unready or yielding.

Note that #20802 has been downgraded to a question. If the answer to the question is that "yes, in this case a context switch should occur" then the description of cooperative thread behavior should be revised prior to merging this change.

(NB: I got here because of #18970 for which we need to know the conditions under which a thread may be swapped for another. This proposal and #20802 make those conditions unclear.)

@dleach02
Member

dleach02 commented Dec 2, 2019

I think this opens up the discussion on cleaning up cooperative thread operation and documentation. My perspective is that any synchronization activity should be documented to "may" yield control. Where k_yield, k_mutex_unlock, and k_sem_give fall into this group (sema_give and yield already do this). This PR, if the operation is generally agreed on, should update the documentation appropriately to clearly identify this action as one of the actions that may yield control.

@pdunaj
Collaborator

pdunaj commented Dec 2, 2019

Hi @pabigot, this would not be a problem if there were a clear list of API functions that are expected to reschedule. For instance, giving a semaphore will reschedule; why not unlocking the mutex?

@pabigot
Collaborator

pabigot commented Dec 2, 2019

I think this opens up the discussion on cleaning up cooperative thread operation and documentation. My perspective is that any synchronization activity should be documented to "may" yield control. Where k_yield, k_mutex_unlock, and k_sem_give fall into this group (sema_give and yield already do this).

Certainly cleaning up documentation. k_yield() does indicate its behavior. k_sem_give() is not documented to yield control, or if it is it's not done at the function declaration in the header. If it truly can yield control that's potentially a problem; a cursory inspection suggests it does not.

Until now cooperative threads have been most clearly described as not being swap-able until they are made unready (or invoke k_yield()). That means they don't have to lock to protect state from conflicting access.

How much existing code would break if this behavior changed and threads could lose control where they didn't before?

This PR, if the operation is generally agreed on, should update the documentation appropriately to clearly identify this action as one of the actions that may yield control.

Absolutely. My point is that this is a big API change that must be discussed.

@pabigot
Collaborator

pabigot commented Dec 2, 2019

Hi @pabigot, this would not be a problem if there were a clear list of API functions that are expected to reschedule. For instance, giving a semaphore will reschedule; why not unlocking the mutex?

@pdunaj Why do you believe giving a semaphore will reschedule? I don't see it in documentation; it doesn't seem to from the thread (tested); and the documentation clearly indicates it shouldn't from an ISR.

@andyross
Contributor Author

andyross commented Dec 2, 2019

We seem to be paralyzed again. @pabigot: I think a documentation-level audit and specification pass that documents reschedule points is definitely needed. Can we get you on board as a volunteer?

As far as this patch goes: is there clarity on the desired behavior? My first opinion was that it wasn't needed, consensus seemed to be that it was, now maybe things are swinging the other way. Should I just close this?

@nashif
Member

nashif commented Dec 2, 2019

Actually, contrary to what @andyross says here: #20802 (comment) k_sem_give does have a call to z_reschedule(). @andyross ?

@pabigot
Collaborator

pabigot commented Dec 2, 2019

We seem to be paralyzed again. @pabigot: I think a documentation-level audit and specification pass that documents reschedule points is definitely needed. Can we get you on board as a volunteer?

To a point, yes. From what I can tell the material in the porting architecture section was mostly superseded by the material in the thread and scheduling sections (e.g. architecture still references _Swap which is long gone). I'm not comfortable with reworking that material.

But certainly in support of #18970 I can make an attempt at marking functions that can cause a context switch. Right now "reschedule point" is not a defined term, so it will need to be clarified and added to #20959. In support of the overall effort I could use some assistance in understanding possible call contexts and filling out the "Context Terminology" section.

As far as this patch goes: is there clarity on the desired behavior? My first opinion was that it wasn't needed, consensus seemed to be that it was, now maybe things are swinging the other way. Should I just close this?

I think we need to come to a conclusion in #20802 whether it was a bug as originally proposed, or a question to which the answer is "no, it doesn't work that way", or something else.

@andyross
Contributor Author

andyross commented Dec 2, 2019

Yeah, I was just wrong about k_sem_give() it looks like. The broader point is that we do indeed mix paradigms across otherwise similar APIs.

@andyross
Contributor Author

andyross commented Dec 2, 2019

And @pabigot: we still need an answer on this patch, even pending doc rework. Yes or no? We should be able to make a decision on the right behavior here without attempting to get everything right simultaneously.

@pabigot
Collaborator

pabigot commented Dec 2, 2019

Actually, contrary to what @andyross says here: #20802 (comment) k_sem_give does have a call to z_reschedule().

OK, I see that now.

Can somebody explain to me why that isn't an error, given the documented behavior clearly indicates it should not happen?

I can't do anything when the implementation is inconsistent with the thread architecture description.

I'm no on this patch until the expected behavior is well-defined and consistent with this. (Also, why didn't this cause a failure in a test that verified the cooperative thread context switch behavior was intended?)

At a higher level I'm no on a change that allows cooperative threads to be switched out when they did not make themselves unready, and I'm deeply disappointed that there's existing behavior inconsistent with that straightforward behavioral description.

@andyross
Contributor Author

andyross commented Dec 2, 2019

Opened #21111 to track the doc work that blocks changes to this behavior.

@andyross andyross added the Blocked Blocked by another PR or issue label Dec 2, 2019
@nashif
Member

nashif commented Dec 2, 2019

Can somebody explain to me why that isn't an error, given the documented behavior clearly indicates it should not happen?

Looks like a bug. The reschedule was always there for semaphores; however, we used to check whether the calling thread was preemptible, and that check is not there any more (bug no. 1).

Bug no. 2 is that it does not do that at all for mutexes, so we need to add the reschedule for mutexes but check if we are preemptible first.

Bug no. 3 is other objects, but if we fix the reschedule in bug no. 2 it should be fixed for the others as well, as long as they do call reschedule().

@andrewboie
Contributor

@pabigot can we at least agree that k_mutex_unlock() isn't correct for preemptible threads?

Right now, if I understand the code correctly, if a preemptive thread releases a k_mutex that was waited on by a higher priority thread, it will continue to run afterwards for some indeterminate time until some other event causes a reschedule.

This is horrible: it screws up latency because the higher priority thread is not scheduled in as soon as it should be, and it pretty much defeats the point of having priority-inheriting mutexes. I feel this is a bug that must be fixed. And that goes for all of our IPC object APIs; they must behave this way for preemptible threads.

I have no objections for not invoking a reschedule when the k_mutex_unlock() caller is cooperative (or indeed any IPC object). If we want to leave it how it is, fine, let's be sure that it's documented.

@nashif
Member

nashif commented Dec 2, 2019

I'm deeply disappointed that there's existing behavior inconsistent with that straightforward behavioral description.

@pabigot Sadly, the engineers who designed/wrote the current set of kernel APIs were from Wind River and left the project some time ago, so it's hard to find out their intentions, but my feeling is that this k_sem thing is just a bug that didn't get caught in code review.

correct, this is just a bug that went through and was unnoticed for over 1.5 years... I think the documentation as we have it is correct already. The fixes should be straightforward; we have all the logic afaik.

@pabigot
Collaborator

pabigot commented Dec 2, 2019

For whatever it's worth the test I did was https://github.com/pabigot/zephyr/commits/nomerge/coop-example. If k_sem_give() caused a reschedule I would expect it to fail, but it doesn't. So something's just weird.

@andrewboie
Contributor

Actually @nashif @pabigot we need to hold on a second here. Can someone run by me again why we have singled out k_sem_give()??

I don't think this is limited to k_sem. I just went through and looked at the kernel code. It appears that all of our IPC which does a non-blocking wakeup of other threads, except k_mutex(), is calling z_reschedule() without considering whether the caller is cooperative.

Unless I'm missing something, the kernel code is consistent with the exception of k_mutex, and this patch, as-is, makes the API behave like everything else.

| Object type | Co-op reschedule behavior | Preempt reschedule behavior |
|---|---|---|
| k_sem_give | yes | yes |
| k_futex_wake | yes | yes |
| k_msgq_put | yes | yes |
| k_mutex_unlock | no | no |
| k_queue_append/prepend/insert/cancel_wait | yes | yes |
| k_mem_pool_free/free_id | yes | yes |
| k_mem_slab_free | yes | yes |
| k_thread_start | yes | yes |
| k_stack_push | yes | yes |
| k_thread_abort | yes | yes |
| k_mbox_put (async enabled) | yes | yes |
| k_pipe_put (async enabled) | yes | yes |

If you want to contend that the documentation of cooperative thread semantics isn't correct and that our IPC APIs need some clarity, sure, but at least in my view it should not block this patch.

@pabigot
Collaborator

pabigot commented Dec 2, 2019

@andrewboie I singled out k_sem_give() because commentary in #20802 stated it caused a context switch from a low-priority cooperative thread to a higher-priority cooperative thread while k_mutex_unlock() did not.

I've failed to independently confirm that it does, but am deferring to the analysis here that says it does.

If there's evidence that this PR does not make k_mutex_unlock() context-switch away from a lower-priority cooperative thread that invokes it, and we're all agreed that it shouldn't, then I'll remove my block, we can close #20802 as "works as intended", and merge this for its fix of race conditions.

A really good way to provide that evidence is to add a test that demonstrates it.

@dleach02
Member

dleach02 commented Dec 2, 2019

@andrewboie FreeRTOS is one example where the semaphore/mutex will yield if the task waiting on the item is higher priority. I think mbedOS does the same but the code is a little harder to follow.

@pabigot I understand the documentation is at odds with implementation and this PR. I was wondering if there was some other reason you felt this was wrong because documentation can be modified to align with expectations or the code base can be modified to align with documentation.

@andrewboie
Contributor

andrewboie commented Dec 2, 2019

@andrewboie I singled out k_sem_give() because commentary in #20802 stated it caused a context switch from a low-priority cooperative thread to a higher-priority cooperative thread while k_mutex_unlock() did not.

I've failed to independently confirm that it does, but am deferring to the analysis here that says it does.

If there's evidence that this PR does not make k_mutex_unlock() context-switch away from a lower-priority cooperative thread that invokes it, and we're all agreed that it shouldn't, then I'll remove my block, we can close #20802 as "works as intended", and merge this for its fix of race conditions.

A really good way to provide that evidence is to add a test that demonstrates it.

@pabigot I don't think you understand my point.

EVERY kernel IPC API which does a non-blocking wakeup of other threads has a reschedule point. The only omission was k_mutex_unlock(), and this was not anyone's design for this to be a special case, it's just a bug.

You seem to be arguing that because k_mutex_unlock() doesn't currently do this, that this bugged behavior needs to be preserved for co-op threads, and you seem to be basing this solely on the documentation for what co-operative threads are, and not because k_mutex was intended to be this way.

I do not find your argument compelling. We need to just fix k_mutex to work like all of our other IPC instead of adding a special case (merge this patch as-is), and if we want a separate discussion on whether IPC should yield the CPU for co-op threads that's another discussion.

@nashif
Member

nashif commented Dec 2, 2019

@andrewboie FreeRTOS is one example where the semaphore/mutex will yield if the task waiting on the item is higher priority. I think mbedOS does the same but the code is a little harder to follow.

actually freertos uses queueYIELD_IF_USING_PREEMPTION() in this case :)

@pabigot
Collaborator

pabigot commented Dec 2, 2019

This PR is described as "fixing" (which I interpret as "change behavior to match the stated expectations of") #20802 where the complaint was:

I have threads. Two are cooperative. The higher priority one is waiting for mutex. The lower priority one is holding the mutex. After mutex is release low priority one continues execution until sleep.
[...]
Expected behavior
High priority thread should be swapped in as soon as lock is released. i.e. mutex_unlock must be a scheduling point!

This behavior should be expected for pre-emptible threads, but not for cooperative threads. #20802 is asking for a behavior change inconsistent with documented long-term behavior of Zephyr.

However: Regardless of comments in #20802 and here, which I believed, I see no evidence that cooperative threads ever lose control after making higher-priority cooperative threads ready. Which means the behavior is as it's documented to be, and I don't see evidence this PR would change that.

I'm removing my block, but would be nice if this was updated to not claim to fix #20802 which should instead be closed as "works as designed".

If somebody does want cooperative threads to be preemptible, we can argue about that elsewhere.

@pabigot pabigot removed the Blocked Blocked by another PR or issue label Dec 2, 2019
@andyross
Contributor Author

andyross commented Dec 2, 2019

For reference, 2 commits from the past...

Actually the specific mistake seems to have happened in commit 3a0cb2d: I was removing code in the arch layers that was needlessly interpreting "cooperative", but hit the spot in _reschedule() too, and that needed to stay. Prior to that commit a cooperative thread would never swap underneath _reschedule(), after that the preemptibility was ignored.

@andrewboie
Contributor

andrewboie commented Dec 2, 2019

[edit: my understanding was incorrect and the kernel does the right thing with z_reschedule() calls if the caller is co-op]

@andrewboie
Contributor

andrewboie commented Dec 2, 2019

@pabigot @andyross @nashif I might be slow in catching up to something -- do we believe that z_reschedule() already does the right thing if called from co-op context? If so then my original table is incorrect and the true policy as it currently exists is:

| Object type | Co-op reschedule behavior | Preempt reschedule behavior |
|---|---|---|
| k_sem_give | no | yes |
| k_futex_wake | no | yes |
| k_msgq_put | no | yes |
| k_mutex_unlock | no | no |
| k_queue_append/prepend/insert/cancel_wait | no | yes |
| ... | ... | ... |

In which case this patch is probably OK...but we need to verify that if this check exists, it is using the thread's original base priority, I have my doubts that it does.

I'm removing my block, but would be nice if this was updated to not claim to fix #20802 which should instead be closed as "works as designed".

I'd rather we kept it open because it's still busted for preempt threads, no matter what.

@andyross
Contributor Author

andyross commented Dec 2, 2019

do we believe that z_reschedule() already does the right thing if called from co-op context?

It does. Ignore my comment above, the code that was removed was correctly removed, I just fooled myself based on the discussion here.

z_reschedule() will not swap away from a coop thread except to a metairq of higher priority.

@andrewboie
Contributor

In which case this patch is probably OK...but we need to verify that if this check exists, it is using the thread's original base priority, I have my doubts that it does.

I talked to @andyross offline and it looks like this actually does the right thing. So we should be good.

@nashif
Member

nashif commented Dec 3, 2019

I talked to @andyross offline and it looks like this actually does the right thing. So we should be good.

Yes, and we have a test to catch this if it were a bug; looking at the test, it seems to be doing the right thing.

@nashif nashif modified the milestones: v2.2.0, v2.1.0 Dec 3, 2019
@nashif
Member

nashif commented Dec 3, 2019

Changing milestone back to 2.1, @dleach02 let's talk about this tomorrow.

@pdunaj
Collaborator

pdunaj commented Dec 3, 2019

Thank you all for checking this.
I was genuinely surprised not to see the reschedule in mutex. I admit I have not double checked the actual behavior of sem give but only reviewed the code.

Why was I expecting sem give to cause a reschedule? My understanding was that by calling sem give / mutex unlock, the code is passing execution to another thread that is waiting for the resource. Since this call is explicit, I assumed it is a scheduling point that would work the same way as yield.

I think it sums up to what one expects to be the function that yields. I checked what the Linux kernel does for PREEMPT_NONE and PREEMPT_VOLUNTARY. For both it does not reschedule when sem give is called. It seems that I spent too much time working with preemptive kernels...

Going back to Zephyr's documentation, it is almost clear on this. Here are the items I would add to clarify the subject:

A cooperative thread has a negative priority value. Once it becomes the current thread, a cooperative thread remains the current thread until it performs an action that makes it unready.

The meaning is very clear but there is no information about which API calls can move the thread to unready state. I think it would be good to add a link here pointing to where it can be found.

a cooperative thread can voluntarily relinquish the CPU from time to time to permit other threads to execute. A thread can relinquish the CPU in two ways: calling k_sleep()..., calling k_yield()...

This list is not complete, as any API that may result in the thread sleeping can also do it. This may be a good place to put the @andrewboie list.

@andrewboie
Contributor

andrewboie commented Dec 3, 2019

This list is not complete, as any API that may result in the thread sleeping can also do it. This may be a good place to put the @andrewboie list.

@pdunaj I think you may be misunderstanding us. None of these APIs reschedule if the caller is co-op, the implementation of this policy is within z_reschedule() -- it's a no-op if the caller is co-op. We don't need to change the documentation. For callers from co-operative threads, k_mutex_unlock() is just like other IPC APIs -- none of them reschedule after waking other threads.

The reason why this patch is going in is that for preemptible threads, the reschedule needs to happen, and a call to z_reschedule() was missing completely from the k_mutex_unlock() implementation. So there was a bug, but not quite what you were reporting.

For your bug #20802 this is not a fix for you; this is only a fix if the caller is a preempt thread. Your coop threads need to call k_yield() if they want to give up the CPU immediately after unlocking a mutex.

@pdunaj
Collaborator

pdunaj commented Dec 3, 2019

@andrewboie Thanks.

This list is not complete, as any API that may result in the thread sleeping can also do it. This may be a good place to put the @andrewboie list.

@pdunaj I think you may be misunderstanding us. None of these APIs reschedule if the caller is co-op, the implementation of this policy is within z_reschedule() -- it's a no-op if the caller is co-op.

It is possible. But surely k_mutex_lock(&lock, K_FOREVER) (and similar calls) reschedule and so the list pointing only to k_sleep and k_yield is incomplete.
I admit I misunderstood how cooperative threads in Zephyr give back control. I checked Linux kernel sources and documentation of other OSes to see how it is done there. It's clear for me now.

I spent most of my career working with preemptive threads, in kernel and user space. To me, non-preemptive threads were the ones giving up the CPU at rescheduling points. I actually never bothered to look for an explicit list as it was never needed. When I read in the threading documentation that a cooperative thread gives up the CPU when becoming unready, I understood it as "at a rescheduling point". Since the list mentioned above is not complete (e.g. mutex_lock, sem_take, etc. are not there) I expanded it in my head to include such cases. And simply included too much ;)

I am writing it as other people may have the same problem and wrongly assume that there is a rescheduling point where in fact there is none.

We don't need to change the documentation. For callers from co-operative threads, k_mutex_unlock() is just like other IPC APIs -- none of them reschedule after waking other threads.

You don't need to. Still it would be great to include explicit lists of rescheduling points. And point to it in documentation of threads. Especially that it will be short.

The reason why this patch is going in is that for preemptible threads, the reschedule needs to happen, and a call to z_reschedule() was missing completely from the k_mutex_unlock() implementation. So there was a bug, but not quite what you were reporting.

That I understood from the @andyross comments in this PR. Thanks anyway :)

@pabigot
Collaborator

pabigot commented Dec 3, 2019

@pdunaj and others: Please look at #20959 and specifically the formatted document accessible from the PR description. The intent is to clearly define terms that can be used in API descriptions of behavior. In its current form API that can cause a cooperative thread to be made non-Running is "suspendable". (The term may not be perfect, but it has to be balanced among many other terms.)

This is intended to summarize, not replace, existing documentation on thread and scheduling behavior, though some changes to that existing documentation may be necessary to resolve inconsistencies and clarify (e.g. "made unready" is implicitly defined, but should be explicit).

From discussion here I've learned that I will need to add "reschedule-point" and "meta-irq" terms, and clarify the behavior of several generalized state changes related to the three (not two) classes of thread behavior. That should happen this week.

Ultimately these tags are expected to be used as in #21061 which IMO is a better approach than putting a table in the documentation.

@nashif
Member

nashif commented Dec 3, 2019

It is possible. But surely k_mutex_lock(&lock, K_FOREVER) (and similar calls) reschedule and so the list pointing only to k_sleep and k_yield is incomplete.

There is a difference between k_yield() and k_mutex_lock(&lock, K_FOREVER): the first is voluntary; the second is asking for a resource that might or might not be available, so k_mutex_lock does not always mean a reschedule point.

@nashif
Member

nashif commented Dec 3, 2019

@pdunaj and others: Please look at #20959 and specifically the formatted document

@pabigot As I said earlier, the intent of this document is really confusing and I would not make it a point of reference. If you have any proposals to changes in the documentation, please submit those to the existing documentation that we have already, duplicating things in different places is not going to help, there should always be 1 single source of truth.

It is ok to have some terminology page where things are briefly explained, but the above is becoming documentation on its own already.

@pabigot
Collaborator

pabigot commented Dec 3, 2019

There is a difference between k_yield() and k_mutex_lock(&lock, K_FOREVER): the first is voluntary; the second is asking for a resource that might or might not be available, so k_mutex_lock does not always mean a reschedule point.

We're going off topic, but: I think it does, because it invokes reschedule, which is a clear necessary and sufficient condition for being a reschedule point.

However, whether reschedule resumes the current thread or picks a new thread (context-switches) depends on the class (pre-emptible, cooperative, meta-irq) relations of the current thread and the head of the ready queue. It's that matrix that needs to be clearly filled out based on other API characteristics.

It is ok to have some terminology page where things are briefly explained, but the above is becoming documentation on its own already.

As I noted, it's intended to be self-contained for now, but once we understand things the existing documentation will be updated to address gaps and the new material specific to #18970 refactored into something that integrates with it. I don't see how to get to consensus on terms without having a place where they can be discussed together, without having to look at three separate descriptions of thread and scheduler behavior.

@andrewboie
Contributor

But surely k_mutex_lock(&lock, K_FOREVER) (and similar calls) reschedule and so the list pointing only to k_sleep and k_yield is incomplete.

Obviously k_mutex_lock reschedules if the lock is contended, it says that the calling thread waits right in the documentation for k_mutex_lock.

Your bug report was about expecting a reschedule after k_mutex_unlock...this is getting confusing

@dleach02 dleach02 merged commit 7022000 into zephyrproject-rtos:master Dec 3, 2019
Successfully merging this pull request may close these issues.

reschedule not done after mutex unlock