Locks on the P2
RossH
Posts: 5,502
in Propeller 2
Hello all
I'm having some trouble using locks on the P2. They seem to operate quite differently to the locks on the P1.
I was aware that the result of the P2 "locktry" instruction (i.e. the carry flag) is apparently the reverse of the P1 "lockset" instruction, but the differences seem to go deeper than that.
On the P1, only one "lockset" instruction - executed on any cog - would return the result that the lock had been acquired. But on the P2 the "locktry" operation seems to return that result every time it is executed on the same cog once the lock is acquired - i.e. the lock seems to belong to the whole cog, rather than to any particular program executing on that cog.
Can anyone confirm that this is correct?
Thanks!
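To make the difference concrete, here is a small host-side model in C of the two behaviours described above. It is only a sketch of what I am observing, not a statement of how the silicon is implemented: the struct and function names (p1_lock, p2_lock, p1_lockset_acquired and so on) are made up for illustration, and the P2 side assumes the lock simply remembers which cog owns it.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* P1 model: a lock is a single bit. LOCKSET sets it and reports the
   previous state, so exactly one caller sees "acquired". */
typedef struct { bool set; } p1_lock;

/* Returns true if the caller acquired the lock (bit was previously clear). */
bool p1_lockset_acquired(p1_lock *l) {
    bool was_set = l->set;
    l->set = true;
    return !was_set;
}

void p1_lockclr(p1_lock *l) { l->set = false; }

/* P2 model (assumption, based on the behaviour seen in practice):
   the lock records the owning cog rather than a plain set/clear bit. */
typedef struct { int owner; } p2_lock;

/* Returns true if the lock is free OR already held by the calling cog. */
bool p2_locktry(p2_lock *l, int cog) {
    assert(cog >= 0);
    if (l->owner == NO_OWNER || l->owner == cog) {
        l->owner = cog;
        return true;
    }
    return false;
}

/* Only the owning cog can release the lock. */
void p2_lockrel(p2_lock *l, int cog) {
    if (l->owner == cog) l->owner = NO_OWNER;
}
```

In this model a second p1_lockset_acquired call fails no matter who makes it, while a second p2_locktry from the same cog succeeds again, which is why two threads on one cog can both believe they hold the lock.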
Comments
https://docs.google.com/document/d/1UnelI6fpVPHFISQ9vpLzOVa8oUghxpI6UpkXVsYgBEQ/edit?usp=sharing
I changed the way they worked to make them more robust for managing debugging. I can't remember the details of the "whys" at the moment.
Yes, I read that - it seemed to confirm what I am seeing in practice - i.e. that locks now belong to the entire cog, and can no longer be used as semaphores to protect critical code segments within a cog. I will try to implement my own.
I guess the more interesting case is when you need protection across COGs at the same time as within a COG. Some multi-core RTOS or something weird like that.
I have tested running 1500 threads on multiple cogs (I used to only be able to run 80 per cog on the P1!) and it works fine - except for the locks.
None of the bit setting instructions have an equivalent though, so you have to use a whole register for each lock.
EDIT: INCMOD/DECMOD can do this too.
No, it is not based on co-routines (if that's what you mean by co-operative). You may have thought so because of the "yield" operations shown in the example. However, these are not necessary, and the program works with them removed - they are included so that a thread that finds it has nothing useful to do can tell the kernel that it can context switch to another thread if there are any waiting (otherwise it does nothing).
But it is also not pre-emptive. There is just a simple round-robin scheduler built into each multi-threading kernel. And yes, on the P1 it works without interrupts. I may modify it to use interrupts on the P2 - in fact, I will need to for the new "NATIVE" mode, when there is no actual kernel that can do the task scheduling.
Ross.
Thanks. I will investigate. However, I have to be able to implement locks without using up cog resources for each one. If you are running thousands of threads and each one needs a lock (for some reason) then you would soon run out of cog resources!
With the P1-style semaphores, I can implement as many thread locks as I need using just one "true" lock and some hub RAM. But this fails on the P2, because the locks are not true semaphores.
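Roughly, the P1-style scheme looks like the sketch below, written as a host-side C model rather than the actual Catalina code. The pool size, the names (thread_lock_try, thread_lock_release, hw_lock_acquire) and the bitmap layout are all hypothetical; hw_lock stands in for the single hardware lock and pool for a block of hub RAM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_THREAD_LOCKS 1024              /* hypothetical pool size */

static bool    hw_lock;                    /* stand-in for the one "true" lock */
static uint8_t pool[MAX_THREAD_LOCKS / 8]; /* stand-in for a hub RAM bitmap */

/* Stand-ins for P1 LOCKSET/LOCKCLR. On real hardware the test-and-set
   is atomic; here it is just modelled with a plain variable. */
bool hw_lock_acquire(void) { bool was = hw_lock; hw_lock = true; return !was; }
void hw_lock_release(void) { hw_lock = false; }

/* Try to take thread lock 'id'. Returns true if it was free. */
bool thread_lock_try(unsigned id) {
    assert(id < MAX_THREAD_LOCKS);
    while (!hw_lock_acquire()) { /* spin until we own the guard lock */ }
    uint8_t mask = (uint8_t)(1u << (id % 8));
    bool was_set = (pool[id / 8] & mask) != 0;
    pool[id / 8] |= mask;                  /* mark it taken either way */
    hw_lock_release();
    return !was_set;
}

/* Release thread lock 'id'. */
void thread_lock_release(unsigned id) {
    assert(id < MAX_THREAD_LOCKS);
    while (!hw_lock_acquire()) { /* spin until we own the guard lock */ }
    pool[id / 8] &= (uint8_t)~(1u << (id % 8));
    hw_lock_release();
}
```

The whole scheme relies on the guard lock admitting exactly one thread at a time. If the guard succeeds for every thread on the owning cog, as the P2 lock appears to, two threads on that cog can be inside the read-modify-write of the bitmap at once.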
There will be a solution - I just don't know what it is yet!
Yes, this might work. I would have to use one hub lock to resolve inter-cog conflicts, plus one register per cog to prevent intra-cog conflicts.
Thanks.
EDIT: Okay, yes, these are the best for the job. BITH both sets the target bit and returns its prior state. Dunno why I thought otherwise now.
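For anyone following along, this is the behaviour being relied on, modelled in C for a single-bit operand. It is a sketch only: the function names are made up, locks stands in for one cog register, and it assumes that on the cog the instruction's set-and-report happens as one step, so no other thread on that cog can get in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of BITH with WCZ on one bit: set the bit and return its
   PRIOR state (which is what comes back in the flags). */
bool bith(uint32_t *reg, unsigned bit) {
    assert(bit < 32);
    bool prior = (*reg >> bit) & 1u;
    *reg |= (uint32_t)1 << bit;
    return prior;
}

/* Model of BITL on one bit: clear the bit. */
void bitl(uint32_t *reg, unsigned bit) {
    assert(bit < 32);
    *reg &= ~((uint32_t)1 << bit);
}

/* A try succeeds when the prior state was 0, i.e. the flags come back low. */
bool cog_lock_try(uint32_t *locks, unsigned n) { return !bith(locks, n); }

/* Releasing is just clearing the bit again. */
void cog_lock_release(uint32_t *locks, unsigned n) { bitl(locks, n); }
```

One 32-bit register gives 32 intra-cog locks this way, and a failed try leaves the bit set, so it does no harm to the current holder.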
Can anyone see any problems, or improve on this?
Thanks!
EDIT: Here's an example using BITH and BITL (limited to 32 locks):
Yes, your bith/bitl solution looks better than mine!
No need to CALL it, even. Just put the instruction wherever it's needed.
And again, if anyone can see something wrong or has an improvement, all suggestions welcome!
EDIT: Oops! Must use wcz with bith. Why?
Plus, there are logical flag operators for TESTB/TESTBN:
I think it should be "if_nz" ... because BITH returns the prior state, not the change of state. C/Z comes back low for a successful try.
Yes, you are correct. Amended.
Here is a more sophisticated multi-threading demo - this program runs 5 multi-threaded kernel cogs (4 started dynamically) and then 50 threads. The threads wander around between the kernel cogs, moving themselves from cog to cog randomly. As usual, this program is compiled for the P2 EVAL board, serial interface, 230400 baud.
The multi-threading support will be part of the next release of Catalina.
Possibly. I'll be able to answer that question better once I have completed the thread support for the new "native" mode ... because there is no kernel in this mode!
The demo program was compiled in "compact" mode, so no pasm. Wait till I finish the other modes (compact mode is always the first one I work on, because it is the easiest).