Yes Dave, I realize that I will need to use locks to implement the proper course of execution. I am just saying that in its current state, it should be pretty much single threaded.
I do believe I've been looking at this from the wrong perspective. Beyond any doubt, the clock timer should be in a continuous loop and therefore it should be run in a seperate cog. However, I no longer believe that the step timer should be in a seperate cog. Each timer has it's own problems to solve and therefore they should be treated differently.
The step timer's only function is to establish a delay in between steps (EDIT: or establish a delay for various heaters), during the normal course of execution. In the original firmware, this delay was set by a function named setTimer within timer.c. The timer in timer.c ran continuously, unless of course it was stopped. With the Propeller, we have the system clock. I now believe that the setTimer function should be moved to dda_queue and look like this:
/// Initialize the next step
void setTimer(uint32_t delay)
{
waitcnt(CNT + delay);
queue_step();
}
or all references to setTimer within the source code should be replaced with this:
waitcnt(CNT + delay);
queue_step();
However, by using the first presented option, execution locks up during the processing of the first G1 command, but I further believe this is where the Propeller locks come into play.
Congratulations Bruce. You are slowly converging to the way the simulator mode runs. The single-threaded mode that I implemented depends on the clock() function to be called periodically. The clock() function is called in the main loop in mendel.c and also from queue_wait() in dda_queue.c. queue_wait() is called from many places. Within the clock() function is a call to sim_time_warp(), which is used to run faster than real-time when time_scale is set to zero. However, it is also used to run at (1/time_scale) times real-time when time_scale is not zero. Normally, time_scale is set to 1 for real-time speed.
So in your implementation you could use the calls to clock to do a similar thing. The drawback to this is that the stepper will stall momentarily if the main loop and the g-code parser do not call the clock() function fast enough. This might happen if an SD read takes longer than normal, or a complex operation is performed to convert a g-code to stepper instructions. Ideally, the stepper code should run in it's own cog, but I suggest getting the code to work first in a single-threaded mode.
Pertaining to my previous post and following my gut, I have altered the project. In this build, I have removed step_timer.c + h. Additionally, I have added the function "void queue_set_timer(uint32_t delay)" to dda_queue.c + h, and all calls within the firmware to the "setTimer" function have been replace with the forementioned function. The function in dda_queue.c + h looks like this:
/// Wait specified time, then call queue_step()
void queue_set_timer(uint32_t delay)
{
waitcnt(CNT + delay);
queue_step();
}
As mentioned in my previous post above, code execution stops during the processing of the first G1:
However, by using the first presented option, execution locks up during the processing of the first G1 command, but I further believe this is where the Propeller locks come into play.
By eliminating step_timer.c + h and it's cog, I am only 388 bytes away from being able to use LOOKAHEAD.
The following configuration is set in config.h:
LOOKAHEAD disabled
DEBUG enabled
GCODE_SD enabled
and the following debug flags are set within the main function:
####WARNING######
If you decide to run this project, please ensure that you set the proper IO pins in config.h!!!
Additionally, I am also using the sample GCODE file attached in post #139 for debugging purposes. This GCODE file only has G1 and M114 commands and it should be useful for nailing down some of the remaining problems within this project.
In an effort to make this project concise and gain extra space for LOOKAHEAD, I have further eliminated sergcode.c + h, in this build. Serial GCODE input will now rely upon the SimpleText getChar() function. However, parse_serial(void) within Forger.c, may require more work, but I will not know until I test serial GCODE input.
With the exception of removing sergode.c + h, this project is the same as above. Please ensure that you set config.h appropriately as stated in my last post.
Oh, and the project is still overflowing by 388 bytes, with LOOKAHEAD enabled, though disabled for this build
Bruce, if you are going to use serial input you probably want to use a serial driver that runs in its own cog. I assuming SimpleText uses the default SimpleSerial driver. This driver will look at the input pin only when you perform a read function, which means you will probably miss the start of input characters when doing g-code stuff. If you look at my simulator/serial_sim.c file you'll see how I did it.
I noticed your code is currently using the SD file system. If you remove this you can save quite a bit of memory. You could write a function that includes your g-code test data as an array of string to replace the SD functions that you are using to access the g-code file.
Bruce, if you are going to use serial input you probably want to use a serial driver that runs in its own cog. I assuming SimpleText uses the default SimpleSerial driver. This driver will look at the input pin only when you perform a read function, which means you will probably miss the start of input characters when doing g-code stuff. If you look at my simulator/serial_sim.c file you'll see how I did it.
Thanks for the heads up on that.... I still have the files and previous posts to reestablish full duplex serial, but I will wait until I get to that point.
I noticed your code is currently using the SD file system. If you remove this you can save quite a bit of memory. You could write a function that includes your g-code test data as an array of string to replace the SD functions that you are using to access the g-code file.
The machine I am building will rely upon SD input instead of serial input. I just added the serial input option to make it more versatile.
It looks like you commented out all the ATOMIC_START/STOP. You are running with multiple cogs, and the ATOMIC stuff is needed. That could be killing things.
I am not 100% certain, but I think this comment was incorrect. I think ATOMIC_START, ATOMIC_END, and the MEMORY_BARRIER() issues, are all going to be AVR specific, and relate to the memory barrier issue. Please refer to memory_barrier.h in the original firmware download. I think the important issues are the use of the sei() and cli() functions, because these functions enable and disable the interrupts.
As it applies to my last upload, here is a list which contains all the remaining discrepancies and locations pertaining interrupts and memory problems. I am also attaching a copy.
It looks like you commented out all the ATOMIC_START/STOP. You are running with multiple cogs, and the ATOMIC stuff is needed. That could be killing things.
I am not 100% certain, but I think this comment was incorrect. I think ATOMIC_START, ATOMIC_END, and the MEMORY_BARRIER() issues, are all going to be AVR specific, and relate to the memory barrier issue. Please refer to memory_barrier.h in the original firmware download. I think the important issues are the use of the sei() and cli() functions, because these functions enable and disable the interrupts.
The atomic operations are required if you run multi-threaded. They are not specific to the AVR. The memory barrier stuff tells the compiler that the following instructions need to be executed in order. I don't think GCC needs this, but I may be wrong. sei() disables interrupts as you said. If you run with multiple cogs you are going to need to simulate this. sei() is implemented in the simulator by using a flag that is checked by the timer routines.
If you use multiple cogs your code will not work properly unless you implement all these features.
The atomic operations are required if you run multi-threaded. They are not specific to the AVR. The memory barrier stuff tells the compiler that the following instructions need to be executed in order. I don't think GCC needs this, but I may be wrong. sei() disables interrupts as you said. If you run with multiple cogs you are going to need to simulate this. sei() is implemented in the simulator by using a flag that is checked by the timer routines.
If you use multiple cogs your code will not work properly unless you implement all these features.
Okay..... I was thinking that the Propeller locks would replace sei() and cli(). Is this incorrect?
As far as I'm know a background loop and an interrupt routine are equivalent to two threads of code running at the same time. They probably share some data. That means they need a mechanism to ensure they don't both touch the same data at the same time.
For the background and interrupt case the enabling and disabling of interrupts does that.
For the multiple thread case, as in multiple COGs on the Prop they need locks.
All this is abstracted away in those ATOMIC_START/STOP functions/macros. They are not AVR specific issues. As such they are the place where the locks should be used. Thus leaving the code that uses ATOMIC_START/STOP unchanged.
That is the name of the game, make as few changes to the higher level code as possible. Push the changes down to the hardware abstractions.
This build reestablishes support for Full Duplex Serial, as per Dave's recommendation. Additionally, this build also includes two new files, which are locks.c + h. A call is made from the init() function of Forger.c to the locks_init() function of locks.c. The locks_init() function establishes two new locks and provides two external lock IDs, for use throughout the project, although not yet implemented in the remaining source code.
The following configuration is set in config.h:
LOOKAHEAD disabled
DEBUG enabled
GCODE_SD enabled
and the following debug flags are set within the main function:
####WARNING######
If you decide to run this project, please ensure that you set the proper IO pins in config.h!!!
Additionally, I am also using the sample GCODE file attached in post #139 for debugging purposes. This GCODE file only has G1 and M114 commands and it should be useful for nailing down some of the remaining problems within this project.
Using locks may be sufficient, but the simulator mode also simulates an interrupt enable flag. With a single core, if my main code disables interrupts I can guarantee that the ISR is not active. However, with multiple cores there is no guarantee that another core is not in the middle of running its ISR routine. I'm not convinced that the Teacup code handles multiple cores correctly. If it were me, I would run the Teacup code as a single thread using the polling capability that it already has. If the single threaded method doesn't generate smooth enough motion I would move some portion of the stepper code to its own cog so that it wouldn't be impacted by the main code.
However, with multiple cores there is no guarantee that another core is not in the middle of running its ISR routine.
I don't understand that. Particularly in the Prop there are no interrupts. That code running in another COG is your "interrupt routine". It starts up whenever some input condition is met.
I think "background" and "ISR" are equivalent to 2 cores/cogs running. Another "ISR" is just another COG. How am I misunderstanding here?
Using locks may be sufficient, but the simulator mode also simulates an interrupt enable flag. With a single core, if my main code disables interrupts I can guarantee that the ISR is not active. However, with multiple cores there is no guarantee that another core is not in the middle of running its ISR routine. I'm not convinced that the Teacup code handles multiple cores correctly. If it were me, I would run the Teacup code as a single thread using the polling capability that it already has. If the single threaded method doesn't generate smooth enough motion I would move some portion of the stepper code to its own cog so that it wouldn't be impacted by the main code.
One thing is for sure, I am definitely swimming in unfamiliar water At the moment, I am just exploring various options, with the little that I know about the subject so far. I believe that memory_barrier.h can be duplicated and therefore I should be able to duplicate ATOMIC_START and ATOMIC_END. Although C++ specific, take a peek at atomicity.h in the SimpleIDE folder. At the bottom of this file is this:
I don't understand that. Particularly in the Prop there are no interrupts. That code running in another COG is your "interrupt routine". It starts up whenever some input condition is met.
I think "background" and "ISR" are equivalent to 2 cores/cogs running. Another "ISR" is just another COG. How am I misunderstanding here?
When I say ISR, I am referring to the AVR ISR functions. In the AVR they are called when an interrupt occurs. On the Prop, they are called from another cog when the system counter reaches a target value. One of the ISR routines just updates some variables that track the time. The other ISR pops things off of a queue and generates stepper motor pulses.
The main thread puts things onto the queue, so it is important that the ISR that pops things off the queue isn't running. On a single-core chip I can guarantee that this code isn't running by disabling interrupts. The act of disabling interrupts means that the processor is running the main thread, and after interrupts are disabled only the main thread will continue to run. As you said, the Prop does not have interrupts, so there is no direct way for a cog to prevent another cog from running. And even if I could pause a cog, it may be in the middle of fiddling with the queues.
Basically, a single-core processor running multiple threads and ISRs is not the same a multi-core system where threads and ISRs are mapped to cores.
I was agreeing with everything you said up to the point of "Basically, a single-core processor running multiple threads and ISR is not the same a multi-core system where threads and ISRs are mapped to cores."
I am totally convinced that they are the same.
Why? Because from the point of view of the code they have no idea if they are running or not. Or when they might run. Or where they are running.
At the end of the day, no matter if we talk ISR's or threads or actual parallel cores running code, the name of the game is the same, they shall not read/scribble on the same data at the same time.
Hence the need for "critical regions" in all cases.
What am I missing here? Aside from the practical differences of achieving "critical regions" which is what those ATOMICxxx are abstracting away.
I am in no way suggesting that I am correct, but this is the way I see it.....
The ISRs are nothing more than a timer routine that may or may not establish or remove interrupt routines. The interrupt routines are equivalent to a lock, to ensure data security.
Frequently, interrupts are being disabled for periods of time in order to perform certain operations without being disturbed; see Problems with reordering code for things to be taken into account with respect to compiler optimizations.
You typically turn off interrupts when you are doing a task that should not be interrupted. One example in most AVR datasheets is reading/writing 16-bit values like TCNT1. Another example might be when you are executing several instructions to push a datum on the front of a FIFO stack and you don't want an interrupt to occur that might try to pop a datum off the FIFO during the middle of your storage instructions.
The ISR are timers. There are two of them A and B. When the timers reach their mark. Code is excuted. Whether these timers operate in seperate threads, well that is another issue.
Why? Because from the point of view of the code they have no idea if they are running or not. Or when they might run. Or where they are running.
At the end of the day, no matter if we talk ISR's or threads or actual parallel cores running code, the name of the game is the same, they shall not read/scribble on the same data at the same time.
The main thread, sets the timers (ISRs), for the sake of discussion, let's say these timers are 555s, when these timers trigger, they send a message "time up, now what", the code attached to the time trigger executes.
I guess my main point is that I can get exclusive access to the machine by disabling interrupts on a a single core device. On multi-core devices with memory caches there is also the issue of cache coherency. So other than that, I agree that they are the same.
The interrupt routines are equivalent to a lock, to ensure data security.
No. Interrupts routines are a way to immediately divert control from whatever you are doing at the time and start doing something else.
That "something else" can Smile all over the data you were working with at the time.
When that "something else", the interrupt handler, has finished you get back to whatever you were doing. But now that data you were working with is different. All is chaos and everything fails.
For this reason we disable interrupts before accessing data that we know the interrupt routine also accesses. When we are done with that we enable interrupts so that "something else" can happen.
Interrupts are the total opposite of "data security" that way. Unless you take care.
I maintain that having two processes in two cogs is the same problem. They can both want to Smile on the same data at the same time.
But in this case we use locks to ensure mutual exclusion rather than disable/enable interrupts.
As such, you should not replace the old calls to ATOMICwhatever in the main code but rather implement locks inside the ATOMICwhatver.
That way the high level code that uses that stuff does not need to change.
Never mind what MFC is doing. That is an even driven system where all the messy details of interrupts and such are abstracted away from the programmer.
Just because I haven't said anything lately, doesn't mean that I don't appreciate your input or that I have been idle. Ever since the last build I have been going in circles.
The initial source of my grief:
/// Wait specified time, then call queue_step()
void queue_set_timer(uint32_t delay)
{
waitcnt(CNT + delay);
queue_step();
}
However, I believe it is causing me grief because it is actually doing something that is causing me further problems. The code is now going through the steps, at least to a certain point and then it abruptly stops. I have installed locks at the discussed locations, but it has not made an obvious difference in how the program runs, so I am uncertain if it is right or wrong.
I do believe the true source of my grief is from the following source in timer.c. This timer as you know is running in a seperate cog and I believe the call to dda_clock is screwing things up. The second lock ID is used strictly in dda_clock() and dda_clock() is only called from the timer.
/// set state of the clock flags
void timer(void *par)
{
while(1)
{
waitcnt((CLKFREQ / 1000 * 2) + CNT);
clock_tick();
dda_clock();
}
}
Dave, I am almost ready to go back and read what you said on how to implement the update clock.
Bruce, as I said in the other thread, the ISR code called by the timer routines doesn't use the ATOMIC macros. The ISR code assumes that it has sole access to the machine, which is valid for a single core. The non-ISR code gets sole access to the machine by performing the ATOMIC_START macro. To simulate this on the Prop you would need to add the ATOMIC macros to the timer code. You could do the following:
/// Wait specified time, then call queue_step()
void queue_set_timer(uint32_t delay)
{
waitcnt(CNT + delay);
ATOMIC_START
queue_step();
ATOMIC_END
}
/// set state of the clock flags
void timer(void *par)
{
while(1)
{
waitcnt((CLKFREQ / 1000 * 2) + CNT);
ATOMIC_START
clock_tick();
dda_clock();
ATOMIC_END
}
}
This is not the ideal solution, but it should solve any parallel access problems with critical variables.
The ISR code assumes that it has sole access to the machine, which is valid for a single core. The non-ISR code gets sole access to the machine by performing the ATOMIC_START macro.
I had not thought of it that way.
One of the biggest problems I have is correctly defining ATOMIC_START and ATOMIC_END.
Comments
The step timer's only function is to establish a delay in between steps (EDIT: or establish a delay for various heaters), during the normal course of execution. In the original firmware, this delay was set by a function named setTimer within timer.c. The timer in timer.c ran continuously, unless of course it was stopped. With the Propeller, we have the system clock. I now believe that the setTimer function should be moved to dda_queue and look like this:
or all references to setTimer within the source code should be replaced with this:
However, by using the first presented option, execution locks up during the processing of the first G1 command, but I further believe this is where the Propeller locks come into play.
So in your implementation you could use the calls to clock to do a similar thing. The drawback to this is that the stepper will stall momentarily if the main loop and the g-code parser do not call the clock() function fast enough. This might happen if an SD read takes longer than normal, or a complex operation is performed to convert a g-code to stepper instructions. Ideally, the stepper code should run in it's own cog, but I suggest getting the code to work first in a single-threaded mode.
As mentioned in my previous post above, code execution stops during the processing of the first G1:
By eliminating step_timer.c + h and it's cog, I am only 388 bytes away from being able to use LOOKAHEAD.
The following configuration is set in config.h:
LOOKAHEAD disabled
DEBUG enabled
GCODE_SD enabled
and the following debug flags are set within the main function:
####WARNING######
If you decide to run this project, please ensure that you set the proper IO pins in config.h!!!
Additionally, I am also using the sample GCODE file attached in post #139 for debugging purposes. This GCODE file only has G1 and M114 commands and it should be useful for nailing down some of the remaining problems within this project.
With the exception of removing sergode.c + h, this project is the same as above. Please ensure that you set config.h appropriately as stated in my last post.
Oh, and the project is still overflowing by 388 bytes, with LOOKAHEAD enabled, though disabled for this build
I noticed your code is currently using the SD file system. If you remove this you can save quite a bit of memory. You could write a function that includes your g-code test data as an array of string to replace the SD functions that you are using to access the g-code file.
Thanks for the heads up on that.... I still have the files and previous posts to reestablish full duplex serial, but I will wait until I get to that point.
The machine I am building will rely upon SD input instead of serial input. I just added the serial input option to make it more versatile.
I am not 100% certain, but I think this comment was incorrect. I think ATOMIC_START, ATOMIC_END, and the MEMORY_BARRIER() issues, are all going to be AVR specific, and relate to the memory barrier issue. Please refer to memory_barrier.h in the original firmware download. I think the important issues are the use of the sei() and cli() functions, because these functions enable and disable the interrupts.
As it applies to my last upload, here is a list which contains all the remaining discrepancies and locations pertaining interrupts and memory problems. I am also attaching a copy.
If you use multiple cogs your code will not work properly unless you implement all these features.
Okay..... I was thinking that the Propeller locks would replace sei() and cli(). Is this incorrect?
As far as I'm know a background loop and an interrupt routine are equivalent to two threads of code running at the same time. They probably share some data. That means they need a mechanism to ensure they don't both touch the same data at the same time.
For the background and interrupt case the enabling and disabling of interrupts does that.
For the multiple thread case, as in multiple COGs on the Prop they need locks.
All this is abstracted away in those ATOMIC_START/STOP functions/macros. They are not AVR specific issues. As such they are the place where the locks should be used. Thus leaving the code that uses ATOMIC_START/STOP unchanged.
That is the name of the game, make as few changes to the higher level code as possible. Push the changes down to the hardware abstractions.
This build reestablishes support for Full Duplex Serial, as per Dave's recommendation. Additionally, this build also includes two new files, which are locks.c + h. A call is made from the init() function of Forger.c to the locks_init() function of locks.c. The locks_init() function establishes two new locks and provides two external lock IDs, for use throughout the project, although not yet implemented in the remaining source code.
The following configuration is set in config.h:
LOOKAHEAD disabled
DEBUG enabled
GCODE_SD enabled
and the following debug flags are set within the main function:
####WARNING######
If you decide to run this project, please ensure that you set the proper IO pins in config.h!!!
Additionally, I am also using the sample GCODE file attached in post #139 for debugging purposes. This GCODE file only has G1 and M114 commands and it should be useful for nailing down some of the remaining problems within this project.
I think "background" and "ISR" are equivalent to 2 cores/cogs running. Another "ISR" is just another COG. How am I misunderstanding here?
One thing is for sure, I am definitely swimming in unfamiliar water At the moment, I am just exploring various options, with the little that I know about the subject so far. I believe that memory_barrier.h can be duplicated and therefore I should be able to duplicate ATOMIC_START and ATOMIC_END. Although C++ specific, take a peek at atomicity.h in the SimpleIDE folder. At the bottom of this file is this:
The main thread puts things onto the queue, so it is important that the ISR that pops things off the queue isn't running. On a single-core chip I can guarantee that this code isn't running by disabling interrupts. The act of disabling interrupts means that the processor is running the main thread, and after interrupts are disabled only the main thread will continue to run. As you said, the Prop does not have interrupts, so there is no direct way for a cog to prevent another cog from running. And even if I could pause a cog, it may be in the middle of fiddling with the queues.
Basically, a single-core processor running multiple threads and ISRs is not the same a multi-core system where threads and ISRs are mapped to cores.
I was agreeing with everything you said up to the point of "Basically, a single-core processor running multiple threads and ISR is not the same a multi-core system where threads and ISRs are mapped to cores."
I am totally convinced that they are the same.
Why? Because from the point of view of the code they have no idea if they are running or not. Or when they might run. Or where they are running.
At the end of the day, no matter if we talk ISR's or threads or actual parallel cores running code, the name of the game is the same, they shall not read/scribble on the same data at the same time.
Hence the need for "critical regions" in all cases.
What am I missing here? Aside from the practical differences of achieving "critical regions" which is what those ATOMICxxx are abstracting away.
The ISRs are nothing more than a timer routine that may or may not establish or remove interrupt routines. The interrupt routines are equivalent to a lock, to ensure data security.
The main thread, sets the timers (ISRs), for the sake of discussion, let's say these timers are 555s, when these timers trigger, they send a message "time up, now what", the code attached to the time trigger executes.
That "something else" can Smile all over the data you were working with at the time.
When that "something else", the interrupt handler, has finished you get back to whatever you were doing. But now that data you were working with is different. All is chaos and everything fails.
For this reason we disable interrupts before accessing data that we know the interrupt routine also accesses. When we are done with that we enable interrupts so that "something else" can happen.
Interrupts are the total opposite of "data security" that way. Unless you take care.
I maintain that having two processes in two cogs is the same problem. They can both want to Smile on the same data at the same time.
But in this case we use locks to ensure mutual exclusion rather than disable/enable interrupts.
As such, you should not replace the old calls to ATOMICwhatever in the main code but rather implement locks inside the ATOMICwhatver.
That way the high level code that uses that stuff does not need to change.
Where nIDEvent would equal ISR_A or ISR_B
Never mind what MFC is doing. That is an even driven system where all the messy details of interrupts and such are abstracted away from the programmer.
The quotes from post #168 are straight from the Arduino website
I see nothing in those quotes that contradicts what I have been saying. Please do explain.
Documentation supports that statement. So setting a lock should be equivalent to disabling interrupts.
EDIT: and vice-versa
Just because I haven't said anything lately, doesn't mean that I don't appreciate your input or that I have been idle. Ever since the last build I have been going in circles.
The initial source of my grief:
However, I believe it is causing me grief because it is actually doing something that is causing me further problems. The code is now going through the steps, at least to a certain point and then it abruptly stops. I have installed locks at the discussed locations, but it has not made an obvious difference in how the program runs, so I am uncertain if it is right or wrong.
I do believe the true source of my grief is from the following source in timer.c. This timer as you know is running in a seperate cog and I believe the call to dda_clock is screwing things up. The second lock ID is used strictly in dda_clock() and dda_clock() is only called from the timer.
Dave, I am almost ready to go back and read what you said on how to implement the update clock.
I had not thought of it that way.
One of the biggest problems I have is correctly defining ATOMIC_START and ATOMIC_END.
The current definition contains SREG and cli();