PropGCC COG debug protocol
ersmith
Here's a description of the debug protocol used by propeller-elf-gdb. The implementation is still incomplete (gdb doesn't do multi-cog debugging yet, and only does breakpoints in LMM mode) but it might be useful for other languages on the Propeller.
The intention is that the host can easily distinguish COG packets from text, so the same serial port can be used for debug and communication.
In the description below "host" is the PC, and "device" is the Propeller.
Cog debug protocol:

Packets from the host to the device are always at least 3 bytes long.
- The first byte is 0xFD
- The next byte has the command in the high nybble, and target COG in the low nybble. If the target COG is 0x0F then it is a "broadcast" packet intended for all COGs.
- The next byte is a count of how many bytes are left.

Packets from the device back to the host may be a variable number of bytes, but always start with 0xF8-0xFF followed by the COG id.

The following commands are understood:

- 0x00: DBG_CMD_STATUS: request cog id, flags, and PC
  The device responds with 5 bytes:
    0xF8 II FF YY XX
  where
    II is the cog number
    FF is the current cog status bits (see below)
    XXYY is the current cog PC
  This command is intended to assist with debugging COG code.
  cog status bits:
    bit 0: carry flag (C)
    bit 1: nz flag (!Z)
    bits 4-5: propeller version: 0 = prop1, 1 = prop2
    bit 7: single step bit (LMM kernel only)
    all other bits are reserved

- 0x10: DBG_CMD_RESUME: resume execution
  The device does not immediately respond, but sends a status byte when a breakpoint is hit.

- 0x20: DBG_CMD_READCOG: read cog data
  This request has a length of 3 (3 additional bytes), consisting of the number of bytes to read (1 byte) plus the number of the COG register to start reading at (note that this is a COG register number, not a byte address). The number of bytes to read must be a multiple of 4.
  So for example to read register 16 (the C stack pointer) from COG 1, the host would send:
    0xFD 0x21 0x03 0x04 0x00 0x10
  which is "read 4 bytes from register 16".
  The device responds with
    0xF9 II XX XX XX ...
  where II is the COG number, and XX...XX are the N bytes of data requested by the read.

- 0x30: DBG_CMD_WRITECOG: write cog data
  This request has a length of at least 6 (the COG address followed by 4 bytes of data). The length will be 2 + number of bytes.
  So for example to write 0x12345678 to register 1 in COG 7, the host would send:
    0xFD 0x37 0x06 0x00 0x01 0x78 0x56 0x34 0x12
  The device responds with 0xFA (ack), followed by the COG number, followed by a single byte checksum of data received (at present the checksum is always 0, so it should be ignored).

- 0x40: DBG_CMD_READHUB: read hub data
  Very similar to DBG_CMD_READCOG, except the address is 4 bytes (so the command length is 5: 1 byte data length plus 4 bytes address). The address is a hub byte address, and any number of bytes between 1 and 255 may be sent.

- 0x50: DBG_CMD_WRITEHUB: write hub data

- 0x60: DBG_CMD_QUERYBP: query the breakpoint command
  Requests 4 bytes that may be used as a breakpoint. The device response is
    0xF9 II ww zz yy xx
  where xxyyzzww is a breakpoint instruction.

- 0x70: DBG_CMD_LMMSTEP: single step the LMM interpreter
  The device response happens after the step, and is the same as for DBG_CMD_STATUS.

- 0x80: DBG_CMD_LMMBRK: set the hardware breakpoint address
  This requests that the LMM interpreter break when the PC gets to a certain value. The length is 4 (4 additional bytes), consisting of the 4 byte PC value to break at. The response is 0xFA (ack) followed by the cog number and a checksum of the address (not implemented yet, checksum is always 0).
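To make the framing concrete, here is a rough host-side sketch in Python that builds a few of the request packets described above and decodes a DBG_CMD_STATUS reply. It is not taken from propeller-elf-gdb; the function names are made up for illustration, and the byte order of the 2-byte COG register number (high byte first) is inferred only from the two worked examples in the post.

```python
import struct

HOST_START = 0xFD        # first byte of every host-to-device packet
BROADCAST_COG = 0x0F     # low-nybble value meaning "all COGs"

DBG_CMD_STATUS = 0x00
DBG_CMD_READCOG = 0x20
DBG_CMD_WRITECOG = 0x30


def build_packet(cmd, cog, payload=b""):
    """Frame a host packet: 0xFD, command|cog, payload length, payload."""
    if cmd & 0x0F or not 0 <= cmd <= 0xF0:
        raise ValueError("command must occupy the high nybble only")
    if not 0 <= cog <= 0x0F:
        raise ValueError("cog must fit in the low nybble")
    if len(payload) > 255:
        raise ValueError("payload length must fit in one byte")
    return bytes([HOST_START, cmd | cog, len(payload)]) + bytes(payload)


def read_cog(cog, register, nbytes=4):
    """READCOG payload: 1 count byte, then a 2-byte register number.

    Assumption: register number is sent high byte first, matching the
    post's example (register 16 appears as 0x00 0x10).
    """
    if nbytes % 4 or not 0 < nbytes <= 255:
        raise ValueError("byte count must be a non-zero multiple of 4")
    payload = bytes([nbytes]) + struct.pack(">H", register)
    return build_packet(DBG_CMD_READCOG, cog, payload)


def write_cog(cog, register, value):
    """WRITECOG payload: 2-byte register number, then the 32-bit value
    low byte first (as in the post's 0x12345678 example)."""
    payload = struct.pack(">H", register) + struct.pack("<I", value)
    return build_packet(DBG_CMD_WRITECOG, cog, payload)


def parse_status(resp):
    """Decode the 5-byte status reply 0xF8 II FF YY XX (PC is XXYY)."""
    if len(resp) != 5 or resp[0] != 0xF8:
        raise ValueError("not a status response")
    cog, flags, pc_lo, pc_hi = resp[1], resp[2], resp[3], resp[4]
    return {
        "cog": cog,
        "carry": bool(flags & 0x01),          # bit 0: C
        "nz": bool(flags & 0x02),             # bit 1: !Z
        "prop_version": (flags >> 4) & 0x03,  # bits 4-5
        "single_step": bool(flags & 0x80),    # bit 7
        "pc": (pc_hi << 8) | pc_lo,
    }
```

With these helpers, `read_cog(1, 16)` and `write_cog(7, 1, 0x12345678)` reproduce the two byte sequences given in the description, which is a quick sanity check on the framing.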
Comments
Does that follow some sort of gdb standard?
What baud rates can the Prop code support? (assuming an 80 MHz fSys)
A single serial port will always be compromised, but does give a simple basic level.
The best solution would come from a dual-serial link, with some modest local MCU assisting the Prop code.
e.g. a duplex PC (USB) to dual-UART bridge, with one channel acting as a high-speed, half-duplex 'special UART' that co-operates with gdb.
At the moment the code in PropGCC is fixed to 115200. The protocol itself is half duplex, so higher rates should be easy to achieve.
Not quite sure what you mean there. Packets all start with 0xF8-0xFF, so they don't interfere with ASCII (or indeed with most of UTF-8). It is assumed that text and packets are not interleaved, i.e. once a packet starts, the rest of the transmission until the end of the packet comes from the debugger. I don't think that's too onerous an assumption -- in practice only one COG at a time can drive the serial port, and once a COG is in the debug stub it can send a whole packet before returning.
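As a rough illustration of how a host could separate console text from debug packets on the shared port, here is a small Python sketch. It is only a sketch under assumptions: the name `split_stream` and the `read_len` parameter are invented here, the status reply is taken to be 5 bytes and the ack 3 bytes as in the description, and any marker other than 0xF8, 0xF9 or 0xFA is treated as an error because the post does not define those.

```python
def split_stream(data, read_len=0):
    """Separate console text from debug packets in received bytes.

    read_len is the number of data bytes the host asked for in its last
    READCOG/READHUB/QUERYBP request; it is needed to size a 0xF9 reply,
    since the reply itself carries no length field.
    Returns (text_bytes, list_of_packets).
    """
    text = bytearray()
    packets = []
    i = 0
    while i < len(data):
        marker = data[i]
        if marker < 0xF8:
            # Ordinary text byte: pass it through unchanged.
            text.append(marker)
            i += 1
            continue
        if marker == 0xF8:      # status: marker, cog, flags, PC (2 bytes)
            size = 5
        elif marker == 0xF9:    # read data: marker, cog, N data bytes
            size = 2 + read_len
        elif marker == 0xFA:    # ack: marker, cog, checksum
            size = 3
        else:
            raise ValueError(f"unknown response marker 0x{marker:02X}")
        if i + size > len(data):
            raise ValueError("truncated packet")
        # Skip the whole packet by length, so payload bytes >= 0xF8
        # are never mistaken for the start of another packet.
        packets.append(bytes(data[i:i + size]))
        i += size
    return bytes(text), packets
```

Because packets are skipped by length rather than scanned byte by byte, this relies on the non-interleaving assumption stated above: once a marker byte is seen, the following bytes all belong to that packet.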
I agree, but it's hard enough to get people to use a basic debugger -- getting them to use additional hardware is probably a non-starter.
Better debug hardware would definitely be an interesting addition for the P1V or P2.
Eric
Might need a small change now, to support P2 better?
When the protocol was designed P2 just had 8 cogs. I've given up trying to anticipate what P2 will have -- let's see its features when it arrives.
We've used the CP2105 for two-channel link cases, very similar to a debug design, but of more interest today is the new EFM8UB1 from Silabs. That can emulate a CP2105, but allows user code at the UART pins.
Should be under $1 in moderate volumes, so same/cheaper than the FTDI device used now.
Something that replaces the FTDI, avoids the 'additional hardware' issue longer term.
With basic code I have that running at over 1 MBd, and I have reports of 4.8 Mb/s streaming (half duplex) that I hope to verify soon.
Baud rate granularity on that part is 48M/(2*n), where n >= 3 on receive.
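To see what that granularity means in practice, here is a small Python calculation, assuming only the 48M/(2*n) formula and the n >= 3 limit quoted above. The function name is made up, and any upper limit on n is not modelled because none is stated.

```python
def closest_divisor(target_baud, clock_hz=48_000_000, min_n=3):
    """Round the ideal divisor to the nearest integer (clamped to min_n)
    and report the resulting baud rate and its error in percent."""
    n = max(min_n, round(clock_hz / (2 * target_baud)))
    actual = clock_hz / (2 * n)
    error_pct = 100.0 * (actual - target_baud) / target_baud
    return n, actual, error_pct


n, actual, err = closest_divisor(115200)
print(f"n={n}, actual={actual:.1f} baud, error={err:.2f}%")
```

For the 115200 baud rate PropGCC currently uses, this gives n = 208 and a rate a fraction of a percent above the target; at the n = 3 limit the formula tops out at 8 MBd.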
It can also do 9-bit mode and 1 or 2 stop bits.
Perhaps 9-bit mode could give one way to (optionally) split/share fewer pins for debug?
Yes, but could the records support P2 now, on builds going forward?
e.g. the PC on P2: 20 bits can fit in the present record by using the upper nibble of the COG byte, but supporting 32 bits would need a record extension.
Perhaps the MSB of the II field, which seems to be always sent, could tag 16-bit (P1) or 32-bit (P2, P3) records?
That's backward compatible, but future proof too.
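Purely to illustrate the suggestion, a host-side decoder for such a tagged status record might look like the Python sketch below. This is hypothetical: nothing like it exists in the protocol as described, and the 4-byte little-endian PC layout, the 0x80 tag bit, the 0x7F mask and the function name are all assumptions made for the example.

```python
def parse_status_tagged(resp):
    """Decode a status reply where the MSB of the II byte (hypothetical
    extension) selects a 2-byte (P1) or 4-byte (P2) PC field."""
    if len(resp) < 3 or resp[0] != 0xF8:
        raise ValueError("not a status response")
    wide = bool(resp[1] & 0x80)     # hypothetical tag bit
    cog = resp[1] & 0x7F            # remaining bits: cog number
    flags = resp[2]
    pc_len = 4 if wide else 2
    if len(resp) != 3 + pc_len:
        raise ValueError("wrong length for this record type")
    # Low byte first, as in the existing 16-bit record (YY then XX).
    pc = int.from_bytes(resp[3:3 + pc_len], "little")
    return cog, flags, pc
```

An existing 5-byte P1 record decodes unchanged through this function, which is the backward-compatibility property being argued for.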
Getting debug support working on P2 FPGA image, I would expect to be something of a priority when it is released.
True, I suppose for example 16 COGS may not fit, and P2 may come with 15 COGs (or some other number)
I'm sure it will not be totally wasted, there will be a lot still common on any eventual P2.
I would also not expect much to change around Debug-record support from FPGA image release *
Main question I see is around what PC size to use, and whilst 512K is on-chip, there are enough off-chip modes already to make support of 32b PCs more logical. Get a fast enough link, and the overhead matters even less.
* Of course, On-Silicon-Debug over 1~2 wires would be great, but even if that magically arrived, the PC side would likely handle the same records, as the Silicon-debug link layer would be done by the local MCU.
https://github.com/parallaxinc/propgcc
https://github.com/dbetz/propeller-gcc
https://code.google.com/p/propgcc/w/list
this thread (http://forums.parallax.com/showthread.php/160767-PropGCC-COG-debug-protocol)
and the "propgcc now on github" thread (http://forums.parallax.com/showthread.php/160431-propgcc-now-in-the-Parallax-github)
I'm sure you two know of other helpful links.