Software Defined Radio - thoughts on P2 application — Parallax Forums

Software Defined Radio - thoughts on P2 application

bob_g4bbybob_g4bby Posts: 412
edited 2020-07-10 17:50 in Propeller 2
Just thinking aloud here,

Over the past 20 years, the amateur radio scene has seen a disruptive change in hardware design brought about by software defined radio. Hams pioneered it in fact. Here's an article that introduced it to many of us https://sites.google.com/site/thesdrinstitute/A-Software-Defined-Radio-for-the-Masses. No more superheterodyne circuitry.

The sdr principle is simple: On receive a small part of the radio spectrum (say 3.7-3.72 MHz) is mixed with an RF oscillator (say at 3.7MHz). After low-pass filtering the difference signal spans 0-8kHz or maybe as high as 0-192kHz. Actually two signals are produced, 90deg out of phase with each other using a Tayloe detector (Dan Tayloe perfected it). These two signals are digitised by a cheap stereo, low-noise 24bit sound card chip into two streams of integers, usually labelled I (in phase) and Q (quadrature).
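As a rough illustration of the mixing step described above, here is a minimal sketch in plain Python. It is not a model of a real Tayloe detector: an ideal multiplier stands in for the switching mixer, a simple block average stands in for the low-pass filter, and the 40 MHz simulation rate, the 3.705 MHz test carrier and all function names are assumptions chosen for the example.

```python
import math

def quadrature_mix(samples, fs_hz, lo_hz):
    """Multiply a real sample stream by cos and -sin of the local oscillator."""
    i_out, q_out = [], []
    for n, x in enumerate(samples):
        phase = 2.0 * math.pi * lo_hz * n / fs_hz
        i_out.append(x * math.cos(phase))
        q_out.append(-x * math.sin(phase))
    return i_out, q_out

def block_average(xs, length):
    """Crude low-pass filter: mean of each block of `length` samples (this also decimates)."""
    return [sum(xs[k:k + length]) / length
            for k in range(0, len(xs) - length + 1, length)]

# Illustrative numbers only (assumptions, not from the thread's hardware).
FS = 40_000_000.0   # simulation sample rate
LO = 3_700_000.0    # RF oscillator, as in the 3.7 MHz example
RF = 3_705_000.0    # test carrier 5 kHz above the oscillator
BLOCK = 400         # 400 samples at 40 MHz = 10 us, so the output rate is 100 kHz

rf = [math.cos(2.0 * math.pi * RF * n / FS) for n in range(80_000)]
i_mix, q_mix = quadrature_mix(rf, FS, LO)
i_bb = block_average(i_mix, BLOCK)
q_bb = block_average(q_mix, BLOCK)

# Estimate the baseband frequency from the average phase step between I/Q pairs.
phases = [math.atan2(q, i) for i, q in zip(i_bb, q_bb)]
steps = [(b - a + math.pi) % (2.0 * math.pi) - math.pi
         for a, b in zip(phases, phases[1:])]
est_hz = sum(steps) / len(steps) * (FS / BLOCK) / (2.0 * math.pi)
print(f"estimated difference frequency: {est_hz:.0f} Hz")
```

The I and Q streams together carry the sign of the difference frequency, which is why two signals 90 degrees apart are produced rather than one.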

From there on, the signal path is all digital until the sound output dac, amplifier and speaker. The platforms that hams have used range from multicore PCs, through RasPIs down to DSPIC33 chips. A pal of mine in northern England has developed an HF radio with the DSPIC33 - he can be heard most weeks chatting with other hams all over Europe. He gets good signal quality reports. He's also a keen cave explorer, and has developed 'cave radios' for surface-to-ground communications and for cave surveying. (VLF signals penetrate rock to a degree). He runs his radios at between 8k - 16ksamples/second - adequate for the 3kHz wide signals used on-air.

He's programmed the dsp on a 70MHz dspic33 in C, on a sample-by-sample basis - he doesn't process arrays of samples due to compute/memory constraints. The DSPIC is used to control the radio hardware as well - tuning, band change etc. He's recently switched to using a dual-core DSPIC and split the tasks. One core does the hardware control and the other the DSP. He still complains that the cpus are only 16 bit and that requires all sorts of maths tricks to avoid signal degradation. Quite a lot of original work and also looking back to the older dsp techniques for ideas - he's very clever to get so much out of that cpu.

So it raises the question: what could be achieved with a 200MHz 32 bit P2 with cordic engine and dsp instructions? The signal processing breaks down into a number of stages - maybe around six - for which the COGs would seem to be a natural fit, data being handshaked between the stages in a pipeline. Probably an external A/D would still be needed for receiver input because of the very low noise requirement, but the P2's internal converters could be used for mic input, speaker output and possibly transmit output (the latter then being mixed up to the required RF band by other circuitry). DSP sample rates might improve on the 16ksample/s that my pal achieved. 32bit arithmetic with 64 bit intermediate results would be much easier to manage. There might be enough processing time spare to do a spectrum display - always good for searching for signals.

Most of the DSP primitives would be assembly code for speed. Gluing the primitives together to make the complete radio - TAQOZ every time - forth is great for interactive prototyping. To provide stimuli and measure responses, the programming language LabView for Windows is fantastic and the Community edition is free for non-commercial use. It suits hardware engineers very well; like forth you get quick results. High speed serial link(s) could link P2 and PC during development to exercise the dsp in real-time. For a standalone radio, then there are plenty of small displays we know the propeller can drive and would make good front panels.

The latest SD radios use very high speed A/Ds to convert the whole of the short wave band to a digital stream which is then processed by a large fpga. Both these devices are power hogs, expensive and a dog to develop - the fpga tools are very slow. Thus this technique is not the best for portable narrowband equipment, but great for high speed data and the intelligence community!

If we stick to the lower speed sdr technique, the P2 is simpler to program than most cpus and well suited for battery operation, 2-3W at most. Not forgetting it would save quite a few other parts too.

I'm looking forward to seeing how much 'radio' can be squeezed into the P2 - I'm sure it's more than people think. Any other radio heads - have you been thinking about this too?

Cheers, Bob G4BBY

Comments

  • RaymanRayman Posts: 14,640
    Have no idea, but sounds simple enough!

    This is AM modulation, sounds like, right?

    Wonder if you could generate the 3.7 MHz with the Prop2... Well, I'm sure you could, but maybe have to pick a special crystal frequency...

    ADC with 20 kHz bandwidth sounds like should work...

    But, googling, I found this right away:
    https://www.analog.com/media/en/training-seminars/design-handbooks/Software-Defined-Radio-for-Engineers-2018/SDR4Engineers.pdf

    It talks about this ADF7030 chip. Looks like it does everything for $5.
    Hard to beat that...

    Looks like you could control it over SPI with your P2 though...
  • RaymanRayman Posts: 14,640
    Wait, that chip is at a different frequency, I think...
    Actually, Mouser doesn't seem to have anything that works at 3.7 MHz...
  • RaymanRayman Posts: 14,640
    Actually, I remember something now from last time I toyed with looking at this...

    Can the RTL-SDR work for this?
    This one appears like it could: https://www.amazon.com/RTL-SDR-Blog-RTL2832U-Software-Defined/dp/B011HVUEME

    Would be a LOT easier, if it did...
  • RaymanRayman Posts: 14,640
    For 80 m, looks like you have to do a hardware mod for it to work:
    https://www.rtl-sdr.com/rtl-sdr-direct-sampling-mode/
  • PropGuy2PropGuy2 Posts: 360
    edited 2020-07-09 18:55
    Actually I have been working on an HF / SSB radio design for the past year or so. I completely ran out of memory with the P1, and am now ramping up with the P2 chip. It is surprising how difficult and expensive it is to design an SSB rig. The software time and component cost rise quickly. This is especially true if you are planning a medium to large production run, mostly to compete with the ICOM, Kenwood and Elecraft radios that start at $2000, $3000 or even $4000. The radio I am working on will have a price point of between $400 and $500, using the P2 chip and traditional superheterodyne circuitry. To get there and to beat out the Chinese, it will be minimal: SSB & CW, 50 watts out and easy to use (think Heathkit and Knight-Kit back in the day). Would anyone buy a radio like this, or is this a pipe dream for today's hams, beginner or not?
  • bob_g4bbybob_g4bby Posts: 412
    edited 2020-07-09 19:58
    You most probably wouldn't attempt direct RF generation with the P2, there'd be too many low level spurious signals. You'd use the P2 to generate a baseband signal that is up-converted with a low-noise RF oscillator to the required band. We know this works. Popular RF oscillators tunable by the P2 over a wide frequency range are the SI5351 (cheap) and the earlier SI570 (a bit more money).

    My pal has developed code to receive and transmit AM, FM, single sideband, double sideband, CW and so on. That's 'just' down to the maths.

    I like the ACORN - the 50W output is a plus, many such radios are low power only. Great for portable. The Chinese transceivers are difficult competition even if some of them have poor signal purity.

    I use a Multus Proficio, bought as a semi-populated pcb together with a 7" Chinese tablet - see www.qrz.com and type in my callsign G4BBY. It only delivers 6W, so I added a 100W amplifier kit.


  • The P2 ADC can sample 3.7MHz directly. I'll make no claims whether the sensitivity is good enough for radio reception. :wink: It works well for a 1Vpp video signal.

    The P2 should be able to generate a fairly clean RF signal directly.* The streamer has Goertzel mode, which was intended to output sine waves and observe the output. There is also the chroma modulator, which is a quadrature modulator. Both of these operate like a Direct Digital Synthesizer.

    *This assumes you don't use the P2 PLL. Its phase noise is not quite good enough. The P2 could use an 80-125MHz crystal oscillator to keep the phase noise under control. Unfortunately, the reduced clock rate will reduce ADC performance. But we can run 4 ADCs in parallel. There is a chance that the DAC is not up to serious RF usage.

    Goertzel mode has a slight annoyance for RF receiver usage: the sampling rate will be based on a whole number of cycles of the digital oscillator. So when changing tuning frequencies, the sample rate will change as well.
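To put a number on that annoyance, here is a minimal sketch, assuming each result integrates a fixed whole number of oscillator cycles. The 100-cycle integration length and the function name are hypothetical, not taken from the P2 documentation.

```python
def result_rate_hz(oscillator_hz, cycles_per_result):
    """Results per second when each result spans a whole number of oscillator cycles."""
    return oscillator_hz / cycles_per_result

CYCLES = 100  # hypothetical integration length, in oscillator cycles

rate_a = result_rate_hz(3_700_000, CYCLES)  # tuned to 3.700 MHz
rate_b = result_rate_hz(3_710_000, CYCLES)  # retuned 10 kHz higher
print(rate_a, rate_b)
```

Under these assumptions, a 10 kHz retune shifts the output sample rate by 100 samples/s, so any downstream filter designed for a fixed rate would need its coefficients or a resampler adjusted.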
  • A lot of sdr 'application envelope exploration' to be done then! Hams are notoriously ingenious at repurposing chips for radio use. e.g. The Hermes-lite hf band sdr is based on a broadband modem chip, it makes a good performer too. The smartpin systems will have many people thinking hard.

    As regards P2 a/d for receiver input - it does need trying, not to dismiss it before that. I wonder whether the P2's extensive power supply system and decoupling network helps tame noise levels enough?

    Running ADCs in parallel - my Chinese oscilloscope does that to achieve higher performance at lower cost.

    Most multiband HF sdrs cover the range 2-30MHz. Some to 60MHz.

    I remember when PC based sdrs first appeared, that the mother-board sound chips were noticeably noisier than the high end sound cards from M-Audio and the like. Better than 16 bit, with the lowest s/n spec and highest dynamic range were the norm for best HF sdr, before high speed direct sampling took over.
  • Starred... this is an area I'm interested in myself, though primarily the ISM bands (I'm not licensed). There are a lot of transceiver chips like the CC1101 out there that can be used with the Prop for VHF/UHF, at least for digital stuff. I started a project a few months ago that uses one to sweep through a band and plot a crude waterfall based on the signal strength: http://forums.parallax.com/discussion/171460/p2-based-waterfall-display-for-ism-band-radios#latest
    Haven't done much with it lately...couldn't nail down the cause of the staircase effect on some signals.

    I'd love to see what can be done in HF though...every once in awhile I'll lurk on HF to see what parts of the world I can pick up here.
  • Maybe the P2 could interface to a Lime LMS8001+chip (100 kHz – 12 GHz)

    LMS8001 is a single chip up/down RF frequency shifter with continuous coverage up to 10 GHz.

    https://limemicro.com/technology/lms8001/

    Specs:https://limemicro.com/technology/

    It also has a companion board:

    LMS8001 Companion

    The LMS8001 Companion board provides a highly integrated, highly configurable, four-channel frequency shifter platform, utilising the LMS8001A integrated circuit.

    https://www.crowdsupply.com/lime-micro/limesdr-mini/updates/lms8001-companion-extends-coverage-to-10-ghz





  • bob_g4bbybob_g4bby Posts: 412
    edited 2020-07-19 22:48
    Just thinking where the P2 scores over the alternatives for sdr - there are so many cheap computing platforms about these days - why choose P2?:-

    1. Lower power for battery operation - run the cpu at the lowest clock speed the application needs. Avoid unnecessary activity within the chip
    2. Control of SPI devices such as RF oscillators, low noise A/D by SMARTPIN - no driver chip needed and reduces code size and retains full bus speed
    3. Measurement of V or I within radio requires no extra a/ds - for the transmitter o/p power meter
    4. Rotary user controls can be read - no extra parts and little software needed to read shaft encoders or analogue voltage thanks to SMARTPIN
    5. No external decoders for bandswitch selection needed - a massive 64 I/O pins to use!
    6. Saving radio settings - very good flash memory support
    7. Saving and playing signal recordings, station frequency lists, providing bespoke radio applications, upgrades, bug fixes - very good SD card support
    8. Microphone input - no external a/d needed
    9. Speaker output - no external d/a needed
    10. Choice of sample-by-sample or 2^n buffer by buffer dsp - no need for external RAM, internal 512kbyte hub ram means there is enough room for either
    11. Front panel display - no need for driver chips, good hardware / software support of many display types from 2 line LCD to hdmi
    12. P.C. link - USB / serial support from SMARTPIN to enable PC control, data transfer
    13. The above chip savings means a smaller pcb is required, so good for handheld devices
    14. Easy expansion - Two or more P2s can communicate over high speed serial link to share the radio processing or provide data decoding - SMARTPIN makes this trivial
    15. The cordic solver and other dsp related instructions bring fft and other time consuming dsp at audio bandwidths within reach on a low cost, low power platform
    16. Faster dsp - use of N multiple COGS linked in a signal 'pipeline' greatly reduces the time taken to process a signal buffer as compared with single core platforms like dspic33
  • PropGuy2PropGuy2 Posts: 360
    edited 2020-07-12 12:06
    bob_g4bby - This is exactly why I have been working (and waiting) for the P2 chip. I have already partnered with a super experienced RF engineer type. It is really a mixture of hardware and software, and a good processor. Noise in general, and phase stuff, is always a problem. Bottom line is that I may or may not use all the capabilities of the P2 chip. The reason being I want to make the coding as simple as possible, so other hams can modify the program to suit their needs. Sure there will be a robust operating system in place, but no tricky algorithms. I am planning on a 5 to 10 year run on the SSB / HF design I am working on. FYI - the reason I am NOT using DSP / SDR stuff is that these chips may or may not be in production in the long term. Tech advances and stuff goes End of Life - a problem I am having now with components made in China.
  • bob_g4bbybob_g4bby Posts: 412
    edited 2020-07-14 18:47
    Having read through the P2 forum a bit more about the internal a/ds, I stand by what I said about using an external stereo a/d for receiver input. An internal a/d will be far too noisy to encode incoming signals which range from microvolts to fractions of a volt, 0Hz to maybe as high as 192kHz. Here's a good paper that explains a bit about that http://gnarc.org/wp-content/uploads/2015/02/Software-Defined-Radio-SDR-for-Amateur-Radio-2015-02-11.pdf. An external a/d can be placed in a quiet area of groundplane located near the tayloe down-converter, heavily decoupled power, and away from the P2 environment.

    The P2 still reduces the parts count overall and promises some serious signal processing.
  • The Kahn method of generating single side band modulation is interesting because it allows the use of a Class E transmitter final stage (90% efficient) instead of a linear amplifier (50% efficient). This means no large, hot, heavy heatsink and 1/2 the power supply requirement. Lower cost fast switching transistors needed instead of RF linear types as well. Here's a paper on it

    The amplifier is driven by a frequency modulated signal and powered from an amplitude modulated source, which 'magically' combine to produce SSB. This forum is developing a small HF transceiver using that principle, albeit with a very poor choice of processor ;-)

    Why mention this in a P2 forum? The microphone input is passed through a Hilbert transform to produce in-phase and quadrature signal streams. These are converted to polar format (amplitude / phase): the amplitude signal then varies the power amplifier supply voltage. The phase signal is used to vary the frequency of an RF oscillator; most suitable are direct digital synthesis devices, which are capable of glitch free, rapid frequency changes, but it's possible things like the SI5351 would do. So the P2 cartesian to polar instruction in the cordic solver would speed the signal path significantly and the PWM feature of a smartpin would be useful for the power supply drive signal. Makes yer fink, dunnit?
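The cartesian-to-polar step can be sketched in plain Python as below. This is only an illustration of the maths, not P2 code: `math.hypot` and `math.atan2` stand in for the cordic solver, and the 16 ksample/s rate, the 1 kHz test tone and the function names are assumptions.

```python
import math

def iq_to_polar(i, q):
    """Cartesian I/Q pair -> (amplitude, phase in radians)."""
    return math.hypot(i, q), math.atan2(q, i)

def phase_to_frequency_offsets(phases, sample_rate_hz):
    """Turn successive phase values into instantaneous frequency offsets in Hz."""
    offsets = []
    for prev, cur in zip(phases, phases[1:]):
        # Wrap the phase step into [-pi, pi) so the +/-pi boundary causes no jump.
        step = (cur - prev + math.pi) % (2.0 * math.pi) - math.pi
        offsets.append(step * sample_rate_hz / (2.0 * math.pi))
    return offsets

FS = 16_000.0   # assumed audio sample rate
TONE = 1_000.0  # assumed single test tone from the microphone

# I/Q pair as it would look after an ideal quadrature split of a 1 kHz tone.
iq = [(0.8 * math.cos(2.0 * math.pi * TONE * n / FS),
       0.8 * math.sin(2.0 * math.pi * TONE * n / FS)) for n in range(32)]

polar = [iq_to_polar(i, q) for i, q in iq]
amplitudes = [a for a, _ in polar]  # would drive the amplifier supply voltage
offsets = phase_to_frequency_offsets([p for _, p in polar], FS)  # would steer the oscillator
```

For a single tone the amplitude stays constant and the oscillator is simply offset by the tone frequency, which is what a single sideband signal carrying one audio tone looks like.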
  • bob_g4bby wrote:
    The microphone input is passed through a hilbert transform to produce in-phase and quadrature signal streams.

    Actually, there's a method that works better than the Hilbert transform for producing quadrature signal components. I describe it here:

    https://forums.parallax.com/discussion/comment/1020855/#Comment_1020855

    It's based upon this paper by Clay S. Turner:

    https://www.iro.umontreal.ca/~mignotte/IFT3205/Documents/TipsAndTricks/AnEfficientAnalyticSignalGenerator.pdf

    -Phil
  • PropGuy2 wrote: »
    Actually I have been working on an HF / SSB radio design for the past year or so. I completely ran out of memory with the P1, and am now ramping up with the P2 chip. It is surprising how difficult and expensive it is to design an SSB rig. The software time and component cost rise quickly. This is especially true if you are planning a medium to large production run, mostly to compete with the ICOM, Kenwood and Elecraft radios that start at $2000, $3000 or even $4000. The radio I am working on will have a price point of between $400 and $500, using the P2 chip and traditional superheterodyne circuitry. To get there and to beat out the Chinese, it will be minimal: SSB & CW, 50 watts out and easy to use (think Heathkit and Knight-Kit back in the day). Would anyone buy a radio like this, or is this a pipe dream for today's hams, beginner or not?

    I'd be a buyer. Sounds like a great project and involves two of my favorite things, radios and Propellers! Please keep us informed, I'd love to follow along.

  • Thanks for the up vote Jonathan - Right now I want to become super familiar with the Smart Pin ADC & DAC functions. Likewise for a decent / clean / "beautiful" VGA tile driver. No one sees my exquisite PCB circuit, no one sees my agonizingly crafted SPIN2 code, everyone sees the enclosure and operator's screen. If anyone finds a VGA driver or any SPIN2 code snippets for ADC / DAC / SD / VGA / i2c / SPI that actually work no matter what, pass them along.
  • Download Pnut

    http://forums.parallax.com/discussion/171196/pnut-spin2-latest-version-v34u-debugger-improved/p1

    It has

    ADC_to_VGA_millivolts.spin2

    VGA_1280x1024_text_160x85.spin2

    vga_text_demo.spin2

    Plus others
  • bob_g4bbybob_g4bby Posts: 412
    edited 2020-12-10 20:35
    Some questions on using the cordic engine in assembly language or TAQOZ - I see a cog can issue a cordic instruction, and 55 clocks later, the answer can be read with GETQX and GETQY. Those instructions wait for the results if the engine is still crunching data, so do some other stuff meanwhile, to not waste cpu time.

    When I eventually get a P2, I'd like to use multiple cogs, arranged in a signal chain, each doing a part of the dsp needed for software radio. I can see that the cordic engine would be useful in many parts of the signal path.

    1. Can the cordic engine be shared by cogs?
    2. Can cordic instructions from different cogs be interleaved?
    3. Is the result automatically routed back to the caller as if it alone were using the engine?
    4. How can the program sense the % utilisation of the engine?

    Cheers, Bob
  • Cluso99Cluso99 Posts: 18,069
    edited 2020-12-10 22:45
    Bob,
    1 & 2. From my understanding, a new cordic instruction can be issued on every clock. However, as each cog really only gets a slot every 8 clocks, a cog can only issue a cordic instruction every 8 clocks, although the other cogs can intersperse cordic instructions between these.

    3. yes
    4. only by software, but no need to do this
  • Thanks Cluso, that's very interesting. I can start planning some more, bearing in mind those limits.
  • Phil,

    The P1 wasn't fast enough to receive WWV, but what about the P2?

    I would love to be able to calibrate timers such as the mechanical stopwatch I just got.
  • bob_g4bby wrote: »
    1. Can the cordic engine be shared by cogs?
    2. Can cordic instructions from different cogs be interleaved?
    3. Is the result automatically routed back to the caller as if it alone were using the engine?
    Yes to all. As long as you don't start more than one instruction at a time per cog everything is totally transparent. The cog/CORDIC interface automatically inserts waitstates until the CORDIC unit is ready to accept a new command or deliver the result.

    As Cluso already said each cog gets access to the CORDIC every 8th clock cycle. So it doesn't matter how many other cogs use it. It won't slow down computation.

    You only have to be careful if you interleave CORDIC commands on the same cog, I mean if you start another computation before the first (of the same cog) is finished. This is possible but you have to disable interrupts to avoid losing results due to a pipeline "traffic jam".
    4. How can the program sense the % utilisation of the engine?
    Might be possible but I don't know how. But theoretically utilisation could be analyzed at compile time.
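A back-of-envelope sketch of the per-cog limit discussed in this reply, assuming one CORDIC slot per cog every 8 clocks (as stated in the thread) and the 96 ksample/s I-Q rate mentioned elsewhere in the discussion. The function name is hypothetical.

```python
def cordic_ops_per_sample(clock_hz, sample_rate_hz, clocks_per_slot=8):
    """Upper bound on CORDIC commands one cog can issue per incoming sample."""
    clocks_per_sample = clock_hz // sample_rate_hz
    return clocks_per_sample // clocks_per_slot

ops_180 = cordic_ops_per_sample(180_000_000, 96_000)
ops_300 = cordic_ops_per_sample(300_000_000, 96_000)
print(ops_180, ops_300)
```

This is only a ceiling: it ignores the result latency and any instructions the cog spends moving data, so a real signal path would achieve less.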
  • bob_g4bbybob_g4bby Posts: 412
    edited 2020-12-11 15:04
    The signal chain of cogs in a software radio will be doing external a/d data acquisition, iq balancing, correlation based passband filtering and frequency shifting, automatic gain control, noise reduction, auto notch filtering, demodulation and smartpin d/a conversion for speakers, smartpin a/d conversion for microphone + others I've forgotten. Each cog will probably end up doing 1-3 of the above stages. My preference is to write the software in TAQOZ, with each signal step being a separate word written in assembler so they can be easily tested stage by stage from the system terminal (test signals being produced and detected by Labview programs on the host PC). The non-signal path part of the radio can be written in high level Taqoz, as it's not time critical. So as to spend as much time in assembler code and as little time as possible in 10x slower, high level Taqoz, it's essential to process the incoming signal stream in buffers, not as individual samples. The buffer size would be normally anything between 512 and 4096 samples, but bearing in mind the 512 kbyte ram limit. So cordic engine usage by any one cog would be simple - process a buffer through step n, then process the result through step n+1 and so on. In other words, a linear stream of cordic instructions. I foresee the cordic engine being run at full capacity - and it will influence which sdr methods are used, because it's much faster at what it does than a cog, so use it whenever you can.

    Other basic thoughts - the P2 is specced at 180MHz but the prototype version C has been overclocked by several folks to 300MHz. Assuming the incoming I-Q signal stream to be 96 ksamples/s max, the number of (typical) 2-clock instructions per I-Q sample pair would be 937 instructions @ 180MHz and 1562 instructions when pushed to 300MHz.
    With (say) a 2048 long buffer, each cog has to perform what it needs to do in 1920 kilo instructions @ 180MHz clock to 3200 kilo instructions @ 300MHz. Some initial tests in performing an FFT on an I-Q buffer would give some idea as to whether the P2 is up to the type of sdr I know most about, which typically runs on a PC. This sounds like a tall order - but sdrs running on PCs typically use only 10 - 15% of the cpu resource, to avoid gaps in reception due to Windows not being a real-time OS. However, the P2 can be run at very nearly full load if no interrupts are used in the signal path. There are plenty of examples of working sdrs on DSPIC33 processors - these are less powerful than the P2, so the P2 is likely to be up to the task and then some, I suspect.
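The budget figures above can be reproduced with a few lines of Python. This is just the arithmetic from the post, assuming every instruction takes 2 clocks; the function names are made up for the example.

```python
def instructions_per_sample(clock_hz, sample_rate_hz, clocks_per_instruction=2):
    """Instructions one cog can execute in the time between two I-Q sample pairs."""
    return clock_hz / sample_rate_hz / clocks_per_instruction

def instructions_per_buffer(clock_hz, sample_rate_hz, buffer_len, clocks_per_instruction=2):
    """Instructions one cog can execute while a buffer of `buffer_len` samples fills."""
    return instructions_per_sample(clock_hz, sample_rate_hz, clocks_per_instruction) * buffer_len

SAMPLE_RATE = 96_000
BUFFER_LEN = 2048

per_sample_180 = instructions_per_sample(180_000_000, SAMPLE_RATE)
per_sample_300 = instructions_per_sample(300_000_000, SAMPLE_RATE)
per_buffer_180 = instructions_per_buffer(180_000_000, SAMPLE_RATE, BUFFER_LEN)
per_buffer_300 = instructions_per_buffer(300_000_000, SAMPLE_RATE, BUFFER_LEN)
buffer_fill_ms = BUFFER_LEN / SAMPLE_RATE * 1000.0

print(per_sample_180, per_sample_300, per_buffer_180, per_buffer_300, buffer_fill_ms)
```

The last figure is the time a 2048-sample buffer takes to fill at 96 ksamples/s, roughly 21 ms, which is the deadline each stage of the pipeline has to meet.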

    Because the complete application may end up pushing the P2 pretty hard, assume it will get warm. Choose a development board with higher power regulators, good decoupling and a decent 'heatplane'.




  • bob_g4bby wrote: »
    Other basic thoughts - the P2 is specced at 180MHz but the prototype version C has been overclocked by several folks to 300MHz. Assuming the incoming I-Q signal stream to be 96 ksamples/s max, the number of (typical) 2-clock instructions per I-Q sample pair would be 937 instructions @180MHz and 1562 instructions when pushed to 300MHz.
    With (say) a 2048 long buffer, each cog has to perform what it needs to do in 1920 kilo instructions @ 180MHz clock to 3200 kilo instructions @ 300MHz. Some initial tests in performing an FFT on an I-Q buffer would give some idea as to whether the P2 is up to the type of sdr type I know most about, which typically runs on a PC.

    Sample FFT code, not fully optimized:
    https://forums.parallax.com/discussion/170948/1024-point-fft-in-79-longs
  • Very interesting, TonyB - 79 longs! Compact. I'll give it a good looking at, thank you.
  • bob_g4bbybob_g4bby Posts: 412
    edited 2021-05-10 06:25

    How to stream audio data back and forth between LabView on a Windows PC and TAQOZ on the P2? Well, the terminal serial port 'only' runs at 921600 baud, which isn't fast enough for realtime.
    In the 'good old days' experimenters used to use the 8 bit 'parallel printer port' to drive external logic, it was quite simple to do. Nowadays, Windows tends to prohibit such shenanigans.

    I had bought a PCIe parallel port card a while ago, but my favourite language Labview 2020 no longer supports reading and writing to the bare parallel port. So I wrote my own LabView interface:-

    1. install the parallel port card driver. MAKE SURE THE CARD CHOSEN WORKS AT THE LOGIC LEVELS TO SUIT YOUR PROJECT - 3.3V in my case.
    2. install this parallel port tester with this dll needed
    3. check the card is functioning with the parallel port tester by measuring the outputs on the D type connector pins - the output levels were actually 3.3V as claimed - very important so you don't fry a P2 or whatever
    4. I wrote two calls to the dll from Labview to output or input a byte. The outputs appear at the base address, inputs appear on base address+1, base address+2

    The two vis I wrote are attached, together with a demo that just toggles the outputs slowly. LabView 2020 Community edition is free from National Instruments for non-commercial use - the quickest way to write a measurement and control program I know - and it's not crippled in any way.

    Postscript: LabView isn't fast enough. By bit-banging, I could toggle an output at around 30kHz at best, so not quite up to audio sample rate speed. Nevertheless, it might prove useful sometime. Instead of handing off bytes to the dll one at a time, passing a pointer to a buffer of data would probably do it, with the dll (modified to handle arrays) being written in a faster language. The source code is available.
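As a rough check on the serial-link remark at the top of this post, the sketch below compares the link's payload rate with the rate of an I-Q stream. It assumes 8N1 framing (10 bits on the wire per byte) and 24-bit samples sent as 3 bytes each; both are assumptions, as are the function names.

```python
def serial_payload_bytes_per_s(baud, bits_on_wire_per_byte=10):
    """Payload bytes per second, assuming 8N1 framing (start + 8 data + stop bits)."""
    return baud / bits_on_wire_per_byte

def iq_stream_bytes_per_s(sample_rate_hz, bytes_per_sample=3, channels=2):
    """Bytes per second for an I-Q stream of 24-bit samples."""
    return sample_rate_hz * bytes_per_sample * channels

link = serial_payload_bytes_per_s(921_600)
need_16k = iq_stream_bytes_per_s(16_000)
need_96k = iq_stream_bytes_per_s(96_000)
print(link, need_16k, need_96k)
```

Under these assumptions even a 16 ksample/s I-Q stream slightly exceeds what the terminal port can carry, and 96 ksamples/s needs about six times more.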

  • bob_g4bbybob_g4bby Posts: 412
    edited 2021-05-08 10:51

    The subject of scheduling of tasks in a radio is a pretty fundamental issue. In the past, with a 1 or 2 core processor it was pretty easy to partition the work. Now the P2 (and other micros) with more than 2 cores make the best approach less obvious.

    A software radio processes a stream of IQ data from the receiver front end or the microphone (or its equivalent). A sample of IQ data must be processed every X µs, or a buffer of IQ data must be processed every Y ms. My friend Ron G4GXO, with a dual core DSPIC33, fixed the allocation of radio tasks. Core 1 did all the signal processing, core 2 did all the non-signal processing. This was so that Core 1 completed the signal processing in time. If it did not, the radio would be useless. So - a radio is a mixture of tasks that must complete in a timeframe and tasks that don't have to complete in that timeframe, but must complete with varying levels of urgency.

    In using an 8 core processor like the P2, the same approach as with the DSPIC33 could be taken, a fixed task allocation scheme. N cores must complete the signal path chores in time for the next 'drum beat', 8-N cores cycle round a number of non-real time tasks as fast as they can.

    The question is: Would a dynamic (but keep it simple) scheduling mechanism be 'better' than static allocation of tasks? I read a little about scheduling in this paper

    Scheduling of tasks perhaps is not the key issue: What really matters with a radio is the scheduling of data and when it is ready for the next step. The computation of the signal through a number of stages in the right order in the small space that is the on-board ram.

    I like using Taqoz - it's good for experimentation. I wonder idly if all N 'free' cores could be made to check a task table and, like a well organised team, pick up, execute and tick off the tasks in a self-balancing fashion, all handling both time critical and non-critical tasks? If this allowed the user to continue to interact with Taqoz too, that would allow the radio operation to be inspected as it was running.
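The self-balancing task table idea can be sketched in Python as follows. This is purely a model of the scheme: Python threads stand in for cogs, a `threading.Lock` stands in for a hardware lock, and the task names, priorities and class name are all hypothetical.

```python
import threading

class TaskTable:
    """Shared table of pending tasks; workers claim the most urgent one left."""

    def __init__(self, tasks):
        # tasks: list of (priority, name, callable); a lower number is more urgent.
        self._pending = sorted(tasks, key=lambda t: t[0])
        self._lock = threading.Lock()
        self.done = []  # (worker_id, task_name) in completion order

    def claim(self):
        """Atomically remove and return the next task, or None if the table is empty."""
        with self._lock:
            return self._pending.pop(0) if self._pending else None

    def tick_off(self, worker_id, name):
        with self._lock:
            self.done.append((worker_id, name))

def worker(table, worker_id):
    while True:
        task = table.claim()
        if task is None:
            return
        _, name, fn = task
        fn()
        table.tick_off(worker_id, name)

def run(tasks, n_workers):
    table = TaskTable(tasks)
    threads = [threading.Thread(target=worker, args=(table, w)) for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.done

# Hypothetical mix of time-critical (priority 0) and background tasks.
tasks = [
    (0, "agc", lambda: None),
    (0, "demodulate", lambda: None),
    (1, "update_display", lambda: None),
    (2, "save_settings", lambda: None),
]
done = run(tasks, n_workers=3)
```

Which worker ends up with which task varies from run to run; the only guarantee in this sketch is that every task is claimed exactly once and urgent tasks are offered first.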

  • bob_g4bbybob_g4bby Posts: 412
    edited 2021-05-16 13:23

    Multiple cores sharing a common set of data acting together to process a time-critical signal path implies:-

    1. The cores that work together to achieve the signal path will have to be regularly resynchronised with each other, so that signals are processed in the right order - e.g. array variables are likely to be reused over and over again to economise on space.
    2. If more than one core is writing to a variable, then that variable will need protection from any race condition, so that all processors always agree on the new value

    Reading the Propeller 2 Documentation, both of the above may be helped by using the up to 16 Locks (sometimes called semaphores) described on page 67.

    To economise on the limited number of Locks available, if there are many shared variables, then these can be grouped, each group being protected by one Lock. Alternatively a section of code could be locked, so that a common resource was assigned to one cog until released.

    Regular synchronisation of cores could be achieved by setting the required bits in a 'Rendezvous' byte for the cores required to sync. Each bit of the byte would signal that the corresponding core had reached the synchronising 'rendezvous' in its program (core 0 resets bit 0 etc.). Once the byte was all zero, then all cores would set off running their programs - a 'starting gun' or 'drum beat' kind of signal. This is a feature I have pinched from the LabView language, in which multiloop programs are very easy to write, but where needed the rendezvous feature keeps 'em in sync.
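The bookkeeping for the rendezvous byte might look like the sketch below. It only models the bit manipulation; on real hardware the read-modify-write of the shared byte would itself need protecting (for example with one of the Locks mentioned above). The function names and the choice of cores 1, 2 and 4 are hypothetical.

```python
def arm(cores):
    """Set one bit per core that must reach the rendezvous."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask

def arrive(mask, core):
    """A core clears its own bit when it reaches the rendezvous."""
    return mask & ~(1 << core) & 0xFF

def all_arrived(mask):
    """The 'starting gun': every required core has checked in."""
    return mask == 0

mask = arm([1, 2, 4])  # cores 1, 2 and 4 take part in the signal path
history = [mask]
for core in (2, 4, 1):  # arrival order does not matter
    mask = arrive(mask, core)
    history.append(mask)

print([bin(m) for m in history], all_arrived(mask))
```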

    Taqoz does have COGATN and POLLATN, but doesn't provide the Lock or WAITATN feature - but using the Assembler, they were easy to add

  • bob_g4bbybob_g4bby Posts: 412
    edited 2021-05-16 13:20

    My sdr will have a direct conversion receiver with a tayloe detector that delivers an IQ analogue signal in the band 0Hz-~100kHz. After a LP filter, a cheap PCM1808 (ebay has 1000s) stereo codec outputs an IQ digital stream, 24 bits per sample. The PCM1808 requires up to a 20 MHz clock, which could be provided by a smartpin or one channel of an SI5351 clock generator. The PCM1808 data interface is I2S, maybe in master mode with a smartpin receiver, with a sample rate somewhere in the range 16 - 100ks/s. The stream will be stored in two buffers, which alternate, one for writing by the i2s receiver, one for reading by the receiver dsp. Once an array is full or maybe full enough, ATN flag(s) would be raised to start waiting cogs.
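The alternating two-buffer scheme can be modelled in a few lines of Python. This is a sketch of the bookkeeping only: the class name is hypothetical, and the value returned from `push` stands in for raising an ATN flag to the waiting cogs.

```python
class PingPongBuffer:
    """Two equal buffers: one is filled by the receiver while the other is read by the dsp."""

    def __init__(self, length):
        self.buffers = [[0] * length, [0] * length]
        self.write_index = 0  # which buffer the receiver is currently filling
        self.fill = 0         # number of samples already in that buffer

    def push(self, sample):
        """Store one sample; return the index of a buffer that just became full, else None."""
        self.buffers[self.write_index][self.fill] = sample
        self.fill += 1
        if self.fill < len(self.buffers[self.write_index]):
            return None
        full = self.write_index
        self.write_index ^= 1  # swap roles: start filling the other buffer
        self.fill = 0
        return full

pp = PingPongBuffer(4)
ready = []
for sample in range(10):
    full = pp.push(sample)
    if full is not None:
        # On the P2 this is where the waiting cogs would be signalled.
        ready.append((full, list(pp.buffers[full])))

print(ready)
```

The dsp side must finish with a buffer before the receiver wraps round to it again, which is the same deadline as the buffer fill time.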

    The cog handling the receiver input i2s may also have time to do microphone and speaker channels via smartpin a/d and pcm converters.

    Thinking forward to writing dsp words in inline assembler within Taqoz, it's probably preferable to store I and Q samples in adjacent array addresses. Since indirect addressing is used to index arrays, this means the use of just increment or decrement to access an IQ sample. Array size also tbd. From experience with PC based sdr, the longer the array, the lower the cpu load. However hub ram is limited, so that will limit the maximum array size and there will be much use of [A]=function([A],[B]) to economise on space. If (say) 1024 IQ samples are stored per array, then each is 8kbytes in size.
    Not sure what use Taqoz makes of LUT RAM. If this is unused, it's useful extra space and could be set up as a pipeline between cogs. LUT is too small for IQ arrays but would be useful for local variables, or constants like filter responses. As LUT ram totals 4k longs across 8 cogs and it's faster than hub ram, it's a good resource to tap into.
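The memory figures in this post can be checked with the arithmetic below, assuming each I and each Q value occupies one 32-bit long. The count of arrays that fit in hub ram is an upper bound only, since Taqoz and the application code share the same 512 kbytes.

```python
BYTES_PER_LONG = 4
HUB_RAM_BYTES = 512 * 1024
LUT_LONGS_PER_COG = 512
COGS = 8

def iq_array_bytes(n_iq_samples):
    """Size of an array holding interleaved I and Q longs."""
    return n_iq_samples * 2 * BYTES_PER_LONG

array_bytes = iq_array_bytes(1024)
max_arrays = HUB_RAM_BYTES // array_bytes
lut_longs_total = LUT_LONGS_PER_COG * COGS

print(array_bytes, max_arrays, lut_longs_total)
```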

    A timing plan will be essential to divide the signal processing up between the minimum number of cogs, the task being a little like using a project management tool to minimise project time. The tool predicts when each task has to start and all the dependencies feed into the critical path to meet the end time. ( a little before the next array of codec samples is ready ) Another analogy is those patterns of upright dominoes that you used to set up and knock down - it was all self-sequencing.
