Dark Silicon

Dave Hein · 2015-04-09 14:13

I was surprised to hear from Chip's update that the DE2-115 will only support about 12 cogs. So it seems like the new FPGA board is needed to implement a complete P2 with 16 cogs and 512K RAM.

Ken Gracey · 2015-04-09 16:47

jmg wrote: »

I doubt the FPGA board was ever there as a revenue stream - the purpose of that is to prove P2 designs before they run silicon, as well as seed future sales.

This is absolutely correct. I've been involved in several FPGA boards at Parallax. This board is an expense to Parallax and purely exists to test out the P2, to support early adopters and tool developers.

If you want to run the numbers, we've had this particular product in development for a year now. You can safely say we're in $100,000 before the first sellable FPGA board comes off the PnP. As a hobbyist or engineer who may work on your own, you may wonder how a product could possibly cost so much to develop. Well, it does. Once the steps are listed out it becomes very clear.

You can reverse engineer the BOM cost and you will see that the parts are probably $350. Add in manufacturing labor of $50 and you'll be at $400. If we sold each one at $500 and had a gross profit of $100/board, we would need to sell 1,000 boards simply to recover our investment. Realistically we will sell 50-100 of these boards.
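The break-even arithmetic above can be checked with a few lines of Python. This is only a sketch using the round numbers quoted in the post ($100,000 development cost, $350 parts, $50 labor, a hypothetical $500 price); the function names are made up for illustration.

```python
import math

def break_even_units(development_cost, unit_price, unit_cost):
    """Boards that must be sold before gross profit covers development cost."""
    margin = unit_price - unit_cost
    if margin <= 0:
        raise ValueError("unit price must exceed unit cost")
    return math.ceil(development_cost / margin)

def recovered(units_sold, unit_price, unit_cost):
    """Gross profit earned on a given number of boards."""
    return units_sold * (unit_price - unit_cost)

# Figures taken from the post; the $500 price is the post's own "if we sold" example.
DEV_COST = 100_000
UNIT_COST = 350 + 50   # parts + manufacturing labor
PRICE = 500

print("break-even units:", break_even_units(DEV_COST, PRICE, UNIT_COST))
for sold in (50, 100):  # the realistic sales range given in the post
    print(sold, "boards recover $", recovered(sold, PRICE, UNIT_COST))
```

At the stated 50-100 boards, this recovers $5,000 to $10,000 of the $100,000.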

As you clearly can see for yourself, the purpose of this board is not to generate revenue or profit.

The real benefit is to bring our development team (you, you and the rest of you) closer to our design. We can start the tools early, validate the Verilog before production, and catch any other surprises which are far more expensive if there's a failed fabrication run.

Ken Gracey

evanh · 2015-04-09 17:01

Thanks Ken, that does make things clear. It looks like a very nice board from the description, and I hadn't considered anyone's costs or final pricing. Like the DE2-115, it isn't likely to sell many units just for hobbyist experimenting.

jmg · 2015-04-09 17:33

Ken Gracey wrote: »

Realistically we will sell 50-100 of these boards.

Makes sense as a short-term goal, but you may find a little upside on that volume, as:
* This uses the largest Cyclone V that Quartus II Web Edition can support (from what I could find at Altera)
* Chip reports quite fast download times
* There are teaching and lab/development situations where those two items alone will matter.

I'm not sure if Altera give better prices for Development Board suppliers, but some of the prices suggest they do.
Could be worth visiting Altera when this is closer to full release to find out.

jac_goudsmit · 2015-04-10 07:28

cgracey wrote: »

I hope that in two months' time, I'll have an FPGA image ready. We are making a final PCB for our Cyclone V -A7 board now. We've already proven it and developed its loader which uses 2Mbps FTDI USB serial talking to a Prop 1. It loads about 35x-70x faster than Quartus' built-in programmer (3 seconds to load straight into the FPGA, 6 seconds to load into flash for cold booting).

If I may ask, how long does it take to compile a Cyclone V -A7 FPGA image for the P2?

At least for the P1V code, it seems to me that there is a ridiculous difference in compile time between Cyclone IV and Cyclone V (about 3 times as long?). If I would be working with FPGA every day, that would make me avoid the Cyclone V like the plague and develop with the Cyclone IV until I ran out of space...

===Jac

cgracey · 2015-04-10 13:19

jac_goudsmit wrote: »

If I may ask, how long does it take to compile a Cyclone V -A7 FPGA image for the P2?

At least for the P1V code, it seems to me that there is a ridiculous difference in compile time between Cyclone IV and Cyclone V (about 3 times as long?). If I would be working with FPGA every day, that would make me avoid the Cyclone V like the plague and develop with the Cyclone IV until I ran out of space...

===Jac

I'm hoping that this is not the case. I've done small compiles lately for Cyclone V test code and it seemed to compile as fast as Cyclone IV. I have not done a big compile in a while for Cyclone V, but I'm hoping that Altera fixed whatever the problem had been for slow Cyclone V compilation. You're right - it's maddening to wait for so long for a compile. We need the Cyclone V, though, for its logic and memory capacity. It's the least-expensive way to get what we need.

Cluso99 · 2015-04-10 17:45

Ken Gracey wrote: »

This is absolutely correct. I've been involved in several FPGA boards at Parallax. This board is an expense to Parallax and purely exists to test out the P2, to support early adopters and tool developers.

If you want to run the numbers, we've had this particular product in development for a year now. You can safely say we're in $100,000 before the first sellable FPGA board comes off the PnP. As a hobbyist or engineer who may work on your own, you may wonder how a product could possibly cost so much to develop. Well, it does. Once the steps are listed out it becomes very clear.

You can reverse engineer the BOM cost and you will see that the parts are probably $350. Add in manufacturing labor of $50 and you'll be at $400. If we sold each one at $500 and had a gross profit of $100/board, we would need to sell 1,000 boards simply to recover our investment. Realistically we will sell 50-100 of these boards.

As you clearly can see for yourself, the purpose of this board is not to generate revenue or profit.

The real benefit is to bring our development team (you, you and the rest of you) closer to our design. We can start the tools early, validate the Verilog before production, and catch any other surprises which are far more expensive if there's a failed fabrication run.

Ken Gracey

Ken,
I am pleased you posted this. As you know, I am well aware of design costs involved.
However, most people (hobbyists and professionals who haven't been involved in semi-complex developments) have no understanding of the costs involved.

Hopefully, the FPGA board will get enough exposure for others to see how powerful it is, and maybe add another market to sell more boards.

Bill Henning · 2015-04-10 20:27

Sounds great!

Is there one lookup for the whole P2, or one per group of X pins?

I am looking forward to the new FPGA images... my DE2-115 and DE0-Nanos are dusty.

All that memory bandwidth and hubexec will be a blast to play with!

cgracey wrote: »

Things are coming together well for the Prop 2.

Last night I got the new transfer/DDS block finished and I hooked it into the cog. It shuttles bytes/words/longs of I/O pin data to/from hub RAM at up to 32 bits per clock. It also drives DACs at those rates and performs DDS/Goertzel operations. It uses a 256x32 look-up RAM for outputting pixel-type and DDS/Goertzel data. All cogs can utilize the full bandwidth without affecting the others. I need to thoroughly test it now and then get onto the next things: hub execution (not much code, but challenging to think about), hub-based CORDIC (straightforward), and smart pins (not hard, but rather open-ended).

I hope that in two months' time, I'll have an FPGA image ready. We are making a final PCB for our Cyclone V -A7 board now. We've already proven it and developed its loader which uses 2Mbps FTDI USB serial talking to a Prop 1. It loads about 35x-70x faster than Quartus' built-in programmer (3 seconds to load straight into the FPGA, 6 seconds to load into flash for cold booting). Our -A7 board will support all 16 cogs and 512KB hub RAM. The DE2-115 will fit ~12 cogs and 256KB hub RAM. All this memory and I/O bandwidth, plus hub exec, is going to be really fun to work with.

Bill Henning · 2015-04-10 20:31

Ken,

Trust me, I understand the costs involved.

Personally, I will stick with my DE2-115, DE0-Nanos, and Be-CVs...

I do believe wifey would kill me (slowly, painfully) if I went "Honey... I need another FPGA board for developing for the P2"

Bill

Ken Gracey wrote: »

This is absolutely correct. I've been involved in several FPGA boards at Parallax. This board is an expense to Parallax and purely exists to test out the P2, to support early adopters and tool developers.

If you want to run the numbers, we've had this particular product in development for a year now. You can safely say we're in $100,000 before the first sellable FPGA board comes off the PnP. As a hobbyist or engineer who may work on your own, you may wonder how a product could possibly cost so much to develop. Well, it does. Once the steps are listed out it becomes very clear.

You can reverse engineer the BOM cost and you will see that the parts are probably $350. Add in manufacturing labor of $50 and you'll be at $400. If we sold each one at $500 and had a gross profit of $100/board, we would need to sell 1,000 boards simply to recover our investment. Realistically we will sell 50-100 of these boards.

As you clearly can see for yourself, the purpose of this board is not to generate revenue or profit.

The real benefit is to bring our development team (you, you and the rest of you) closer to our design. We can start the tools early, validate the Verilog before production, and catch any other surprises which are far more expensive if there's a failed fabrication run.

Ken Gracey

jmg · 2015-04-10 20:39

Bill Henning wrote: »

Is there one lookup for the whole P2, or one per group of X pins?

Based on earlier postings, and the "I hooked it into the cog" remark, the 256x32 look-up RAM is per COG.
There are use cases where multiple streamer engines would be needed, like reading from fast serial memory and outputting that (wider, and via the table) to an LCD.
Another table could be driving cos/sine micro-step data to a stepper motor via 2 DACs.
The decision on whether all 16 COGs get this may come later, if die room is tight. 8 or 12 channels could also be OK?
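To make the table-driven idea concrete, here is a minimal software sketch of a DDS phase accumulator indexing a 256-entry table to produce a sine/cosine pair, as one might feed two DACs for micro-stepping. It is not the P2 streamer's actual design: the 32-bit accumulator, the 8-bit DAC codes, and all names are assumptions made for illustration.

```python
import math

TABLE_SIZE = 256      # matches the 256-entry look-up depth mentioned in the thread
INDEX_BITS = 8        # log2(TABLE_SIZE)
PHASE_BITS = 32       # assumed accumulator width
DAC_MAX = 255         # assumed 8-bit unsigned DAC codes

# One full sine cycle, offset and scaled into unsigned DAC codes.
SINE_TABLE = [
    round((math.sin(2 * math.pi * i / TABLE_SIZE) + 1) / 2 * DAC_MAX)
    for i in range(TABLE_SIZE)
]

def tuning_word(f_out, f_clk):
    """Phase increment per clock that yields f_out from a clock of f_clk."""
    return round(f_out / f_clk * (1 << PHASE_BITS))

def dds_samples(tuning, count):
    """Return `count` (sine, cosine) DAC code pairs, one per clock."""
    phase = 0
    mask = (1 << PHASE_BITS) - 1
    samples = []
    for _ in range(count):
        index = phase >> (PHASE_BITS - INDEX_BITS)     # top 8 bits pick the entry
        sin_code = SINE_TABLE[index]
        # Cosine is the same table read a quarter cycle ahead.
        cos_code = SINE_TABLE[(index + TABLE_SIZE // 4) % TABLE_SIZE]
        samples.append((sin_code, cos_code))
        phase = (phase + tuning) & mask                # wrap at 2**PHASE_BITS
    return samples

# Example: output frequency at 1/16 of the clock, one full cycle of samples.
for pair in dds_samples(tuning_word(1, 16), 16):
    print(pair)
```

Changing the tuning word changes the output frequency without touching the table, which is why one small look-up RAM can serve many rates.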

Bill Henning wrote: »

I do believe wifey would kill me (slowly, painfully) if I went "Honey... I need another FPGA board for developing for the P2"

That's easy - just call it something else

evanh · 2015-04-10 20:42

Lol, best to avoid such slippery situations.

mindrobots · 2015-04-10 20:47

Bill Henning wrote: »

"Honey... I need another FPGA board for developing for the P2"

Be honest, don't mislead her! You have the boards you need to develop for the P2, the FPGA 1-2-3 is so you can develop for the P3, you are just purchasing it early to avoid not being able to get one or the impending cost increases.

P.S. Let us know how that goes!

David Betz · 2015-04-11 04:47

mindrobots wrote: »

Be honest, don't mislead her! You have the boards you need to develop for the P2, the FPGA 1-2-3 is so you can develop for the P3, you are just purchasing it early to avoid not being able to get one or the impending cost increases.

P.S. Let us know how that goes!

P3? Do we really think that an FPGA board produced now will be what we'll want for a chip that isn't likely to even be started for another year or two? Won't technology have moved significantly in that time as well as expectations for what should be in a P3? Likely, the 1-2-3 board won't be able to fit the entire P3 just like the DE2-115 board now won't fit all of P2.

Heater. · 2015-04-11 05:19

Good point. At the current rate of progress the P3 is at least 10 years away. By that time we will be expecting thousands of processors per device.

mindrobots · 2015-04-11 05:53

Heck guys, I was just tryin' to help a fella out in getting something approved by the exchequer!

I'm hoping the P3 needs a big FPGA with an ARM SOC onboard!

Heater. · 2015-04-11 06:30

It's an investment in the future well being of the family. No really...

I'm hoping that RISC V has usurped the ARM in a decade's time.

Bill Henning · 2015-04-11 08:20

jmg:

256x32 per cog would be great, but I think 2 - 4 LUTs would be enough if it is tight.

She knows what an FPGA is...

evanh:

Yep!

Rick:

LOL! She would say P2 is not here yet, no reason to buy for P3.

Heater:

I read about RISC V ... sounds interesting.

All:

Thanks guys!

Ramon · 2015-04-11 21:12

David Betz wrote: »

Likely, the 1-2-3 board won't be able to fit the entire P3 just like the DE2-115 board now won't fit all of P2.

The main concern should be whether the P2 Verilog will fit on a 7x7 mm square at 180 nm geometry, NOT whether the P2 Verilog will fit on a 28x28 mm Cyclone V (A7) at 28 nm! If we don't keep this in mind, we will end up right back at the point where Chip said: "hey guys, it looks like the IC gets really hot and takes 5 watts, so now we are thinking of BGA ...". ($0.02)

Mike Huselton · 2015-05-08 10:35

cgracey wrote: »

David Betz sent me some things to read about the RISC V project that were interesting. Linking from those resources, I started reading about what has been termed "Dark Silicon", which is an outcome of processes shrinking faster than energy-per-function, so you can practically only clock a portion of your silicon at full speed, due to heat buildup or sheer power constraints. In 8 years' time, only 1/16 the area of a leading-edge chip will be able to clock at full speed. This leads to interesting future design considerations. Here is a paper about it:

http://darksilicon.org/papers/papers/taylor_dark_silicon_horsemen_dac_2012.pdf
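The 1/16 figure is consistent with a simple scaling model: if each process generation doubles the transistors in a given area while energy per switching event barely improves, a fixed power budget can run only half as large a fraction of the chip at full speed each generation. The sketch below assumes that halving and a two-year generation cadence; both are simplifying assumptions, not numbers stated in the post.

```python
from fractions import Fraction

UTILIZATION_DROP_PER_GENERATION = 2   # assumed: usable fraction halves each generation
YEARS_PER_GENERATION = 2              # assumed process cadence

def active_fraction(years):
    """Fraction of the chip that can run at full speed after `years`."""
    generations = years // YEARS_PER_GENERATION
    return Fraction(1, UTILIZATION_DROP_PER_GENERATION ** generations)

for years in (0, 2, 4, 6, 8):
    frac = active_fraction(years)
    print(f"{years} years: {frac} active, {float(1 - frac):.2%} dark")
```

Under these assumptions, four generations (8 years) leave 1/16 of the area active, which means 93.75% of the silicon is dark.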

I'm BAAACK! I need my Parallax fix, bad. Thanks for the Dark Silicon tip. Tech articles assumed you were hip to this stuff. Only the cool kids, please, everybody else stand back.

Mike Huselton · 2015-05-08 10:46

Stacked chips (3D) with graphene cooling AND RISC V. Now we're talking. Military is currently using this tech; 3 to 5 years for commercial apps.

I will try to find the DARPA announcement (I hope this is not delusional thinking again, tehee...)

Heater. · 2015-05-09 05:39

Mike Huselton,

Stacked chips (3D) with graphene cooling AND RISC V. Now we're talking. Military is currently using this...

That might remain "dark silicon" within the military for a while.

In the meantime, lowRISC http://www.lowrisc.org/ is planning to get its 64-bit RISC V chips into our hands on dev boards within a year. Complete with little COG-like "minion" cores for peripheral interfacing.

koehler · 2015-07-05 21:44

Heater,

Take a look at http://www.lowrisc.org/ again. All sorts of cores appear to be coming along, some only days away apparently. Would be interesting to see if someone does a lowRISC vs P1 comparison on an FPGA.

Heater. · 2015-07-05 22:12

koehler,
Thanks for the heads up there. I had forgotten to check back on how RISC V is doing.

koehler · 2015-07-06 04:59

Can't believe it's been close to a year. There seems to be one heck of a lot of work going on, by multiple parties. It's like .bin-mas!

jmg · 2015-07-06 05:10

koehler wrote: »

Take a look at http://www.lowrisc.org/ again. All sorts of cores appear to be coming along, some only days away apparently. Would be interesting to see if someone does a lowRISC vs P1 comparison on an FPGA.

Wow, mention of a clockless version, and another variant where "clock frequency adapts to track the voltage ripple". The self-tracking parts are less deterministic, as they run at whatever speed the PVT allows, but they can save energy by getting to idle states faster.

Heater. · 2015-07-06 05:13

The RISC V has certainly inspired a lot of action among academic groups around the world. They need an open instruction set with open and usable implementations. As someone pointed out, there was a lot of research done and papers written using the DEC Alpha. All lost now, as that ISA is just not available. They don't want to be beholden to Intel or ARM either.
I just hope there is also action going on with people who actually want to build chips with it. lowRISC sounds great, but it also sounds very small scale.

koehler · 2015-07-06 22:51

Interesting slide deck, https://speakerdeck.com/asb/an-update-on-lowrisc

Core roadmap:
- Q3 '15: Untethered SOC FPGA design, and minion cores
- Q1 '16: Dual-core test chip with memory, PHY, minions, 40/28nm
- '16-'17: First volume run

I think the quote below shows how much interest there really, really is from some big players. I'm expecting a lot of noise and fireworks as we get towards the end of the year, with the sonic boom starting in either late Q1 or early Q2 when they get some real chips.
Note to self: don't plan on buying Samsung/Qualcomm stock in the future. Maybe Intel too.

"State of the RISC-V Nation: many companies ‘kicking the tires’. If you were
thinking of designing your own RISC ISA for project, then use RISC-V. If you
need a complete working support core today then pay $M for an industry core.

If you need it in 6 months, then consider spending that $M on RISC-V
development."
