The embedded salt mines

We managed a fair amount of embedded work from 1980 to 2008. It never became easier: reduced visibility, shorter delivery periods, reduced rewards, larger code bases, more complexity. The list is long!

Besides the code that gives the hardware life, the additional tools for each project were surprisingly complex. Perhaps we will document these by migrating the LaTeX documentation to DocBook or a similar format, giving users the option of high-resolution or low-resolution photographs. If we were to start over again, the format would be a “Wikipedia” type of front-end. Apple provides tools and a server, which we will explore if self-hosting ever becomes affordable in Australia.

Box of eval boards

Figure 1. A small snapshot of the evaluation boards required for different customers. Most are fairly new. If you scan through our news pages, you will see a move away from Luminary Micro/Texas Instruments towards Atmel, Energy Micro, and then to NXP, and back to Freescale.

By the way, there are other drawers full of promising hardware that was never fully debugged before the next PowerPoint device hit the customers' request list. MIPS, PowerPC, ARM devices of various persuasions, FPGAs: so many dollars taken away from wine and holidays. We are getting more cynical and picky as we stumble through the promises made by “moonlighting science-fiction journalists” posing as knowledgeable purveyors of blazing silicon. The herd seems to have descended on the ARMv8 camp. These devices are nowhere to be found outside of Apple or PowerPoint slides. You cannot even get a datasheet almost two years after license agreements were signed, and these same people “allow” you to pre-order expensive evaluation boards, then miss their own shipment dates by a year. (APM and AMD: we will watch with keen interest how this unfolds, as we cannot believe we are alone in being disappointed.) Cavium did not even make the software SDK available as promised. They will be forgiven a little longer, as their hardware is rather awesome, but why offer several families without the hope of delivering one in 2014?


The Linux kernel’s file system needs directory-wide (and recursive) “diffs” with snapshots to a database, trace collection storage, include file searches for macro definitions, etc. We wrote a lot of these tools ourselves, but they are command-line driven and written in vanilla C. Will we post them? Maybe, but we need to document them first before they have any value. In time...
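To give a flavour of the include file search mentioned above, here is a minimal sketch in vanilla C. It is not one of our tools; the function names `line_defines_macro` and `find_macro_in_file` are made up for this illustration, and it assumes one directive per line with no line continuations and lines shorter than the buffer.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `line` is a "#define NAME ..." directive for exactly `name`,
   otherwise 0. Leading whitespace and spaces after '#' are tolerated. */
int line_defines_macro(const char *line, const char *name)
{
    size_t len = strlen(name);

    while (isspace((unsigned char)*line)) line++;
    if (*line != '#') return 0;
    line++;
    while (*line == ' ' || *line == '\t') line++;
    if (strncmp(line, "define", 6) != 0) return 0;
    line += 6;
    if (*line != ' ' && *line != '\t') return 0;
    while (*line == ' ' || *line == '\t') line++;
    if (strncmp(line, name, len) != 0) return 0;
    /* The identifier must end here, so NAME2 does not match NAME. */
    return !(isalnum((unsigned char)line[len]) || line[len] == '_');
}

/* Scan one file and print "path:line: text" for each definition of `name`.
   Returns the number of matches, or -1 if the file cannot be opened.
   Assumption: lines fit in the buffer; longer lines skew the line count. */
int find_macro_in_file(const char *path, const char *name)
{
    char buf[1024];
    int lineno = 0, hits = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL) return -1;
    while (fgets(buf, sizeof buf, fp) != NULL) {
        lineno++;
        if (line_defines_macro(buf, name)) {
            printf("%s:%d: %s", path, lineno, buf);
            hits++;
        }
    }
    fclose(fp);
    return hits;
}
```

A real tool would walk the directory tree and call `find_macro_in_file` on every header, which is where the nested redefinitions start to show up.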

The posts will be from a documentation viewpoint. If you think there is additional value in the tools, contact us for a discussion. In the past we generally provided complete source code with each project, so if somebody else would like to “care for and feed” some of these tools, we would be pretty happy.

Embedded tools worth using are surprisingly expensive. Atollic and IAR want almost half a vehicle for their toolchains. ARM are opportunistic with their annual charges for the ARMv8 compiler toolchain, and then charge even more for their trace port that they claim can handle any new multi-core device. We are battling with the ETM in single-core (and slow) devices, with debug sessions wading into the weeds and swamps.

We have found the Rowley Associates CrossWorks toolchain the best value and able to run on Linux, Windows and Apple hardware. Their support is also excellent, and yes, we have actually used their support. The same cannot be said for the silicon vendors' forums. The bad experience with the Texas Instruments Sitara Industrial Communications Engine and lack of source code for the “out of the box” serious evaluation was the reason why we no longer use their devices. The change of heart in allowing developers (other than the million a month folk) to buy OMAP devices when it looked as if they were about to pull the plug does not bode well for small development companies. Although we seldom use Atmel, their support and quality of available code must surely be the best, plus their Atmel Studio software is really good. We unfortunately need to run many different vendors' devices, and also use the Rowley Associates toolchain to program Atmel devices.

After the second day of opening any box, the startup code needs plenty of modification. We have written software to extract startup vectors and generate XML, which can then be used to generate something we can work with on most of the ARM Cortex-M family. We have not released it due to the copyright notices that each vendor has placed on their startup examples and function names. We steer away from ARM CMSIS with its nested include files. This is usually where you will find the potholes, as the nice simple “hello world” with the simplistic startup does not work when creating a new project from scratch. There are hidden “code read protect” values set in linker files that are not visible in recursive searches through all the source in a sample directory tree. We have also written code to do that since the days of using WindRiver VxWorks with their nested include files and redefining everything. (That was almost twenty years ago, so you would have thought the industry would have learned before all unquestioningly adopting CMSIS and very verbose single-bit initialisation of ports. Embedded targets for high volume would write all 32 bits at a time, not make 32 separate function calls to do the same thing.)
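The point about port initialisation can be sketched as follows. This is an illustration only, not vendor code: the pin list and function names are hypothetical, and the “register” is passed in as a pointer so the sketch can be exercised on a host as well as on a target.

```c
#include <stdint.h>

/* Hypothetical configuration: the pins we want as outputs on one port. */
static const uint8_t output_pins[] = { 0, 3, 4, 17, 31 };

/* Fold a list of pin numbers into a single 32-bit mask. */
uint32_t pins_to_mask(const uint8_t *pins, unsigned count)
{
    uint32_t mask = 0;

    for (unsigned i = 0; i < count; i++)
        mask |= (uint32_t)1 << (pins[i] & 31u);
    return mask;
}

/* Verbose style: one read-modify-write of the register per pin. */
void set_outputs_per_pin(volatile uint32_t *dir_reg,
                         const uint8_t *pins, unsigned count)
{
    for (unsigned i = 0; i < count; i++)
        *dir_reg |= (uint32_t)1 << (pins[i] & 31u);
}

/* Compact style: a single store that sets all 32 bits at once.
   Note that this overwrites every bit in the register. */
void set_outputs_once(volatile uint32_t *dir_reg, uint32_t mask)
{
    *dir_reg = mask;
}
```

With a fixed pin list, `pins_to_mask` can be evaluated once (or replaced by a constant), and start-up becomes one store per port. The single store overwrites the whole register, so it suits initialisation where the code owns the entire port; the per-pin version preserves the other bits at the cost of repeated accesses.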

Portability across a family

When launched, the Kinetis range promised identical address maps. Portability is really important for contractors working on different devices every week, but the advent of 250-plus-page include files (that are unsuitable for a ‘diff’ comparison) negates much of that promise. The include files also get updated more quickly than the silicon is fixed.

Freescale K60 board photo

Figure 2. Freescale’s K60 Tower board. As reported in our May 2011 and August 2011 news, we spent a few months exploring the modules in the Kinetis CPU. We bought the Rowley Associates CrossWorks toolchain and JTAG probe.

The target was more useful for software evaluation, as the serial connection board was difficult to probe. The BGA package was going nowhere either, so probing anything under it was impossible. After a reasonable evaluation, we packed it up and put it in the drawer with a load of other evaluation boards. Then came the really low-cost Freedom K64 board.

Freescale Freedom K64 board

Figure 3. Freescale's Freedom K64 board. We bought a couple to try out. The amount of Flash, RAM and pinouts in the 100-pin LQFP package was perfect for low-cost applications where software development was going to dominate in pricing. We previously bought NXP LPC11C24 boards with a CAN port, but the isolation costs and limited Flash/RAM were not worth serious investment in time. The “code read protection” problems with the NXP parts also slowed development.
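As a rough illustration of the kind of check that helps with code read protection, the sketch below inspects a raw binary image before it is programmed. It assumes the NXP LPC convention, as we recall it from the user manuals, of a 32-bit little-endian CRP word at flash offset 0x2FC; confirm both the offset and the magic values against the manual for your exact part. The function and enum names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed location of the CRP word in NXP LPC flash images. */
#define CRP_OFFSET 0x2FCu

enum crp_level {
    CRP_NONE,             /* no recognised protection value present */
    CRP_NO_ISP,
    CRP_1,
    CRP_2,
    CRP_3,
    CRP_IMAGE_TOO_SHORT   /* image ends before the CRP word */
};

/* Classify the CRP word in a raw (not Intel HEX or ELF) flash image. */
enum crp_level classify_crp(const uint8_t *image, size_t size)
{
    uint32_t word;

    if (size < CRP_OFFSET + 4u)
        return CRP_IMAGE_TOO_SHORT;

    /* Assemble the little-endian word byte by byte (host independent). */
    word = (uint32_t)image[CRP_OFFSET]
         | (uint32_t)image[CRP_OFFSET + 1u] << 8
         | (uint32_t)image[CRP_OFFSET + 2u] << 16
         | (uint32_t)image[CRP_OFFSET + 3u] << 24;

    switch (word) {            /* values as recalled from NXP manuals */
    case 0x4E697370u: return CRP_NO_ISP;
    case 0x12345678u: return CRP_1;
    case 0x87654321u: return CRP_2;
    case 0x43218765u: return CRP_3;
    default:          return CRP_NONE;
    }
}
```

Running a check like this on the linker output, rather than searching the source tree, catches values that are set in a linker file or a prebuilt object.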

What would we like changed? It would have been more useful to use external debug probes rather than the MBED version. To disable the on-board debug you need to cut tracks, but the documentation is unclear (it differs in places, and the circuit diagram suggests that more than one track needs to be cut). It is a nice board with minor problems in the silicon (it identifies as a K62 and we do not see 256 KB of SRAM; maybe the stack was put in the bit-band location?).

Pounce Kwikstix LCD

Figure 4. Freescale KwikStix board with K40 processor, showing LCD with all segments enabled.

KwikStix K40 board

Figure 5. Chip side of the KwikStix boards, which we used to program under Windows before the SEGGER drivers ran under Mac OS X.

We have several K40 boards, and when we “bricked” all our LPC11C24 boards, we had a schedule gap and decided to get the LCD driver running with our real-time kernel.

Diminishing Returns

We continually evaluate new architectures. At the moment we are working on some ARM processors from Freescale (shown above). What’s on the horizon? A few years ago we were holding our breath for the Xilinx ZYNQ FPGA with dual ARM hard-cores, but we don't think it lived up to its marketing claims. Certainly not on cost when looking at evaluation boards; perhaps the volume market did see value? For us, the PowerPC cores in Xilinx FPGAs were difficult to program compared to stand-alone devices from Freescale and AMCC. The small micro-controllers are useful for intelligent I/O, but we will use something else for more challenging work, like imaging and networking. We also waited for Freescale’s QorIQ™ and Texas Instruments’ multi-core DSPs. They are already launching new devices which will take a while to drop off the PowerPoint slides: Layerscape, but the ARM 64-bit version.

What are customers looking at? We do not get advance notice, and often go to meetings where a group of engineers want knowledge on a particular part during an interview. Most ARM vendors have several hundred variants of a Cortex-M0 or M3/4, and each databook runs into hundreds or thousands of pages. When you do happen to know the actual device, the lead engineer tells you he cannot employ you for a project as he only has $50K a year. Even these projects are getting fewer and way off the leading edge.

Future Plans

We generally follow the money, and between payments, we indulge ourselves in red wine and evaluation boards. We are tired of the toys that do not have decent debug capability, with JTAG ports that crash. Better efforts to address debug for software developers would help (no halting to read memory, for example), but vendors simply look at the lowest cost of sand to push onto the slaves in the salt mines. We really feel sorry for the next generation of developers, as we doubt it will improve with increasing cores and clock speeds (how can you possibly trace multi-cores over a single trace port that runs much slower than any of the cores?). Instrumentation has gone nowhere other than academic exercises.

Altera is several years late with their quad-core A53 devices, possibly due to ambitious plans following Intel's promises for foundry services at 14nm with their tri-gate. If Altera ever thought they were going to get priority over any Intel work, then they forgot to read up on the trainwreck left by the RISC marketing CEOs who adopted x86.

We will spend money on hardware that has networking capability, multiple cores (one used to debug the others, one for running a schedule, the others to do work), and 64-bit. 2014 does not look like the year to spend, as we are in mid-October when updating this page. There is still no AMD, APM or Cavium ARMv8 in sight, even though vendors are claiming to be shipping evaluation boards, but without datasheets.