Tag Archive: stm32f103

I would like to tell you about the various microcontrollers out there. There are many of them, and I cannot possibly describe them all. However, I will point out some interesting ones, especially for hobby projects, which also means affordable ones. Quite a while ago I wrote about the Teensy 2.0. This is a board based on the Atmel ATMega32u4 microcontroller. It is available for US$ 16.00 at the official store. For a single board, this is still affordable, but if you want a bunch of them, it gets costly. So, I went looking on eBay for ATMega32u4 boards. Results show up, starting at US$ 2.15 (at least, today… back then… I don’t know what the price was). Well… it’s an Arduino Pro Mini clone. I’ve never looked at the Arduino platform, not then, not now. But don’t mind the Arduino in the title, it’s just an ATMega32u4 soldered to a board. But this gave me the idea to search for Arduino instead, leading me to Arduino Mini Pro boards. These are ATMega328p chips soldered to a board. They start at US$ 1.86 at eBay and US$ 1.50 at AliExpress.

Now… let’s have a look at the hardware. Both microcontrollers are part of the Atmel AVR family, a family of 8 bit microcontrollers. The first time I came in contact with this family was during my education at Fontys University of Applied Science. I did a project involving an AT90USB1287 microcontroller. This project involved USB communication, controlling a KS108 based LCD display, and passing data to an FPGA. This project is also the thing that made me distrust abstraction layers. The thing was, I had implemented the KS108 display, working like a charm, then I added the USB support, using a library provided by Atmel, and the display stopped working: the timing was way off. This all happened many years ago, 2008 or something, I don’t know. But the thing is… it messed with my timings… and that made me distrust abstraction layers, and this is the reason why I keep away from the Arduino environment. It’s nothing against Arduino specifically, but I’m just afraid such an abstraction is doing stuff behind my back, writing to registers I am not aware of, breaking stuff I am trying to do.

Well… let’s have a look at the hardware. For both the ATMega32u4 and the ATMega328p we have a core that can run at multiple speeds. It can run at up to 8 MHz when powered at 3.3 Volts and up to 16 MHz when powered at 5 Volts. The boards sold at eBay are generally configured to run at 5 Volts and are equipped with a 16 MHz crystal. A divisor can be configured, so it is possible to run the board at a lower speed, which makes operation at 3.3 Volts possible. As setting this divisor is a run time operation, the board will still power up at 16 MHz in its default state. However, an Atmel AVR microcontroller has so called fuses: configuration bits that determine the power-up state. One of the options is to start the MCU with a divisor of 8. This will bring up the board at 2 MHz, which is fine for 3.3 Volt operation. Teensy boards ship with this option enabled by default; boards from eBay, however, generally come with this option disabled.
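The arithmetic above can be sketched in a few lines of C. This is only an illustration: the function names are made up for this example, and the voltage check is a simplification that uses just the two figures mentioned above (8 MHz at 3.3 Volts, 16 MHz at 5 Volts) rather than the full speed-grade curve from the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Effective core clock for a given crystal frequency and clock divisor. */
uint32_t avr_core_hz(uint32_t crystal_hz, uint32_t divisor)
{
    return crystal_hz / divisor;
}

/* Hypothetical helper: simplified limit based only on the two figures
 * from the text (8 MHz at 3.3 V, 16 MHz at 5 V). Anything below 5 V is
 * treated conservatively as the 8 MHz case. */
bool avr_clock_ok(uint32_t core_hz, uint32_t supply_mv)
{
    uint32_t limit_hz = (supply_mv >= 5000u) ? 16000000u : 8000000u;
    return core_hz <= limit_hz;
}
```

With a 16 MHz crystal and the divide-by-8 fuse set, `avr_core_hz(16000000, 8)` gives 2 MHz, which passes the check at 3300 mV, while the undivided 16 MHz does not.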

To program an Atmel AVR microcontroller, there are several options. A chip can have a bootloader installed, allowing it to program itself. The microcontrollers from the Teensy series, which have USB, come with a proprietary bootloader. However, this bootloader implements the USB DFU protocol, the bootloader protocol according to the USB specifications, allowing it to be used with standard programming tools. Back at the project at Fontys, I had a microcontroller board directly from Atmel, which shipped with the Atmel FLIP bootloader, another proprietary bootloader, also implementing the standard DFU protocol. The ATMega32u4 microcontrollers from eBay implement some USB CDC protocol. They enumerate as a serial port, implementing an Arduino-specific protocol. (Yuck… use the fucking standard protocols, damnit). I don’t use the Arduino environment, and even though avrdude (a tool to program Atmel AVR chips) is supposed to implement this protocol, I haven’t been able to communicate through the USB serial port it provides. The ATMega328p based boards, obviously, don’t offer USB support as they lack a USB port. (There are boards out there which have a USB to serial converter on board, but I am not considering those.) When using those with my own USB-to-serial-TTL boards, I am able to program them using avrdude and the Arduino protocol I mentioned.

There is also a way to program an Atmel AVR board using an external programmer, implementing the Atmel ISP protocol. There are open source projects to turn an ATMega328p into such a programmer. An ATMega328p programming another ATMega328p: quite nice to see the programmer being identical to the device being programmed. All the projects I’ve seen so far use the Arduino environment, so this is the one and only time I’ve used the Arduino IDE: to compile an Atmel ISP programmer. The programmer is connected to a USB-to-serial-TTL board, and then avrdude is used to program the target board. The connection between the programmer board and the target goes over the SPI pins of both boards, and uses an Atmel proprietary protocol. One thing to keep in mind: the ISP protocol only allows programming, not debugging.

Another search on eBay led me to STM32F103C8T6 boards. eBay sellers also mention “Arduino” in their titles, but don’t be distracted by that. We are talking about an ARM. This is an ARM Cortex-M3 microcontroller with 20 KiB RAM and 64 KiB Flash, running at 72 MHz. And these boards are selling for US$ 2.13 at eBay, and even less at AliExpress, US$ 1.67 at the time of writing. Dirt cheap, and free shipping, no kiddin’. I would like to add a note about free shipping when ordering at AliExpress. It seems that when ordering 3 or more, they do charge shipping costs, but for 1 or 2, they don’t. So I suggest placing multiple orders at different sellers if you would like to order more than two boards.

To program these boards, an SWD programmer is required. (Strictly speaking, it is not: there is a built-in bootloader which enables programming over a serial interface, and this requires a USB to serial (TTL level) adaptor. I have not tried this method.) For about US$ 2, you can get an “ST Link V2” on eBay or AliExpress. (The price is similar… but so is the hardware, another case of the same hardware being both programmer and target.)

Basically, in the same price range, we have a 16 MHz 8 bit microcontroller and a 72 MHz 32 bit microcontroller. So, I would say, this definitely looks interesting. Another thing to keep in mind: the SWD protocol allows not only programming, but also debugging. SWD is an industry standard protocol, meaning many debuggers are available, ranging from $2 clones to professional debug probes which can cost a couple of hundred, or even over a thousand dollars. But well, I’m talking about hobby usage, so let’s stick to a $2 ST-Link for now, which offers both programming and in-circuit debugging.

There is much more to say about these chips, their programming environments, their debugging environments, properties of their architectures, etc. There are more architectures and chips I would like to discuss, so there is more coming in follow-up articles. Thanks for reading so far.



Okay, now, let’s implement a WS2812-controller using an STM32F103 microcontroller using DMA transfers. Again, we have a HAL implementation and a direct implementation. Basically, I switched to the direct implementation because I couldn’t find something in the HAL implementation, but I found it later.

Memory considerations. The example from the HAL uses 32 bit values. As I only need small values, 8 bit values would be fine, and they are needed to minimise memory usage. As I couldn’t find the option to use 8 bit values, I switched to a direct implementation.
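To put a number on the difference, here is a small sketch of the buffer size calculation. It assumes one stored value per transmitted bit (24 per LED); the function name is made up for this example.

```c
#include <stddef.h>

#define WS2812_BITS_PER_LED 24u

/* Bytes of RAM needed when one compare value is stored per transmitted bit.
 * bytes_per_value is 4 for 32 bit values, 1 for 8 bit values. */
size_t ws2812_buffer_bytes(size_t led_count, size_t bytes_per_value)
{
    return led_count * WS2812_BITS_PER_LED * bytes_per_value;
}
```

For a strip of 60 leds, this gives 5760 bytes with 32 bit values against 1440 bytes with 8 bit values, which matters on a chip with 20 KiB of RAM.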

Now… the option is there… but I was looking where it was set to 32. So, I was looking for something mentioning 32. However, it’s called
hdma_tim.Init.PeriphDataAlignment = DMA_PDATAALIGN_WORD ;
It didn’t occur to me to search for WORD as a representation of uint32_t. To me, WORD is a system-specific thing. Different platforms have different word sizes, so using WORD to indicate a 32 bit int doesn’t come naturally to me.

Nevertheless, I continued with my direct implementation. I suppose, this way I get to know the hardware, and I am doing some things I believe are not available in the HAL, or they’re hidden so well, that it’s easier to find them in the datasheet.

So, there is a DMA controller. Basically, I point it to a block of memory and a peripheral. The peripheral requests the next unit of data when it’s ready to process it. That’s basically how it works.

The DMA controller has channels. One thing to be aware of, each peripheral is associated with a certain channel. You have to use the correct channel or it won’t work.

So, let’s have a look at this DMA controller. To use the DMA Controller with Timer 2, we need DMA Controller 1, Channel 2. We need to associate the DMAR register of the Timer to the DMA Channel. The timer peripheral uses 16 bit values. As we have low values and want to conserve memory, the data buffer is 8 bit.

	DMA1_Channel2->CPAR = (uint32_t) &(TIM2->DMAR); // DMA 1 Channel 2 to TIM2

	DMA1_Channel2->CCR  = 0x00;
	DMA1_Channel2->CCR  |=  (0x1 << DMA_CCR_PSIZE_Pos); // Peripheral size 16 bit
	DMA1_Channel2->CCR  |=  (0x0 << DMA_CCR_MSIZE_Pos); // Memory size 8 bit
	DMA1_Channel2->CCR  |=  (0x1 << DMA_CCR_DIR_Pos);   // Memory to peripheral
	DMA1_Channel2->CCR  |=  (0x1 << DMA_CCR_MINC_Pos);  // Memory increment on
	DMA1_Channel2->CCR  |=  (0x0 << DMA_CCR_PINC_Pos);  // Peripheral increment off
	DMA1_Channel2->CCR  |=  (0x0 << DMA_CCR_CIRC_Pos);  // Circular mode off
	DMA1_Channel2->CCR  |=  DMA_CCR_TCIE;               // Enable transfer complete interrupt
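The same configuration can also be composed into a single value and written in one go. The sketch below redefines the bit positions locally so it compiles on a PC without the device header; the positions are the ones I know from the STM32F1 reference manual, and the function name is made up for this example.

```c
#include <stdint.h>

/* Bit positions of the DMA channel configuration register on the STM32F1,
 * redefined here so the sketch builds without the CMSIS device header. */
#define CCR_TCIE_POS   1u
#define CCR_DIR_POS    4u
#define CCR_MINC_POS   7u
#define CCR_PSIZE_POS  8u
#define CCR_MSIZE_POS 10u

/* Compose the configuration used above: 16 bit peripheral, 8 bit memory,
 * memory to peripheral, memory increment, transfer complete interrupt. */
uint32_t ws2812_dma_ccr_value(void)
{
    uint32_t ccr = 0;
    ccr |= 1u << CCR_PSIZE_POS; /* peripheral size: 16 bit */
    ccr |= 0u << CCR_MSIZE_POS; /* memory size: 8 bit */
    ccr |= 1u << CCR_DIR_POS;   /* read from memory */
    ccr |= 1u << CCR_MINC_POS;  /* increment the memory address */
    ccr |= 1u << CCR_TCIE_POS;  /* transfer complete interrupt */
    return ccr;
}
```

On the target this would be used as `DMA1_Channel2->CCR = ws2812_dma_ccr_value();`, replacing the series of read-modify-write operations with a single store.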

Now, let’s look at the timer. Here, we say where the data offered by a DMA transfer should go. We set it to go to the CCR1 register, the compare register that sets the PWM pulse width. Each timer has 4 channels, and 4 of those registers. Here, I set the timer to receive 4 transfers at a time. This way, I output to all 4 channels at the same time. Furthermore, I have to enable the Update DMA request.

	TIM2->DCR |= (( 13 ) << TIM_DCR_DBA_Pos); // DMA transfer base address CCR1 (offset 0x34, so word index 13)
	TIM2->DCR |= (( 3 ) << TIM_DCR_DBL_Pos); // 4 transfers at a time (CCR1 to CCR4)
	TIM2->DIER |= TIM_DIER_UDE; // Update DMA Request Enable

When this is set up, a DMA transfer can be initiated by

	DMA1_Channel2->CNDTR = size;              // Number of transfers
	DMA1_Channel2->CMAR = (uint32_t) memory;  // Start of the data buffer

	TIM2->CR1 |= TIM_CR1_CEN;          // Enable timer
	DMA1_Channel2->CCR |= DMA_CCR_EN;  // Enable DMA


Point to the memory block, set the length of the block, enable the timer, enable DMA, and finally, let the timer request an update from the DMA controller. Now, each period the timer will send an update request to the DMA controller, and this way, we can control four LED strips simultaneously. And this is what I mentioned before: I couldn’t find an option in the HAL to control multiple channels simultaneously.
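Because every update hands four consecutive values to CCR1 through CCR4, the data for the four strips has to be interleaved in memory. A minimal sketch of that layout follows; the function name and the separate per-strip input arrays are assumptions made for this example, not part of the code above.

```c
#include <stddef.h>
#include <stdint.h>

#define STRIPS 4u

/* Interleave per-strip pulse widths so that each group of four consecutive
 * bytes feeds CCR1..CCR4 for one bit period.
 * strips[s] points to slot_count pulse widths for strip s.
 * out must have room for slot_count * STRIPS bytes.
 * Returns the total number of values written. */
size_t ws2812_interleave(const uint8_t *const strips[STRIPS],
                         size_t slot_count, uint8_t *out)
{
    for (size_t slot = 0; slot < slot_count; slot++) {
        for (size_t s = 0; s < STRIPS; s++) {
            out[slot * STRIPS + s] = strips[s][slot];
        }
    }
    return slot_count * STRIPS;
}
```

As far as I understand the burst mechanism, the returned count is also the value to put in CNDTR, since every byte is one DMA transfer.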

So, I’ve decided to create my own controller for WS2812-compatible (SK6812, PD9823) leds. These leds are referred to as “clockless”, as they only have a data line and no clock line. The data is transmitted in a serial protocol, which encodes a zero as a short high, long low, and a one as a long high, short low. A short high pulse should be less than 440 ns, and a long high pulse should be at least 625 ns, to cover most of the variants. Using this protocol, the data is transmitted as RGB colour data, 8 bits per colour, thus 24 bits per LED. The order is actually GRB for SMD leds as found on LED strips, but RGB for through-hole leds (such as the PD9823). The data is transmitted most significant bit first. Each led on a led strip reads 24 bits and applies that colour, and forwards the remaining bits to the next led in the chain.
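As an illustration of the encoding, the sketch below expands one colour into 24 pulse widths in GRB order, most significant bit first. The function name is made up for this example, and the values 3 and 8 are the short and long compare values used in the code further down, which assume a particular timer setup.

```c
#include <stdint.h>

#define WS2812_T0H 3u /* compare value for a short high pulse (a zero) */
#define WS2812_T1H 8u /* compare value for a long high pulse (a one) */

/* Expand one LED colour into 24 pulse widths, GRB order, MSB first. */
void ws2812_encode_grb(uint8_t r, uint8_t g, uint8_t b, uint8_t out[24])
{
    uint32_t word = ((uint32_t)g << 16) | ((uint32_t)r << 8) | b;
    for (int i = 0; i < 24; i++) {
        uint32_t mask = UINT32_C(1) << (23 - i);
        out[i] = (word & mask) ? WS2812_T1H : WS2812_T0H;
    }
}
```

For through-hole leds that expect RGB order, swapping the `r` and `g` arguments at the call site is enough.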

Looking at that protocol, I imagined it could be implemented using a PWM generator, continuously updating the PWM pulse width. I’ve decided to implement this on an ST microcontroller, the STM32F103C8T6. I had one of these lying around. I ordered it on eBay over a year ago and never got around to doing something with it, until now. I will write another post about microcontrollers soon, but now, I am writing about my experiences controlling the WS2812-style leds. This will include some implementation details specific to the STM32F103 microcontroller.

The STM32F103 microcontroller has timer units which include a PWM mode. You set a period time, and a compare time less than the period time. This will generate a PWM signal with said period and compare time. My first naive implementation was to set the next compare time in the interrupt handler when the current compare time expired. My first attempt is based on the examples provided with the STM32Cube SDK. It uses the HAL provided by ST. Obviously, the first attempt didn’t work. It never does. (I wrote this months ago… there might be some details off)

void pwm_init() {

  TimHandle.Instance = TIM2;

  TimHandle.Init.Prescaler         = 9;
  TimHandle.Init.Period            = 10;
  TimHandle.Init.ClockDivision     = 0;
  TimHandle.Init.CounterMode       = TIM_COUNTERMODE_UP;
  TimHandle.Init.RepetitionCounter = 0;

  if (HAL_TIM_PWM_Init(&TimHandle) != HAL_OK) {
    /* Initialization Error */
  }

  /*##-2- Configure the PWM channels #########################################*/
  /* Common configuration for all channels */
  sConfig.OCMode       = TIM_OCMODE_PWM1;
  sConfig.OCPolarity   = TIM_OCPOLARITY_HIGH;
  sConfig.OCFastMode   = TIM_OCFAST_DISABLE;
  sConfig.OCNPolarity  = TIM_OCNPOLARITY_HIGH;

  /* Set the pulse value for channel 1 */
  sConfig.Pulse = 8;
  if (HAL_TIM_PWM_ConfigChannel(&TimHandle, &sConfig, TIM_CHANNEL_1) != HAL_OK) {
    /* Configuration Error */
  }

  // Clear Pending IRQ and Enable IRQ.
  NVIC_ClearPendingIRQ(TIM2_IRQn);
  NVIC_EnableIRQ(TIM2_IRQn);
}


void HAL_TIM_PWM_PulseFinishedCallback(TIM_HandleTypeDef *htim) {
	uint32_t mask = 1 << bitcount++;
	if (pixelcount < 2) {
		// Set output
		sConfig.Pulse = (mask & data[pixelcount]) ? 8 : 3;
	} else {
		// reset
		sConfig.Pulse = 0;
	}

	if (HAL_TIM_PWM_ConfigChannel(&TimHandle, &sConfig, TIM_CHANNEL_1) != HAL_OK) {
	    // Configuration Error
	}

	if (bitcount == 24) {
		bitcount = 0;
		pixelcount++; // move on to the next pixel
	}

	if (pixelcount == 4) pixelcount = 0;
}


void HAL_TIM_PeriodElapsedCallback(TIM_HandleTypeDef *htim) {
	//trace_printf("Callback on channel %u\n", htim->Channel);
}

In this example, I store each pixel, or led value, in a 32 bit integer. I run through a bitmask, going from bit 0 to bit 23, as I have a 24 bit colour value in there, and I set the pulse width of the next pulse accordingly to 3 or 8. Seems quite straightforward. But it didn’t work. Well… when it doesn’t work… how to find what’s going wrong?

To me, the first approach would be to get rid of the HAL and control the registers myself. I mean… to see what’s going wrong, you have to see what is going on. So, that would give me the following code: (again, I wrote this months ago… there might be some details off)

void TIM2_IRQHandler (void) {
	if (TIM2->SR & 0b01) {        // Update interrupt flag set?
		TIM2->SR &= ~0b01;    // Clear the update interrupt flag
		uint32_t mask = 1 << bitcount++;
		if (pixelcount < 2) {
			TIM2->CCR1 = (mask & data[pixelcount]) ? 8 : 3;
		} else {
			TIM2->CCR1 = 0;
		}

		if (bitcount == 24) {
			bitcount = 0;
			pixelcount++; // move on to the next pixel
		}
		if (pixelcount == 4) pixelcount = 0;
	}
}


void pins_init() {
  GPIO_InitTypeDef   GPIO_InitStruct;

  // Enable Timer 2 Clock
  __HAL_RCC_TIM2_CLK_ENABLE();

  // Enable GPIO Port A Clock
  __HAL_RCC_GPIOA_CLK_ENABLE();

  // Common configuration for all channels
  GPIO_InitStruct.Mode = GPIO_MODE_AF_PP;
  GPIO_InitStruct.Pull = GPIO_PULLUP;

  // Apply pin configuration to PA0
  GPIO_InitStruct.Pin = GPIO_PIN_0;
  HAL_GPIO_Init(GPIOA, &GPIO_InitStruct);

  // Apply pin configuration to PA1
  GPIO_InitStruct.Pin = GPIO_PIN_1;
  HAL_GPIO_Init(GPIOA, &GPIO_InitStruct);

  // Apply pin configuration to PA2
  GPIO_InitStruct.Pin = GPIO_PIN_2;
  HAL_GPIO_Init(GPIOA, &GPIO_InitStruct);

  // Apply pin configuration to PA3
  GPIO_InitStruct.Pin = GPIO_PIN_3;
  HAL_GPIO_Init(GPIOA, &GPIO_InitStruct);
}

void pwm_init() {
	NVIC->ISER[0] |= 0x10000000; // Enable the TIM2 interrupt (IRQ 28)
	RCC->APB1ENR |= 1; // Enable Timer 2 Clock
	TIM2->ARR = 10 ; // Reload Value
	TIM2->PSC =  9 ; // Prescaler
	TIM2->CCMR1 = ( TIM2->CCMR1  & ~(0b11110000) ) | (0b1101 << 3);  // Set Channel 1 to PWM mode 1 and enable preload
	TIM2->CR1 |= 1 << 7; // auto reload preload enable
	TIM2->EGR |= 1; // Set UG bit
	TIM2->CR1 &= ~(0b1110000); // Edge aligned, upcounting
	TIM2->CR1 |= 0b100; // Event source, only over/underflow
	TIM2->DIER = 0x0001; // update interrupt enable
	TIM2->CCER |= 0b1;  // output enable, active high polarity
	TIM2->CCR1 = 0; // output val
	TIM2->CR1 |= 0x0001; // enable
}

Basically, this does the same thing, but without the HAL. Here we see directly what is going on with the registers we set. But of course this didn’t work either. Blaming the HAL is too easy. But let’s have a look at what is going wrong. (This is stuff I wrote months ago… at least at this point, I have some material describing what went wrong). Every bit I output is outputted twice.

[Figure: expected output]

[Figure: actual output]

Basically, the problem here is that I spend too much time in the interrupt handler. The time it takes to update the pulse width is longer than the time left in the period, so the new value is not ready yet when the next period starts. Thus, the current value is outputted twice.
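A quick calculation shows how tight the budget is. The sketch below assumes the timer is clocked at 72 MHz, the same as the core; that is an assumption about the clock configuration, and the function names are made up for this example.

```c
#include <stdint.h>

/* CPU cycles between two timer update events, assuming the timer input
 * clock equals the core clock. */
uint32_t cycles_per_period(uint32_t psc, uint32_t arr)
{
    return (psc + 1u) * (arr + 1u);
}

/* Duration of a number of timer ticks in nanoseconds (rounded down). */
uint32_t ticks_to_ns(uint32_t ticks, uint32_t psc, uint32_t timer_hz)
{
    uint64_t ns = (uint64_t)ticks * (psc + 1u) * 1000000000ull;
    return (uint32_t)(ns / timer_hz);
}
```

With a prescaler of 9 and a reload value of 10, there are only 110 CPU cycles per bit, and interrupt entry, the handler body and interrupt exit all have to fit in there. The same numbers give 416 ns for the short pulse (3 ticks) and 1111 ns for the long pulse (8 ticks), so the pulse widths themselves are within the limits mentioned earlier.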

So, this means, I cannot calculate the next value on the fly. I need to have the values ready when I start outputting them. The timer hardware supports DMA transfers, so I could point it to a block of memory containing the values I need to output. However, doing so would need quite some more RAM. I will discuss details of this approach in a next post, thanks for reading.