Starting with V2.0, Micronucleus is going to use an interrupt-free modification of the software USB implementation V-USB. This provides significant benefits for the bootloader, as it is no longer necessary to patch the interrupt vector of the user program. A surprising side effect was a speed-up of the V-USB data transmission, which may also be helpful in other applications. Here, I try to give a rough overview of the meandering work that led to this achievement.

Previous versions of Micronucleus (and also the Trinket bootloader) use an ingenious mechanism devised by Louis of Embedded Creations to patch the interrupt vector transparently to the user program. Although this approach works very well, it still adds a lot of complexity to the bootloader, adds a couple of cycles of interrupt delay, and carries the risk of breaking the user program in a few rare cases. Removing this burden allows for a drastic reduction in code size and improved robustness.

V-USB uses a pin change interrupt on D+ to detect incoming USB transmissions. The interrupt routine receives, decodes and acknowledges USB data packets and stores them in the RX buffer for parsing by the main program. If the host requests outgoing data, the interrupt routine responds accordingly if data is found in the TX buffer. The packet parsing and construction of outgoing packets is done in the main program by periodically calling usbpoll().

The idea of a polled or interrupt-free V-USB was brought up by blargg in a posting on the V-USB forum. He also devised a pretty clever way to patch this modification into the existing V-USB code. His key insight was that you can still use the interrupt system of the ATtiny when interrupts are disabled, by manually polling the interrupt flag register. The interrupt flag is set when the interrupt condition is met and stays set until it is manually cleared by the user.
The following code snippet actively waits for the interrupt flag and then calls the normal interrupt handler to process incoming data. The only modification to V-USB is to disable interrupts (CLI) and to replace the RETI instruction at the end of the interrupt routine in asmcommon.inc with a RET.
do {
    if (USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT)) {
        USB_INTR_VECTOR(); // clears INT_PENDING (See se0: in asmcommon.inc)
        break;
    }
} while(1);
Well, that looks pretty easy, and to our amazement it even worked on some computers, sometimes. The problem is that usbpoll() has to be called at some point, because otherwise the incoming USB transmissions are never parsed and nothing can be done with the data. A single call to usbpoll() takes about 45-90 us on a 16.5 MHz ATtiny. Since we do not poll for interrupts during the function call, no incoming data can be received while it runs. A first approach to solve this was to define a timeout period and only call usbpoll() when no incoming data was detected for a certain amount of time. This improved the functionality to a point where it was possible to upload and run programs with Micronucleus. But again, it completely failed on some computers and was pretty unreliable in general. It became clear that a more sophisticated algorithm was necessary to decide when it is safe to block the CPU and when blocking should be avoided.
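
To make the weakness of the timeout idea more concrete, here is a small host-side model in C. It is only a sketch and not the code that was used on the ATtiny: the names `sim_result_t`, `simulate_timeout_polling`, `idle_limit` and `poll_cost` are made up for this illustration, and time is reduced to abstract polling steps. Each entry of `trace` says whether the interrupt flag would be set at that step. The parser (standing in for usbpoll()) only runs after `idle_limit` quiet steps and then blocks for `poll_cost` steps, during which any arriving packet is counted as missed.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical result record for the simulation. */
typedef struct {
    unsigned received; /* packets handled by the simulated receiver     */
    unsigned parsed;   /* number of simulated usbpoll() calls           */
    unsigned missed;   /* packets that arrived while usbpoll() was busy */
} sim_result_t;

/* trace[i] != 0 means the interrupt flag is set at polling step i.
   usbpoll() is only called after idle_limit consecutive quiet steps
   and then occupies poll_cost steps without looking at the flag.    */
static sim_result_t simulate_timeout_polling(const uint8_t *trace, size_t n,
                                             unsigned idle_limit,
                                             unsigned poll_cost)
{
    sim_result_t r = {0, 0, 0};
    unsigned idle = 0;   /* quiet steps since the last packet        */
    int unparsed = 0;    /* a received packet is waiting for parsing */
    size_t i = 0;

    while (i < n) {
        if (trace[i]) {
            r.received++;            /* stands in for USB_INTR_VECTOR() */
            unparsed = 1;
            idle = 0;
            i++;
        } else if (unparsed && ++idle >= idle_limit) {
            r.parsed++;              /* stands in for usbpoll() */
            unparsed = 0;
            idle = 0;
            /* The CPU is blocked: packets in this window are lost. */
            for (unsigned k = 0; k < poll_cost && i < n; k++, i++) {
                if (trace[i]) {
                    r.missed++;
                }
            }
        } else {
            i++;
        }
    }
    return r;
}
```

With a trace where the second packet arrives long after the first, nothing is lost. With a trace where the host sends the next packet shortly after the timeout expires, the packet falls into the blocked window. Since the timing between packets depends on the host, this matches the observation that the approach behaved differently from computer to computer.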
I was already about to give up on the interrupt-free approach. But then I noticed that the new 1.1.18beta release of the Saleae logic analyzer software came with a USB1.1 protocol interpreter. This finally provided a tool to understand what was going on.
do {
    // Wait for data packet and call transceiver
    do {
        if (USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT)) {
            USB_INTR_VECTOR(); // clears INT_PENDING (See se0: in asmcommon.inc)
            break;
        }
    } while(1);
    // Parse data packet and construct response
    usbpoll();
    // Check if a data packet was missed. If yes, wait for idle bus.
    if (USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT))
    {
        uint8_t ctr;
        // loop takes 5 cycles; the reload value corresponds to 10 us
        asm volatile(
        "         ldi  %0,%1     \n\t" // start the idle counter
        "loop%=:  sbic %2,%3     \n\t" // skip the reload while D+ is low
        "         ldi  %0,%1     \n\t" // D+ high: bus is active, restart the count
        "         subi %0,1      \n\t"
        "         brne loop%=    \n\t" // leave once D+ was low for the whole count
        : "=&d" (ctr)
        : "M" ((uint8_t)(10.0f*(F_CPU/1.0e6f)/5.0f+0.5)), "I" (_SFR_IO_ADDR(USBIN)), "M" (USB_CFG_DPLUS_BIT)
        );
        USB_INTR_PENDING = 1<<USB_INTR_PENDING_BIT; // clear the stale interrupt flag
    }
} while(1);
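
The reload constant and the behaviour of the assembler loop above can be checked on a PC. The following C sketch is a host-side model, not device code; the function names `idle_loop_count` and `iterations_until_idle` and the sampled `dplus` array are assumptions made for this illustration. The first function repeats the arithmetic of the constant expression (10 us of idle time divided into 5-cycle loop iterations, rounded to nearest). The second mirrors the loop logic: the counter is reloaded whenever D+ reads high and the loop ends once D+ has read low for `reload` iterations in a row.

```c
#include <stdint.h>
#include <stddef.h>

/* Number of 5-cycle loop iterations that cover idle_us microseconds,
   rounded to nearest. Same arithmetic as the constant in the listing. */
static uint8_t idle_loop_count(double f_cpu_hz, double idle_us)
{
    return (uint8_t)(idle_us * (f_cpu_hz / 1.0e6) / 5.0 + 0.5);
}

/* Model of the resync loop. dplus[i] is the D+ level seen in loop
   iteration i (non-zero = high). Returns the number of iterations
   executed until the loop exits, or 0 if the samples run out first. */
static size_t iterations_until_idle(const uint8_t *dplus, size_t n,
                                    uint8_t reload)
{
    uint8_t ctr = reload;             /* ldi before the loop           */
    for (size_t i = 0; i < n; i++) {
        if (dplus[i]) {
            ctr = reload;             /* sbic did not skip: ldi reload */
        }
        if (--ctr == 0) {             /* subi / brne                   */
            return i + 1;
        }
    }
    return 0;
}
```

For a 16.5 MHz clock, `idle_loop_count(16.5e6, 10.0)` gives 33, that is 33 iterations of 5 cycles, or 165 cycles, which is 10 us. Any activity on D+ during that window restarts the count, so the loop only exits after a continuous idle period.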
Are we done yet? A few other minor things popped up:
- A bus reset can no longer be handled efficiently in usbpoll(), since it is only called upon receiving a packet. Instead, the detection of a reset also had to be moved into the main polling loop.
- SETUP and OUT are immediately followed by a DATA packet. V-USB has special code to handle this situation, however this detection sometimes failed when the gap between the packets was too long. This is not a problem with interrupts, because they can be “stacked”. In the interrupt-free case additional code had to be inserted before “handleSetupOrOut” in asmcommon.inc.
- Since packets are received and parsed in order, it is not necessary anymore to have a double-buffered RX-buffer. Removing it saves some memory.

You can find the full implementation in the testing branch of Micronucleus V2 right now. But be aware that this is an actively developed version, so things may change. So far, this implementation has been tested by multiple people and was found to be stable. Micronucleus V2 will be released once multiple-device support is done.

Edit: Nice, looks like this made it to Hackaday! In light of that I’d like to add that Micronucleus V2 is not yet ready for release. If you just want a nice, small, bootloader for the ATtiny85, I would suggest you try the current release, Micronucleus V1.11.