Trice Speed
(Read only if you are interested.)
A TRICE macro execution can be as cheap as 3-4 assembler instructions or 6-8 processor clocks:
- Disassembly:
- Measurement: The blue SYSTICK clock counts backwards 6 clocks for each TRICE macro (on an ARM M0+), which is less than 100 ns at a 64 MHz MCU clock:
A more realistic (typical) timing with target location and µs timestamps, critical section and parameters is shown here with the STM32F030 M0 core:
The MCU is clocked at 48 MHz and a Trice takes about 2 µs, of which the internal ReadUs() call alone accounts for nearly 1 µs:
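The clock counts and durations quoted above can be cross-checked with a little integer arithmetic. The following is a minimal sketch; `cycles_to_ns` and `ns_to_cycles` are hypothetical helper names for illustration and are not part of Trice.

```c
#include <stdint.h>

/* Convert a CPU cycle count into nanoseconds (rounded down) for a core
   clock given in Hz. The 64-bit intermediate avoids overflow. */
static uint32_t cycles_to_ns(uint32_t cycles, uint32_t clock_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000u) / clock_hz);
}

/* Convert a duration in nanoseconds into CPU cycles (rounded down). */
static uint32_t ns_to_cycles(uint32_t ns, uint32_t clock_hz)
{
    return (uint32_t)(((uint64_t)ns * clock_hz) / 1000000000u);
}
```

With these helpers, `cycles_to_ns(6, 64000000u)` yields 93, so 6 clocks at 64 MHz stay below 100 ns, and `ns_to_cycles(2000, 48000000u)` yields 96, so a 2 µs Trice at 48 MHz corresponds to roughly 96 clocks.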
Target Implementation Options
All trice macros internally use this sub-macro:
#define TRICE_PUT(x) do{ *TriceBufferWritePosition++ = TRICE_HTOTL(x); }while(0); //! PUT copies a 32 bit x into the TRICE buffer.
The usual case is #define TRICE_HTOTL(x) (x). The uint32_t* TriceBufferWritePosition points to a buffer, which is codified and used with the trice framing sub-macros TRICE_ENTER and TRICE_LEAVE, depending on the use case.
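To illustrate how the write pointer advances, here is a minimal host-side sketch. The buffer `demoBuffer`, the helper `demo_put3` and the three-word layout are assumptions made for this example only; the real buffer and framing depend on the selected use case. The macro is written here without the trailing semicolon so that it behaves like a single statement.

```c
#include <stdint.h>

#define TRICE_HTOTL(x) (x) /* usual case: no byte order conversion */

/* Hypothetical buffer for this sketch only. */
static uint32_t demoBuffer[8];
static uint32_t *TriceBufferWritePosition = demoBuffer;

/* Copy a 32-bit value into the buffer and advance the write position. */
#define TRICE_PUT(x) do { *TriceBufferWritePosition++ = TRICE_HTOTL(x); } while (0)

/* Hypothetical helper: store three 32-bit words from the buffer start
   and return the number of words written. */
static unsigned demo_put3(uint32_t w0, uint32_t w1, uint32_t w2)
{
    TriceBufferWritePosition = demoBuffer; /* restart at the buffer start */
    TRICE_PUT(w0);
    TRICE_PUT(w1);
    TRICE_PUT(w2);
    return (unsigned)(TriceBufferWritePosition - demoBuffer);
}
```

Each TRICE_PUT is a single store with post-increment, which is why a parameterless trice can compile down to only a few instructions.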
Trice Use Cases TRICE_STATIC_BUFFER and TRICE_STACK_BUFFER - direct mode only
- Each single trice is built inside a common buffer and finally copied inside the sub-macro TRICE_LEAVE.
- Disabling the relevant interrupts between TRICE_ENTER and TRICE_LEAVE is mandatory for TRICE_STATIC_BUFFER.
- Usable for multiple non-blocking physical trice channels, but not recommended for channels that block for some time.
- A copy call is executed inside TRICE_LEAVE.
- With appropriate mapping, a direct write to the physical output(s) is possible:
  - RTT0 without extra copy.
    - With TRICE_DIRECT_SEGGER_RTT_32BIT_WRITE, about 100 MCU clocks do the whole work, which is about 1.6 µs at 64 MHz.
  - AUX without extra copy.
  - Not (yet) supported: UART transfer loop with polling. At a 1 Mbit/s baud rate, 4-12 bytes would take 40-120 µs.
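The direct-mode flow described in this list can be sketched on a host as follows. This is only an illustration under stated assumptions: all `demo...` names are hypothetical, a counter stands in for interrupt masking, and a byte array stands in for the physical output (RTT or AUX). It is not the Trice implementation of TRICE_ENTER and TRICE_LEAVE.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_WORDS 16u

static uint32_t demoStaticBuffer[DEMO_WORDS]; /* common build buffer */
static uint32_t *demoWritePos;                /* current write position */
static uint8_t demoOutput[DEMO_WORDS * 4u];   /* stands in for RTT/AUX */
static size_t demoOutputLen;                  /* bytes in demoOutput */
static int demoIrqMaskDepth;                  /* stands in for IRQ masking */

/* "Enter": mask interrupts and start at the buffer beginning. */
static void demo_enter(void)
{
    demoIrqMaskDepth++;
    demoWritePos = demoStaticBuffer;
}

/* Append one 32-bit word to the trice under construction. */
static void demo_put(uint32_t x)
{
    *demoWritePos++ = x;
}

/* "Leave": copy the finished trice to the output, then unmask. */
static void demo_leave(void)
{
    size_t n = (size_t)(demoWritePos - demoStaticBuffer) * sizeof(uint32_t);
    memcpy(demoOutput, demoStaticBuffer, n);
    demoOutputLen = n;
    demoIrqMaskDepth--;
}

/* Build and emit one trice made of three words; returns bytes emitted. */
static size_t demo_trice3(uint32_t w0, uint32_t w1, uint32_t w2)
{
    demo_enter();
    demo_put(w0);
    demo_put(w1);
    demo_put(w2);
    demo_leave();
    return demoOutputLen;
}
```

Because the build buffer is shared, a second trice started from an interrupt between enter and leave would overwrite the first, which is why the masking is mandatory for the static buffer. With a stack buffer each call would own its buffer instead.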
Trice Use Case TRICE_DOUBLE_BUFFER - deferred mode, fastest trice execution, more RAM needed
- Several trices are built in a half buffer.
- No stack used.
- Interrupts are disabled between TRICE_ENTER and TRICE_LEAVE.
- Usable for multiple blocking and non-blocking physical trice channels.
- No copy call inside TRICE_LEAVE, but an additional direct mode is optionally supported.
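The double buffer idea can be sketched as two halves that swap roles: trices accumulate in the active half while a background task transmits the other one. The sketch below is a simplified host-side illustration with hypothetical `demo...` names; it assumes the caller masks interrupts around `demo_put` and `demo_swap`, and it drops data when a half is full, which may differ from what Trice does.

```c
#include <stdint.h>
#include <stddef.h>

#define DEMO_HALF_WORDS 32u

static uint32_t demoHalf[2][DEMO_HALF_WORDS];    /* the two half buffers */
static unsigned demoActive;                      /* index of the write half */
static uint32_t *demoWritePos = &demoHalf[0][0]; /* next free word */

/* Producer: append one word to the active half.
   Returns 1 on success, 0 if the half is full (word is dropped). */
static int demo_put(uint32_t x)
{
    if (demoWritePos >= demoHalf[demoActive] + DEMO_HALF_WORDS) {
        return 0;
    }
    *demoWritePos++ = x;
    return 1;
}

/* Consumer: swap the halves. Returns the filled half and stores its
   word count in *words; new trices now go into the other half. */
static const uint32_t *demo_swap(size_t *words)
{
    const uint32_t *filled = demoHalf[demoActive];
    *words = (size_t)(demoWritePos - demoHalf[demoActive]);
    demoActive ^= 1u;
    demoWritePos = demoHalf[demoActive];
    return filled;
}
```

Writing a trice costs only stores and a pointer increment, with no copy at the end, which matches the "fastest trice execution" claim. The price is RAM for two halves, each large enough for the worst-case burst between two swaps.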
Trice Use Case TRICE_RING_BUFFER - deferred mode, balanced trice execution time and needed RAM
- Each single trice is built in a ring buffer segment.
- No stack used.
- Interrupts are disabled between TRICE_ENTER and TRICE_LEAVE.
- Usable for multiple blocking and non-blocking physical trice channels.
- No copy call inside TRICE_LEAVE, but an additional direct mode is optionally supported.
- Allocation call inside TRICE_ENTER.
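The allocation step that distinguishes the ring buffer from the double buffer can be sketched like this. The sketch is a simplified illustration with hypothetical `demo...` names and sizes: it reserves room for an assumed maximum trice size on every enter and wraps to the start when the tail is too short. The read side and overwrite protection are deliberately left out, and the caller is assumed to mask interrupts between enter and leave.

```c
#include <stdint.h>
#include <stddef.h>

#define DEMO_RING_WORDS 64u      /* total ring size in 32-bit words */
#define DEMO_MAX_TRICE_WORDS 8u  /* assumed largest single trice */

static uint32_t demoRing[DEMO_RING_WORDS];
static uint32_t *demoNext = demoRing; /* start of the next segment */
static uint32_t *demoWritePos;        /* write position inside the segment */

/* "Enter" with allocation: make sure a maximum-size trice fits
   contiguously, wrapping to the ring start if it does not. */
static uint32_t *demo_enter(void)
{
    size_t tail = (size_t)(demoRing + DEMO_RING_WORDS - demoNext);
    if (tail < DEMO_MAX_TRICE_WORDS) {
        demoNext = demoRing;
    }
    demoWritePos = demoNext;
    return demoWritePos;
}

/* Append one word to the current segment. */
static void demo_put(uint32_t x)
{
    *demoWritePos++ = x;
}

/* "Leave": no copy, only the next segment start advances.
   Returns the number of words in the finished segment. */
static size_t demo_leave(void)
{
    size_t n = (size_t)(demoWritePos - demoNext);
    demoNext = demoWritePos;
    return n;
}
```

Compared with the double buffer, each trice pays for the allocation check in enter, but the memory is used segment by segment instead of being split into two fixed halves.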