Expand hardware drivers to 64 bits (address issue #2331) by andypugh · Pull Request #3932 · LinuxCNC/linuxcnc

andypugh · 2026-04-12T22:44:09Z

Mainly a re-working of the register wrapping to 64 bits.
Note that this does not yet include Hostmot2, though that has already been at least partially patched in 8a3ac3f

src/hal/drivers/hal_gm.c

src/hal/drivers/hal_motenc.c

src/hal/drivers/hal_ppmc.c

BsAtHome · 2026-04-12T22:55:38Z

src/hal/drivers/hal_ppmc.c

+      rtapi_s64 l;              // JE001 AP001
        struct byte_tag {
            signed char b0;
            signed char b1;
            signed char b2;
            signed char b3;
        } byte;


A rtapi_s64 has 8 bytes, not 4.

BsAtHome · 2026-04-12T22:57:49Z

src/hal/drivers/hal_ppmc.c

        /* check for - to + transition */
        if ((oldpos.byte.b2 & 0xc0) == 0xc0 && (pos.byte.b2 == 0))
-            pos.byte.b3++;
+            pos.l += 0x01000000;
        else
            if ((oldpos.byte.b2 == 0) && (pos.byte.b2 & 0xc0) == 0xc0)
-                pos.byte.b3--;
+                pos.l -= 0x01000000;


Byte-testing is a bad option. Better to look at the value as a whole and mask unwanted bits using a binary and (&). Then the compiler can sort it out properly.

BsAtHome · 2026-04-12T23:04:02Z

src/hal/drivers/hal_stg.c

@@ -616,7 +624,7 @@ static void stg_counter_capture(void *arg, long period)
 		stg->pos_scale[n] = 1.0;
 	}
 	/* scale count to make floating point position */
-	*(stg->pos[n]) = *(stg->count[n]) / stg->pos_scale[n];
+	*(stg->pos[n]) = *(stg->count64[n]) / stg->pos_scale[n];


Writing a hal pin and reading the same pin just a bit later is very inefficient because the compiler must physically read the value from memory. The volatile modifier prevents caching. However, the value does not change here between the two lines. Therefore, a local intermediary allows the compiler to be more efficient.

This issue is seen all over this and other code, writing and then shortly thereafter again reading same pin.

BsAtHome · 2026-04-12T23:07:16Z

src/hal/drivers/hal_ppmc.c

        /* check for - to + transition */
        if ((oldpos.byte.b2 & 0xc0) == 0xc0 && (pos.byte.b2 == 0))


What exactly is the counter's width and signedness? Masking two bits that must be set looks strange. If this two's complement, one's complement or another format?

The value read from the FPGA is an unsigned 16 bit number. So the positive wrap looks like:
0xFFFF
0x0000
0x0001
...
And the reverse wrap is
0x0001
0x0000
0xFFFF
....

But that means a simple unsigned or two's complement signed 16 bit value. Then the proper procedure would be the same as for the 32 bit case, but simply with a 16 bit sign-extended starting point:

rtapi_s32 oldval = _initval_; rtapi_s64 bigcounter = _initval_; ... rtapi_s32 newval = (((rtapi_s32)(signed char)MSB) << 8) + (LSB & 0xff); rtapi_s32 delta = oldval - newval; oldval = newval; bigcounter += delta;

The first MSB cast makes the value a signed type and then the next rtapi_s32 cast forces sign-extension.

And the assumption is that it does not wrap within a servo period (should be safe with just over 65 MHz at 1ms servo thread).

That could be interpreted as signed two's complement or unsigned.

BsAtHome · 2026-04-12T23:09:37Z

There are probably some more issues that need to be addressed, but my first list is probably a start ;-)

BsAtHome · 2026-04-12T23:11:59Z

BTW, the CI fail is due to a warning from the ini-file merge. That just got a fix which I merged in #3931. Rebasing/FF your code should fix that.

BsAtHome · 2026-04-12T23:36:56Z

src/hal/drivers/hal_gm.c

    int		i;
-	hal_s32_t	temp1 = 0, temp2;
+	rtapi_s32	temp1 = 0, temp2, delta_counts;
 	hal_float_t	vel;


Local variables must not use HAL types.

BsAtHome · 2026-04-12T23:39:20Z

src/hal/drivers/hal_gm.c

 		      }
 		      else
 		      {
 			if(temp2 > (device->encoder[i].last_index_latch + (hal_s32_t)(device->encoder[i].counts_per_rev/4)))


Wrong type in cast

BsAtHome · 2026-04-12T23:39:30Z

src/hal/drivers/hal_gm.c

 			{
 			  device->encoder[i].index_offset -=  device->encoder[i].last_index_latch + device->encoder[i].counts_per_rev - temp2;
 			}
 			else if(temp2 < (device->encoder[i].last_index_latch - (hal_s32_t)(device->encoder[i].counts_per_rev/4)))


Wrong type in cast

BsAtHome · 2026-04-12T23:44:57Z

src/hal/drivers/hal_gm.c

+		delta_counts = temp1 - device->encoder[i].old_counts;
+		device->encoder[i].old_counts = temp1;
+		*(device->encoder[i].rawcounts) += delta_counts;
+		*(device->encoder[i].rawcounts64) += delta_counts;


This looks fine. You do:

s32_deltacount = s32_counter - s32_saved_counter; s32_saved_counter = s32_counter; s64 += s32_deltacount;

I assume that the s32_counter is two-s complement. However, it should also work for an unsigned counter because the deltacount should always be result in correct two's complement.

The only assumption is that you cannot wrap a 32-bit counter in one servo period. But I guess we should be save there?

I think so. The Mesa counters max out at 20MHz, the internet suggests that 500MHz is possible.
Wrapping 32 bits in 1ms is a few Teraherz.

At 500MHz you have a comfortable 8.5 seconds before you wrap :-)

BsAtHome · 2026-04-13T07:19:12Z

src/hal/drivers/hal_ppmc.c

      byteindex = ENCTB;
      timebase.byte.b0 = (unsigned char)slot->rd_buf[byteindex++];
      timebase.byte.b1 = (unsigned char)slot->rd_buf[byteindex];


The timebase variable seems to be a 16-bit value and needs to be read from a stream in an array(rd_buf[]). The data looks to be in little endian (LSB in the lower address). We are running this code on a little endian machine. Then this method seems better (easier for the compiler to optimize):

unsigned short val; val = slot->rd_buf[byteindex + 1] & 0xff; val <<= 8; val += slot->rd_buf[byteindex + 0] & 0xff; // The byteindex seems to be overwritten subsequently, // so this line may be dead code byteindex += 1; // Only increments once in the code...

Alternate formulation:

val = (slot->rd_buf[byteindex + 0] & 0xff) + ((slot->rd_buf[byteindex + 1] & 0xff) << 8); // Same comment... byteindex += 1; // Only increments once in the code...

BsAtHome · 2026-04-13T07:20:27Z

src/hal/drivers/hal_ppmc.c

 	pos.byte.b0 = (signed char)slot->rd_buf[byteindex++];
 	pos.byte.b1 = (signed char)slot->rd_buf[byteindex++];
 	pos.byte.b2 = (signed char)slot->rd_buf[byteindex++];
-        pos.byte.b3 = oldpos.byte.b3;


Deletion of b3 leaves it undefined and at a random value.

BsAtHome · 2026-04-13T07:26:00Z

src/hal/drivers/hal_ppmc.c

 	pos.byte.b0 = (signed char)slot->rd_buf[byteindex++];
 	pos.byte.b1 = (signed char)slot->rd_buf[byteindex++];
 	pos.byte.b2 = (signed char)slot->rd_buf[byteindex++];


You are reading a value from a stream. The better method is to shift the value in byte-by-byte into the larger type instead of using a unonized type.

If the value you stream-read is a two's complement value, then there is a second advantage. You can do proper sign extension in one go by casting the MSB to a signed char and then to the signed larger type. Shifting left will preserve the sign.

andypugh · 2026-04-13T09:15:16Z

Note that in all cases I have attempted to touch the existing code as little as possible. I am not sure that fixing long-standing inelegancies is within the scope of this change.

BsAtHome · 2026-04-13T09:22:51Z

Note that in all cases I have attempted to touch the existing code as little as possible. I am not sure that fixing long-standing inelegancies is within the scope of this change.

Inelegancies besides, there are a lot of problems with the code. "Just touching the counter wrap" may actually break stuff in ways you didn't imagine because of the way it is written. The code is (very) fragile in my opinion. I see it as a pest verses cholera choice, you choose how to die ;-)

rmu75 · 2026-04-13T09:40:48Z

Given a 16bit hardware counter, suppose i have a reading of 16383, followed by a reading of -16384, how is the code supposed to know if a wrap occured in this situation or not? Obviously that wrapping can only work if the maximum allowed change between sampling is less than half of the total encoder counts. That should probably be documented somewhere, although it is probably unlikely to occur on real hardware.

BsAtHome · 2026-04-13T11:13:56Z

Given a 16bit hardware counter, suppose i have a reading of 16383, followed by a reading of -16384, how is the code supposed to know if a wrap occured in this situation or not?

That is a very good observation. The code has a problem at half the range disregarding directional information.

What about this:

// your example
rtapi_s32 oldval = 16383;
rtapi_s32 newval = -16384;
rtapi_s32 delta = newval - oldval; // delta is now -32767 (within range)

// Full 16 bit range forward
rtapi_s32 oldval = 32767;
rtapi_s32 newval = -32768; // moved one position
rtapi_s32 delta = newval - oldval; // delta is now -65535, should be +1

// Full 16 bit range reverse
rtapi_s32 oldval = -32768;
rtapi_s32 newval = 32767; // moved one position
rtapi_s32 delta = newval - oldval; // delta is now +65535, should be -1

// half-way problem correction
if(delta > 32768)  // -32768 -> 0 is valid
    delta -= 65536;
else if(delta < -32768) // 0 -> -32768 is valid
    delta += 65536;

It also means that 32767 -> -1 (or 32766 -> -2) will become -32768.

rmu75 · 2026-04-13T18:30:44Z

I don't like that bit manipulation stuff. IMHO it would be better to make that stuff explicit. Then we also would not have to assume anything about endianness or how exactly stuff is represented and could deal with arbitrary number of max counts, e.g. 18 bit absolute encoders.

lets assume a counter with 4000 counts going from 0 to 3999. midpoint would be 1999.

calculate delta of new value and old value, this would be in the range [-3999 to +3999]

wrap around occured if new value <= 1999 and old value > 1999 (or vice versa) and abs(delta) >= 2000.

in case abs(delta) < 2000 no wraparound. else

in case new < old, the delta will be negative, e.g. new = 100, old = 3900, delta = -3800. The correct delta in this case would be +200, so we need to add 4000.

in case old < new, the delta will be positive, e.g. old = 3900, new = 100, delta = 3800. The correct delta in this case would be -200, so we need to subtract 4000.

that same logic should apply no matter if counts are treated as signed or unsigned. the delta should be calculated in something large enough.

andypugh · 2026-04-13T19:41:34Z

I don't like that bit manipulation stuff. IMHO it would be better to make that stuff explicit.

Bear in mind that this is the PPMC driver, which gets the bytes one at a time through the EPP parallel port.
Endian-ness isn't an issue in that regard.

~~And the value is definitely unsigned 16 bit, this has been confirmed with Jon @ Pico Systems.~~
Sorry, I misremembered, it is 24 bit, unsigned. But definitely unsigned.

In the case of the other drivers, the values captured from the hardware are signed, and just doing "counts += delta_counts" works without any conditional execution even through the wrap, with the correctly-sized signed datatypes.

andypugh · 2026-04-13T19:51:59Z

I discussed the wrap-management techniques with Jeff Epler during the work on this PR.

"Here's the implementation from pluto, which takes a bit count argument

static inline int64_t extend(int64_t old, int newlow, int nbits) {
    int64_t mask = (1<<nbits) - 1;
    int64_t maxdelta = mask / 2;
    int64_t oldhigh = old & ~mask;
    int64_t oldlow = old & mask;
    int64_t candidate1, candidate2;

    candidate1 = oldhigh | newlow;
    if(oldlow < newlow) candidate2 = candidate1 - (1<<nbits);
    else                candidate2 = candidate1 + (1<<nbits);

    if (llabs(old-candidate1) > maxdelta)
        return candidate2;
    else
        return candidate1;
}

Way back in the day I tested it to my satisfaction. However it has a conditional branch in it which is not ideal.

It's better to do like you suggest, computing a delta value .. the trick to do it on an odd size like 24 is to shift the bits up so that the 24 bits are at the top of the integer to find the correct delta value from the new low bits.. I have to credit my time working with MicroPython for teaching me this idea, they need to do it for efficient math on 30 and 31 bit signed values.

#include <stdint.h>
#include <limits.h>
#include <assert.h>

#ifdef __CPROVER__
#define assume(x) __CPROVER_assume((x))
#define assert(x) __CPROVER_assert((x), #x)
#else
#define assume(x)
#include <assert.h>
#endif

int32_t nondet_s32();
int64_t nondet_s64();

static inline int64_t extend(int64_t old, int newlow, int nbits) {
    int nshift = 64 - nbits;
    uint64_t oldlow_shifted = ((uint64_t)old << nshift);
    uint64_t newlow_shifted = ((uint64_t)newlow << nshift);
    int64_t diff_shifted = newlow_shifted - oldlow_shifted;
    return (uint64_t)old + (diff_shifted >> nshift);
}

int main() {
    int nbits = nondet_s32();
    assume(nbits > 8 && nbits < 32);

    int mask = UINT_MAX >> (32 - nbits);
    int64_t old = nondet_s64();

    int delta = nondet_s32() & (mask >> 1); // Can handle deltas up to this big

    int64_t new1 = (uint64_t)old + (uint64_t)delta;
    int64_t res1 = extend(old, new1 & mask, nbits);
    assert(new1 == res1);

    int64_t new2 = (uint64_t)old - (uint64_t)delta;
    int64_t res2 = extend(old, new2 & mask, nbits);
    assert(new2 == res2);
}

This program is an input to cbmc which proves that the assertions hold for ALL values that match the assume() clauses. And this version is branch free and free of undefined behavior due to signed integer overflow: cbmc --signed-overflow-check extend2.c

Double check it for yourself because it's late here on a holiday night and I shouldn't be writing low level C code. But you successfully nerd-sniped me."

rmu75 · 2026-04-13T19:57:40Z

hmm. am i missing something? in case of two's complement the sign is convention. in 24 bit signed 8388607 + 1 wraps to -8388608, in unsigned it becomes +8388608, both have representation 10000000 00000000 00000000. extension to 32bit would differ though in the uppermost byte.

rmu75 · 2026-04-13T20:07:27Z

The first thing from jeff epler seems to be exactly what I had in mind. The second thing is quite clever, shifting avoids the need for sign extension...

I wonder given MicroPython targets microcontrollers if a branch on such architectures is really worse than 3 shifts... The problem with clever code is it is clever ;) and 5 years from now nobody knows anymore.

Expand hardware drivers to 64 bits (address issue #2331)

52c7fb8

andypugh marked this pull request as draft April 12, 2026 22:44

andypugh requested a review from BsAtHome April 12, 2026 22:44

Fix merge coommit oversight

8893513

BsAtHome reviewed Apr 12, 2026

View reviewed changes

BsAtHome reviewed Apr 13, 2026

View reviewed changes

		/* check for - to + transition */
		if ((oldpos.byte.b2 & 0xc0) == 0xc0 && (pos.byte.b2 == 0))

Conversation

andypugh commented Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

BsAtHome commented Apr 12, 2026

Uh oh!

BsAtHome commented Apr 12, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andypugh commented Apr 13, 2026

Uh oh!

BsAtHome commented Apr 13, 2026

Uh oh!

rmu75 commented Apr 13, 2026

Uh oh!

BsAtHome commented Apr 13, 2026

Uh oh!

rmu75 commented Apr 13, 2026

Uh oh!

andypugh commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andypugh commented Apr 13, 2026

Uh oh!

rmu75 commented Apr 13, 2026

Uh oh!

rmu75 commented Apr 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

andypugh commented Apr 12, 2026 •

edited

Loading

andypugh commented Apr 13, 2026 •

edited

Loading