[Deepin-Kernel-SIG] [linux 6.18.y] [FROMLIST] genksyms: Support arm64 CRC32 hardware acceleration#1759
Reviewer's Guide
Adds optional arm64 hardware-accelerated CRC32 support to genksyms, falling back to the existing table-based implementation when unavailable, and wires the detection into program startup.
Sequence diagram for arm64 CRC32 hardware acceleration in genksyms
sequenceDiagram
participant Main
participant crc32_check_hw
participant partial_crc32
participant partial_crc32_one
participant crc32_hw_byte
Main->>crc32_check_hw: crc32_check_hw()
crc32_check_hw-->>Main: crc32_hw_available flag
Main->>partial_crc32: partial_crc32(s, crc)
alt [crc32_hw_available] on aarch64
loop for each byte c in s
partial_crc32->>crc32_hw_byte: crc32_hw_byte(c, crc)
crc32_hw_byte-->>partial_crc32: updated crc
end
else [hardware unavailable or non aarch64]
loop for each byte c in s
partial_crc32->>partial_crc32_one: partial_crc32_one(c, crc)
alt [crc32_hw_available] on aarch64
partial_crc32_one->>crc32_hw_byte: crc32_hw_byte(c, crc)
crc32_hw_byte-->>partial_crc32_one: updated crc
else [crc32_hw_available false]
partial_crc32_one-->>partial_crc32: crctab32-based crc
end
end
end
partial_crc32-->>Main: final crc
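The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the table is generated at startup here (genksyms ships a precomputed `crctab32`), `HAVE_HW_CRC32`, `crc32_init_table` and `crc32_str` are hypothetical names, and the `HWCAP_CRC32` fallback definition assumes the arm64 Linux bit position (bit 7).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>
#ifndef HWCAP_CRC32
#define HWCAP_CRC32 (1UL << 7)	/* assumption: arm64 Linux hwcap bit */
#endif
#define HAVE_HW_CRC32 1		/* hypothetical helper macro */
#endif

static int crc32_hw_available;
static uint32_t crctab32[256];

/* Build the reflected CRC-32 table (polynomial 0xedb88320). */
static void crc32_init_table(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t c = i;
		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (c >> 1) ^ 0xedb88320u : c >> 1;
		crctab32[i] = c;
	}
}

/* Ask the kernel once whether the CPU advertises the CRC32 extension. */
static void crc32_check_hw(void)
{
#ifdef HAVE_HW_CRC32
	crc32_hw_available = (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
#else
	crc32_hw_available = 0;
#endif
}

#ifdef HAVE_HW_CRC32
/* One-byte update using the arm64 crc32b instruction. */
static inline uint32_t crc32_hw_byte(uint8_t c, uint32_t crc)
{
	__asm__(".arch_extension crc\n\t"
		"crc32b %w0, %w0, %w1"
		: "+r"(crc) : "r"((uint32_t)c));
	return crc;
}
#endif

/* One-byte update: hardware path if detected, table lookup otherwise. */
static uint32_t partial_crc32_one(uint8_t c, uint32_t crc)
{
#ifdef HAVE_HW_CRC32
	if (crc32_hw_available)
		return crc32_hw_byte(c, crc);
#endif
	return crctab32[(crc ^ c) & 0xff] ^ (crc >> 8);
}

/* Fold a NUL-terminated string into a running CRC. */
static uint32_t partial_crc32(const char *s, uint32_t crc)
{
	assert(s != NULL);
	while (*s)
		crc = partial_crc32_one((uint8_t)*s++, crc);
	return crc;
}

/* Full CRC-32 of a string: pre- and post-inversion around the update. */
static uint32_t crc32_str(const char *s)
{
	return partial_crc32(s, 0xffffffffu) ^ 0xffffffffu;
}
```

Calling `crc32_init_table()` and `crc32_check_hw()` once at startup and then `crc32_str("123456789")` should give `0xcbf43926`, the standard CRC-32 check value, on either path, since `crc32b` uses the same polynomial as the table.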
Hey - I've left some high level feedback:
- The inline assembly uses `.arch_extension crc` and `crc32b` directly, which may not be accepted by all aarch64 assemblers/toolchains; consider using compiler intrinsics (e.g., `__builtin_aarch64_crc32b`) or a feature macro check to improve portability.
- Introducing a hard dependency on `getauxval`/`HWCAP_CRC32` for the host tool may break builds on non-glibc or older libc environments; it would be safer to guard this with configure-time checks or provide a fallback path when `getauxval` is unavailable.
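One way to act on the intrinsics suggestion is sketched below. It is an assumption-laden illustration rather than the PR's code: it uses the ACLE intrinsic `__crc32b` from `<arm_acle.h>`, selected at compile time by the `__ARM_FEATURE_CRC32` feature macro, and a bitwise software update everywhere else. `CRC32_USE_ACLE`, `crc32_sw_byte` and `crc32_update` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * The feature macro is defined only when the compiler targets a CPU with
 * the CRC32 extension (for example -march=armv8-a+crc), so no inline asm
 * and no getauxval() call are needed in this variant.
 */
#if defined(__aarch64__) && defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>
#define CRC32_USE_ACLE 1	/* hypothetical helper macro */
#endif

/* Bitwise reflected CRC-32 update, used when the intrinsic is not built in. */
static uint32_t crc32_sw_byte(uint32_t crc, uint8_t c)
{
	crc ^= c;
	for (int k = 0; k < 8; k++)
		crc = (crc & 1) ? (crc >> 1) ^ 0xedb88320u : crc >> 1;
	return crc;
}

/* Fold len bytes into a running CRC, with no pre- or post-inversion. */
static uint32_t crc32_update(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	assert(p != NULL || len == 0);
	while (len--) {
#ifdef CRC32_USE_ACLE
		crc = __crc32b(crc, *p++);
#else
		crc = crc32_sw_byte(crc, *p++);
#endif
	}
	return crc;
}
```

The trade-off is that the choice is made at build time: a binary built with the CRC32 extension enabled assumes the instruction exists on the machine that runs it, whereas the runtime `getauxval` check in the PR lets one binary adapt to the host CPU.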
/approve It looks like linux-6.6.y needs this too. BTW, do other architectures have this?
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Avenger-285714, dongert
Pull request overview
This PR accelerates genksyms symbol hashing on arm64 hosts by detecting CRC32 CPU support at runtime and using the arm64 crc32b instruction when available, falling back to the existing table-based CRC32 otherwise.
Changes:
- Add runtime detection for arm64 CRC32 hardware capability via `getauxval(AT_HWCAP)`.
- Implement a hardware-accelerated CRC32 byte update path using inline asm (`crc32b`) and use it in `partial_crc32[_one]`.
- Initialize the hardware capability check early in `main()`.
    #ifdef __aarch64__
    #include <sys/auxv.h>
    #include <asm/hwcap.h>

    static void crc32_check_hw(void)
    {
        crc32_hw_available = (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
    }
maillist inclusion
category: performance
Use the hardware 'crc32b' instruction in genksyms when supported; it shows a 2x speedup over the crctab32 approach.
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Force-pushed from 16b048d to 94b44fd.
I will take it to 6.6.
If it's a multi-arch feature, please consider adding a light framework to make the code cleaner.
It has been included in the code now.
For example:
It keeps things simple. Or why not use crc32 from <zlib.h>, which is provided by zlib1g-dev and zlib1g?
I'm not sure, maybe it's really better...
maillist inclusion
category: performance
Use the hardware 'crc32b' instruction in genksyms when supported; it shows a 2x speedup over the crctab32 approach.
Summary by Sourcery
Enhancements: