[Deepin-Kernel-SIG] [linux 6.18.y] [FROMLIST] genksyms: Support arm64 CRC32 hardware acceleration#1759

Open
opsiff wants to merge 1 commit into deepin-community:linux-6.18.y from opsiff:linux-6.18.y-2026-05-25

Conversation

@opsiff
Member

@opsiff opsiff commented May 25, 2026

maillist inclusion
category: performance

Use the hardware 'crc32b' instruction in genksyms when it is supported; it is about 2x faster than the crctab32 table lookup.

Summary by Sourcery

Enhancements:

  • Detect arm64 CRC32 hardware support at runtime in genksyms and use hardware-accelerated CRC32 operations for symbol hashing when available.

@sourcery-ai

sourcery-ai Bot commented May 25, 2026

Reviewer's Guide

Adds optional arm64 hardware-accelerated CRC32 support to genksyms, falling back to the existing table-based implementation when unavailable, and wires the detection into program startup.

Sequence diagram for arm64 CRC32 hardware acceleration in genksyms

sequenceDiagram
    participant Main
    participant crc32_check_hw
    participant partial_crc32
    participant partial_crc32_one
    participant crc32_hw_byte

    Main->>crc32_check_hw: crc32_check_hw()
    crc32_check_hw-->>Main: crc32_hw_available flag

    Main->>partial_crc32: partial_crc32(s, crc)
    alt [crc32_hw_available] on aarch64
        loop for each byte c in s
            partial_crc32->>crc32_hw_byte: crc32_hw_byte(c, crc)
            crc32_hw_byte-->>partial_crc32: updated crc
        end
    else [hardware unavailable or non aarch64]
        loop for each byte c in s
            partial_crc32->>partial_crc32_one: partial_crc32_one(c, crc)
            alt [crc32_hw_available] on aarch64
                partial_crc32_one->>crc32_hw_byte: crc32_hw_byte(c, crc)
                crc32_hw_byte-->>partial_crc32_one: updated crc
            else [crc32_hw_available false]
                partial_crc32_one-->>partial_crc32: crctab32-based crc
            end
        end
    end
    partial_crc32-->>Main: final crc

File-Level Changes

Change: Introduce runtime detection and use of arm64 CRC32 hardware instructions in genksyms while preserving the existing software CRC path as a fallback.
Details:
  • Add architecture-specific helpers to detect CRC32 hardware support on aarch64 by testing the HWCAP_CRC32 bit of getauxval(AT_HWCAP), and track the result in a global flag
  • Implement an inline helper that performs a single-byte CRC32 update using the arm64 crc32b instruction when hardware is available
  • Modify the byte-wise CRC routines to branch to the hardware-accelerated path on aarch64 using __builtin_expect hints, otherwise retain the crctab32-based implementation
  • Initialize hardware detection early in main() so all subsequent CRC computations can leverage the hardware path when present
Files: scripts/genksyms/genksyms.c
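The four steps listed above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the runtime-built table stands in for genksyms' static crctab32, crc32_init_table() and crc32_str() are hypothetical names, and the inline assembly is an assumption about how the crc32b helper might look. It relies on the GCC/Clang __builtin_expect builtin, and on non-aarch64 hosts the hardware helper compiles to an unused stub.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_CRC32 */
#define HAVE_ARM64_CRC32 1
#endif

static int crc32_hw_available;   /* set once at startup */
static uint32_t crctab32[256];   /* stand-in for the static table in genksyms */

/* Hypothetical helper: build the reflected CRC-32 table (poly 0xEDB88320). */
static void crc32_init_table(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t c = i;
		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
		crctab32[i] = c;
	}
}

#ifdef HAVE_ARM64_CRC32
/* Ask the kernel whether the CPU implements the CRC32 extension. */
static void crc32_check_hw(void)
{
	crc32_hw_available = (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
}

/* One-byte update with crc32b; same polynomial as the table path. */
static inline uint32_t crc32_hw_byte(unsigned char c, uint32_t crc)
{
	__asm__(".arch_extension crc\n\t"
		"crc32b %w0, %w0, %w1"
		: "+r"(crc)
		: "r"((uint32_t)c));
	return crc;
}
#else
/* Other hosts: never report hardware support; the stub is never reached. */
static void crc32_check_hw(void) { crc32_hw_available = 0; }
static inline uint32_t crc32_hw_byte(unsigned char c, uint32_t crc)
{
	(void)c;
	return crc;
}
#endif

/* Byte-wise update: hardware path if detected, table lookup otherwise. */
static uint32_t partial_crc32_one(unsigned char c, uint32_t crc)
{
	if (__builtin_expect(crc32_hw_available, 0))
		return crc32_hw_byte(c, crc);
	return crctab32[(crc ^ c) & 0xff] ^ (crc >> 8);
}

/* Hypothetical wrapper: CRC-32 of a NUL-terminated string. */
static uint32_t crc32_str(const char *s)
{
	uint32_t crc = 0xffffffffu;

	while (*s)
		crc = partial_crc32_one((unsigned char)*s++, crc);
	return crc ^ 0xffffffffu;
}
```

A caller would run crc32_init_table() and crc32_check_hw() once at startup, then hash with crc32_str(). Both paths should produce identical values, which matters here because the symbol CRCs must not depend on the build host.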


@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high level feedback:

  • The inline assembly uses .arch_extension crc and crc32b directly, which may not be accepted by all aarch64 assemblers/toolchains; consider using compiler intrinsics (e.g., __builtin_aarch64_crc32b) or a feature macro check to improve portability.
  • Introducing a hard dependency on getauxval/HWCAP_CRC32 for the host tool may break builds on non-glibc or older libc environments; it would be safer to guard this with configure-time checks or provide a fallback path when getauxval is unavailable.
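One way to read the first suggestion is sketched below, using the ACLE intrinsic __crc32b from <arm_acle.h> behind the __ARM_FEATURE_CRC32 feature macro instead of hand-written assembly. The names crc32_sw_byte, crc32_update_byte, crc32_buf and crc32_paths_agree are hypothetical, and the bitwise fallback is a stand-in for the existing table lookup. Note that this selects the path at compile time, which differs from the runtime detection in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* ACLE defines __ARM_FEATURE_CRC32 when the compile target includes +crc. */
#if defined(__aarch64__) && defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>
#define CRC32_USE_ACLE 1
#else
#define CRC32_USE_ACLE 0
#endif

/* Portable bitwise update, reflected polynomial 0xEDB88320. */
static uint32_t crc32_sw_byte(uint32_t crc, uint8_t c)
{
	crc ^= c;
	for (int k = 0; k < 8; k++)
		crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
	return crc;
}

/* Use the intrinsic when the toolchain advertises it, else fall back. */
static uint32_t crc32_update_byte(uint32_t crc, uint8_t c)
{
#if CRC32_USE_ACLE
	return __crc32b(crc, c);
#else
	return crc32_sw_byte(crc, c);
#endif
}

/* CRC-32 of a buffer with the usual initial and final inversion. */
static uint32_t crc32_buf(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xffffffffu;

	while (len--)
		crc = crc32_update_byte(crc, *p++);
	return crc ^ 0xffffffffu;
}

/* Self-check: the selected path matches the bitwise reference for every byte. */
static int crc32_paths_agree(void)
{
	for (unsigned int b = 0; b < 256; b++)
		if (crc32_update_byte(0x12345678u, (uint8_t)b) !=
		    crc32_sw_byte(0x12345678u, (uint8_t)b))
			return 0;
	return 1;
}
```

Because the choice is made at build time, this variant needs neither getauxval nor HWCAP_CRC32, which sidesteps the second concern. The trade-off is that a host compiler invoked without a +crc target would always take the software path.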

@Avenger-285714
Member

/approve

It looks like linux-6.6.y needs this too.

BTW, do other architectures have this?

@deepin-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Avenger-285714, dongert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


Copilot AI left a comment

Pull request overview

This PR accelerates genksyms symbol hashing on arm64 hosts by detecting CRC32 CPU support at runtime and using the arm64 crc32b instruction when available, falling back to the existing table-based CRC32 otherwise.

Changes:

  • Add runtime detection for arm64 CRC32 hardware capability via getauxval(AT_HWCAP).
  • Implement a hardware-accelerated CRC32 byte update path using inline asm (crc32b) and use it in partial_crc32[_one].
  • Initialize the hardware capability check early in main().

Comment on lines +124 to +131
#ifdef __aarch64__
#include <sys/auxv.h>
#include <asm/hwcap.h>

static void crc32_check_hw(void)
{
	crc32_hw_available = (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
}
Comment on lines +125 to +131
#include <sys/auxv.h>
#include <asm/hwcap.h>

static void crc32_check_hw(void)
{
	crc32_hw_available = (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
}
maillist inclusion
category: performance

Use the hardware 'crc32b' instruction in genksyms when it is supported;
it is about 2x faster than the crctab32 table lookup.

Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
@opsiff opsiff force-pushed the linux-6.18.y-2026-05-25 branch from 16b048d to 94b44fd Compare May 25, 2026 06:59
@opsiff
Member Author

opsiff commented May 25, 2026

> /approve
>
> It looks like linux-6.6.y needs this too.
>
> BTW, do other architectures have this?

I will take it to 6.6.
LoongArch will have it.

@Avenger-285714
Member

> > /approve
> > It looks like linux-6.6.y needs this too.
> > BTW, do other architectures have this?
>
> I will take it to 6.6. LoongArch will have it.

If this is a multi-arch feature, please consider adding a light framework to keep the code tidy.

@opsiff
Member Author

opsiff commented May 25, 2026

> If this is a multi-arch feature, please consider adding a light framework to keep the code tidy.

It has been included in the code now.

@opsiff
Member Author

opsiff commented May 25, 2026

> > > /approve
> > > It looks like linux-6.6.y needs this too.
> > > BTW, do other architectures have this?
> >
> > I will take it to 6.6. LoongArch will have it.
>
> If this is a multi-arch feature, please consider adding a light framework to keep the code tidy.

For example, just add crc32_check_hw() and crc32_hw_byte() for the LoongArch version.
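A slightly more explicit form of such a per-architecture hook could look like the sketch below. It is only an illustration of the idea discussed here, not code from the patch: struct crc32_backend, crc32_select_backend() and the backend names are hypothetical, and the LoongArch slot is left as a comment because its probe and update code are not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_CRC32 */
#define CRC32_HAVE_ARM64 1
#endif

/* Hypothetical backend descriptor: a probe plus a buffer update routine. */
struct crc32_backend {
	const char *name;
	int (*probe)(void);
	uint32_t (*update)(uint32_t crc, const unsigned char *p, size_t n);
};

/* Generic bitwise path; always available. */
static int crc32_generic_probe(void) { return 1; }

static uint32_t crc32_generic_update(uint32_t crc, const unsigned char *p,
				     size_t n)
{
	while (n--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
	}
	return crc;
}

#ifdef CRC32_HAVE_ARM64
static int crc32_arm64_probe(void)
{
	return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
}

static uint32_t crc32_arm64_update(uint32_t crc, const unsigned char *p,
				   size_t n)
{
	while (n--)
		__asm__(".arch_extension crc\n\t"
			"crc32b %w0, %w0, %w1"
			: "+r"(crc)
			: "r"((uint32_t)*p++));
	return crc;
}
#endif

/* Ordered by preference; the generic entry always matches last. */
static const struct crc32_backend crc32_backends[] = {
#ifdef CRC32_HAVE_ARM64
	{ "arm64-crc32b", crc32_arm64_probe, crc32_arm64_update },
#endif
	/* A LoongArch entry would add its own probe/update pair here. */
	{ "generic", crc32_generic_probe, crc32_generic_update },
};

static const struct crc32_backend *crc32_active;

/* Pick the first backend whose probe succeeds; call once at startup. */
static void crc32_select_backend(void)
{
	size_t n = sizeof(crc32_backends) / sizeof(crc32_backends[0]);

	for (size_t i = 0; i < n; i++) {
		if (crc32_backends[i].probe()) {
			crc32_active = &crc32_backends[i];
			return;
		}
	}
}
```

The update hook takes a whole buffer rather than one byte so that the indirect call is paid once per string. Whether this extra structure is worth it for two architectures is the judgment call being debated in the comments.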

@Avenger-285714
Member

> It has been included in the code now.

#if defined(__aarch64__) xxx; #endif
Is that your framework? ...

@opsiff
Member Author

opsiff commented May 25, 2026

> > It has been included in the code now.
>
> #if defined(__aarch64__) xxx; #endif Is that your framework? ...

It keeps things simple. Otherwise, why not use crc32 from <zlib.h>, which comes with zlib1g-dev and zlib1g?

@Avenger-285714
Member

> It keeps things simple. Otherwise, why not use crc32 from <zlib.h>, which comes with zlib1g-dev and zlib1g?

I'm not sure, maybe it's really better...
