diff --git a/.ai-context b/.ai-context
index ccbc7f7..800b82a 100644
--- a/.ai-context
+++ b/.ai-context
@@ -80,9 +80,9 @@ cd build/build && ./utf_strings-bench
### Key Files to Understand
- `CMakeLists.txt`: Main build configuration
- `conanfile.py`: Dependency management
-- `include/utf/utf_strings.hpp`: Main library header
+- `include/utf/utf_codepoints.hpp`: Main library header
- `include/utf/version.hpp`: Version information
-- `src/utf_strings.cpp`: Core implementation
+- `src/utf_codepoints.cpp`: Core implementation
### Common Tasks
1. **Adding new UTF conversion functions**: Update both header and implementation
@@ -99,6 +99,79 @@ cd build/build && ./utf_strings-bench
### Platform Support
- **Linux**: GCC 13+, Clang 18+ (primary development)
- **Windows**: MSVC 2022, Clang-CL (CI tested)
+
+### 🚨 MANDATORY CODE REVIEW REQUIREMENTS
+
+**CRITICAL**: All code must undergo comprehensive review before every push to origin using these parameters:
+
+#### **Code Review Parameters**
+
+1. **Security Analysis** (HIGHEST PRIORITY)
+ - [ ] Check for undefined behavior (UB) - memory safety violations
+ - [ ] Validate memory safety - no dangling pointers, use-after-free
+ - [ ] Look for buffer overflows and underflows
+ - [ ] Check for integer overflows/underflows
+ - [ ] Assess input validation (bounds checking, null checks)
+ - [ ] Verify proper error handling for security-critical paths
+
+2. **⚡ Performance Analysis**
+ - [ ] Evaluate algorithmic complexity (Big-O analysis)
+ - [ ] Check for unnecessary allocations and copies
+ - [ ] Identify performance bottlenecks
+ - [ ] Verify move semantics used appropriately
+ - [ ] Check for redundant temporary objects
+
+3. **Correctness Issues**
+ - [ ] Identify bugs and logic errors
+ - [ ] Check edge cases (empty inputs, max values, boundary conditions)
+ - [ ] Validate error handling (exceptions, optional returns)
+ - [ ] Ensure proper initialization of all variables
+ - [ ] Check const correctness and immutability
+
+4. **C++ Core Guidelines & Modern C++23**
+ - [ ] Validate RAII usage (Resource Acquisition Is Initialization)
+ - [ ] Check exception safety (basic/strong/no-throw guarantee)
+ - [ ] Verify proper use of `[[nodiscard]]`, `noexcept` attributes
+ - [ ] Check for appropriate use of `std::optional`, `std::variant`
+ - [ ] Assess template metaprogramming best practices
+
+5. **🏗️ Design Issues**
+ - [ ] Evaluate API design consistency
+ - [ ] Check naming conventions consistency
+ - [ ] Assess abstraction levels and encapsulation
+ - [ ] Review for unnecessary complexity or over-engineering
+ - [ ] Verify dependency management
+
+6. **Documentation & Testing**
+ - [ ] Check for adequate inline comments
+ - [ ] Verify API documentation completeness
+ - [ ] Ensure comprehensive unit test coverage
+ - [ ] Check for edge case testing
+ - [ ] Validate error path testing
+
+#### **Review Output Requirements**
+
+**Categorize ALL findings by severity:**
+- 🔴 **Critical** - Must fix before production (security, UB, crashes)
+- 🟡 **Important** - Should fix (performance, correctness, maintainability)
+- 🟢 **Nice to have** - Optional improvements (style, minor optimizations)
+
+**Provide Score Card:**
+Rate each category (A+ to F) with overall grade and production readiness assessment.
+
+**For Critical Issues:**
+- Provide exact code fixes
+- Explain the issue clearly
+- Show before/after code examples
+
+#### **Key Focus Areas for UTF Strings Library:**
+- **Production readiness** - Code must be deployable
+- **Security** - Especially UB, overflow, memory safety in UTF processing
+- **Performance optimization** - UTF processing efficiency is critical
+- **Modern C++23 practices** - Leverage language features appropriately
+- **Cross-platform compatibility** - Must work on Linux/Windows/macOS
+
+**Standard**: Every change must have ZERO 🔴 Critical issues before merging.
- **Architectures**: x64 (primary), others untested
### Performance Considerations
diff --git a/.copilot-instructions.md b/.copilot-instructions.md
index 0a4c57e..94abf83 100644
--- a/.copilot-instructions.md
+++ b/.copilot-instructions.md
@@ -143,9 +143,64 @@ When suggesting build changes:
- Add appropriate compiler flags for new features
- Update CI workflows if needed
+## 🚨 MANDATORY PRE-PUSH CODE REVIEW
+
+**ABSOLUTE REQUIREMENT**: Every code change must undergo comprehensive review using these parameters before ANY push to origin:
+
+### **Security Review (CRITICAL - ZERO TOLERANCE)**
+- [ ] **Memory Safety**: No dangling pointers, use-after-free, or memory leaks
+- [ ] **Undefined Behavior**: No UB in any code path (especially UTF processing)
+- [ ] **Buffer Safety**: No overflows/underflows in byte array operations
+- [ ] **Integer Safety**: No overflows in size calculations or conversions
+- [ ] **Input Validation**: All external input properly validated and bounds-checked
+
+### **Performance & Correctness Review**
+- [ ] **Algorithm Efficiency**: Optimal Big-O complexity for UTF operations
+- [ ] **Memory Efficiency**: No unnecessary allocations or copies
+- [ ] **Edge Case Handling**: Empty strings, max values, boundary conditions
+- [ ] **Error Handling**: Proper exception safety and error propagation
+- [ ] **Move Semantics**: Efficient resource management
+
+### **Modern C++23 Compliance**
+- [ ] **Feature Usage**: Appropriate use of concepts, constexpr, std::expected
+- [ ] **RAII Compliance**: Proper resource management
+- [ ] **API Design**: Consistent with project patterns and std library
+- [ ] **Template Design**: Proper constraints and error messages
+
+### **Review Output Format**
+**MUST categorize ALL findings:**
+- 🔴 **Critical** - BLOCKS merge (security, UB, crashes)
+- 🟡 **Important** - Should fix (performance, correctness)
+- 🟢 **Optional** - Nice to have (style improvements)
+
+**For 🔴 Critical issues - provide:**
+1. Exact description of the problem
+2. Security/safety implications
+3. Complete fix with before/after code
+4. Verification steps
+
+### **UTF Strings Library Specific Focus**
+- **UTF Processing Safety**: Validate all UTF sequence handling
+- **Endianness Handling**: Correct byte order management
+- **Performance Critical**: UTF conversion efficiency
+- **Cross-Platform**: Works on Linux/Windows/macOS
+- **Production Ready**: Zero critical issues before merge
+
+### **Pre-Push Validation Checklist**
+**MUST complete ALL before any `git push`:**
+1. ✅ Complete security review (zero 🔴 Critical issues)
+2. ✅ All builds pass (`./bootstrap_cmake.sh --compiler clang`)
+3. ✅ All tests pass (100% success rate)
+4. ✅ Benchmarks run without regression
+5. ✅ Code formatting compliant
+6. ✅ License headers present on all new files
+7. ✅ Comprehensive unit tests for all new code
+
+**Standard**: **ZERO TOLERANCE** for 🔴 Critical security or correctness issues.
+
## Current Focus Areas
-- UTF validation algorithms
-- Conversion between UTF-8/16/32
-- Performance optimization
+- UTF validation algorithms (with security focus)
+- Conversion between UTF-8/16/32 (performance critical)
+- Memory safety in byte array operations
- Cross-platform compatibility
-- Security hardening
\ No newline at end of file
+- Security hardening and review compliance
\ No newline at end of file
diff --git a/.github/README.md b/.github/README.md
index 031c164..c2a9f10 100644
--- a/.github/README.md
+++ b/.github/README.md
@@ -132,7 +132,7 @@ utf_strings-v{version}-{Platform}/
├── utf_strings-bench-release # Optimized benchmark executable
├── utf_strings-bench-debug # Debug benchmark executable
├── include/utf/ # Complete header files
-│   ├── utf_strings.hpp
+│   ├── utf_codepoints.hpp
│   └── version.hpp
├── *.a/*.lib # Static libraries
├── LICENSE # License file
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 2e6e682..bec5a96 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -8,6 +8,14 @@ on:
release:
types: [published]
+# Global permissions for the entire workflow
+permissions:
+ contents: read
+ issues: write
+ checks: write
+ pull-requests: write
+ actions: read
+
env:
CONAN_USER_HOME: "${{ github.workspace }}/conan-cache"
@@ -435,6 +443,337 @@ jobs:
build/build/test_results_clang_release.xml
build/build/benchmark_results_clang.json
+ # ============================================================================
+ # Linux x64 - Clang Debug with Code Coverage
+ # ============================================================================
+
+ linux-clang-coverage:
+ name: "Linux Clang Code Coverage (x64)"
+ runs-on: ubuntu-24.04
+ needs: [yaml-validation]
+ if: always() && (needs.yaml-validation.result == 'success' || needs.yaml-validation.result == 'skipped')
+
+ steps:
+ - name: Checkout
+ uses: actions/checkout@v4
+
+ - name: Install system dependencies
+ timeout-minutes: 15
+ run: |
+ set -e # Exit on any error
+
+ # Update package list first
+ sudo apt update
+
+ # Install basic dependencies first (these are fast and reliable)
+ sudo apt install -y cmake python3-pip git wget curl bc
+
+ # Install Clang 18 with timeout and retry logic
+ echo "Installing Clang 18..."
+ for i in {1..3}; do
+ if timeout 10m bash -c "
+ wget -q https://apt.llvm.org/llvm.sh &&
+ chmod +x llvm.sh &&
+ sudo ./llvm.sh 18
+ "; then
+ echo "LLVM installation successful on attempt $i"
+ break
+ else
+ echo "LLVM installation attempt $i failed, retrying..."
+ rm -f llvm.sh
+ if [ $i -eq 3 ]; then
+ echo "All LLVM installation attempts failed"
+ exit 1
+ fi
+ sleep 30
+ fi
+ done
+
+ # Install code coverage tools
+ sudo apt install -y llvm-18 llvm-18-dev clang-format-18 || {
+ echo "Some LLVM tools not available, trying alternatives..."
+ sudo apt install -y llvm llvm-dev clang-format || echo "Some tools installation failed"
+ }
+
+ # Set up symlinks
+ sudo ln -sf /usr/bin/clang-18 /usr/bin/clang
+ sudo ln -sf /usr/bin/clang++-18 /usr/bin/clang++
+ # Link the versioned tools only when they exist; otherwise keep the unversioned binaries
+ for tool in llvm-profdata llvm-cov; do
+ if [ -x "/usr/bin/${tool}-18" ]; then sudo ln -sf "/usr/bin/${tool}-18" "/usr/bin/${tool}"; fi
+ done
+ if command -v clang-format-18 >/dev/null 2>&1; then
+ sudo ln -sf /usr/bin/clang-format-18 /usr/bin/clang-format
+ fi
+
+ # Verify installation
+ clang --version
+ clang++ --version
+ llvm-profdata-18 --version || echo "llvm-profdata not available"
+ llvm-cov-18 --version || echo "llvm-cov not available"
+
+ - name: Setup Python
+ uses: actions/setup-python@v4
+ with:
+ python-version: "3.11"
+
+ - name: Install Conan
+ run: |
+ pip3 install --user conan
+ echo "$HOME/.local/bin" >> $GITHUB_PATH
+
+ - name: Setup Conan profile
+ run: |
+ conan profile detect --force
+ # Use static profile for Clang 18 with C++23 (code coverage)
+ sed -i 's/compiler=.*/compiler=clang/' ~/.conan2/profiles/default
+ sed -i 's/compiler.version=.*/compiler.version=18/' ~/.conan2/profiles/default
+ sed -i 's/compiler.libcxx=.*/compiler.libcxx=libstdc++11/' ~/.conan2/profiles/default
+ sed -i 's/compiler.cppstd=.*/compiler.cppstd=23/' ~/.conan2/profiles/default
+ sed -i 's/os=.*/os=Linux/' ~/.conan2/profiles/default
+ sed -i 's/arch=.*/arch=x86_64/' ~/.conan2/profiles/default
+ echo "=== Conan Profile ==="
+ conan profile show --profile:host=default
+
+ - name: Cache Conan packages
+ uses: actions/cache@v4
+ with:
+ path: ~/.conan2
+ key: conan-linux-clang-coverage-${{ hashFiles('conanfile.py') }}
+ restore-keys: |
+ conan-linux-clang-coverage-
+ conan-linux-clang-
+
+ - name: Install dependencies
+ run: |
+ export CC=clang
+ export CXX=clang++
+ conan install . -s os=Linux -s build_type=Debug --output-folder=build --build=missing -o with_gperftools=True
+
+ - name: Configure CMake with Coverage
+ run: |
+ export CC=clang
+ export CXX=clang++
+ cmake -S . -B build/Coverage \
+ -DCMAKE_TOOLCHAIN_FILE=build/conan_toolchain.cmake \
+ -DCMAKE_BUILD_TYPE=Debug \
+ -DCMAKE_CXX_COMPILER=clang++ \
+ -DCMAKE_C_COMPILER=clang \
+ -DCOMPILER_TYPE=CLANG \
+ -DUSE_LTO=OFF \
+ -DUSE_NATIVE_ARCH=OFF \
+ -DENABLE_SHARED_LIBRARY=ON \
+ -DUTF_STRINGS_BUILD_TESTS=ON \
+ -DUTF_STRINGS_BUILD_BENCHMARKS=OFF \
+ -DUTF_STRINGS_BUILD_FUZZ_TESTS=OFF \
+ -DUTF_STRINGS_WITH_GPERFTOOLS=OFF \
+ -DCMAKE_CXX_FLAGS="--coverage -fprofile-instr-generate -fcoverage-mapping" \
+ -DCMAKE_C_FLAGS="--coverage -fprofile-instr-generate -fcoverage-mapping" \
+ -DCMAKE_EXE_LINKER_FLAGS="--coverage"
+
+ - name: Build with Coverage
+ run: cmake --build build/Coverage --parallel
+
+ - name: Run Tests with Coverage
+ run: |
+ cd build/Coverage
+ # Set up coverage data collection
+ export LLVM_PROFILE_FILE="utf_strings_coverage.profraw"
+
+ # Run the tests
+ ./utf_strings-tests --gtest_output=xml:test_results_coverage.xml
+
+ # Verify coverage data was generated
+ ls -la *.profraw || echo "No .profraw files found"
+
+ - name: Process Coverage Data
+ run: |
+ cd build/Coverage
+
+ # Convert raw profile data to indexed format
+ if [ -f "utf_strings_coverage.profraw" ]; then
+ echo "Converting coverage data..."
+ # Try versioned commands first, fall back to unversioned
+ if command -v llvm-profdata-18 >/dev/null 2>&1; then
+ LLVM_PROFDATA=llvm-profdata-18
+ LLVM_COV=llvm-cov-18
+ else
+ LLVM_PROFDATA=llvm-profdata
+ LLVM_COV=llvm-cov
+ fi
+
+ $LLVM_PROFDATA merge -sparse utf_strings_coverage.profraw -o utf_strings_coverage.profdata
+
+ # Generate coverage report in text format
+ echo "Generating text coverage report..."
+ $LLVM_COV report ./utf_strings-tests \
+ -instr-profile=utf_strings_coverage.profdata \
+ -ignore-filename-regex="(test|gtest|benchmark|conan|build)" \
+ > coverage_report.txt
+
+ # Generate detailed coverage report in HTML format
+ echo "Generating HTML coverage report..."
+ $LLVM_COV show ./utf_strings-tests \
+ -instr-profile=utf_strings_coverage.profdata \
+ -format=html \
+ -output-dir=coverage_html \
+ -ignore-filename-regex="(test|gtest|benchmark|conan|build)" \
+ -show-line-counts-or-regions \
+ -show-instantiations
+
+ # Generate coverage summary in JSON format
+ echo "Generating JSON coverage summary..."
+ $LLVM_COV export ./utf_strings-tests \
+ -instr-profile=utf_strings_coverage.profdata \
+ -format=text \
+ -ignore-filename-regex="(test|gtest|benchmark|conan|build)" \
+ > coverage_summary.json
+
+ # Display coverage summary
+ echo "=== Coverage Summary ==="
+ cat coverage_report.txt
+ echo "========================"
+
+ else
+ echo "❌ No coverage data found!"
+ echo "Available files:"
+ ls -la
+ exit 1
+ fi
+
+ - name: Generate Coverage Badge Data
+ run: |
+ cd build/Coverage
+ if [ -f "coverage_summary.json" ]; then
+ # Extract coverage percentage from JSON report
+ COVERAGE=$(python3 -c "
+ import json
+ import sys
+ try:
+ with open('coverage_summary.json', 'r') as f:
+ data = json.load(f)
+
+ # Extract totals from llvm-cov JSON format
+ totals = data['data'][0]['totals']
+ lines = totals['lines']
+ covered = lines['covered']
+ count = lines['count']
+
+ if count > 0:
+ percentage = round((covered / count) * 100, 1)
+ print(f'{percentage}')
+ else:
+ print('0.0')
+ except Exception as e:
+ print(f'Error: {e}', file=sys.stderr)
+ print('0.0')
+ ")
+
+ echo "COVERAGE_PERCENTAGE=$COVERAGE" >> $GITHUB_ENV
+ echo "Coverage: $COVERAGE%"
+
+ # Create badge color based on coverage percentage
+ if (( $(echo "$COVERAGE >= 90" | bc -l) )); then
+ BADGE_COLOR="brightgreen"
+ elif (( $(echo "$COVERAGE >= 80" | bc -l) )); then
+ BADGE_COLOR="green"
+ elif (( $(echo "$COVERAGE >= 70" | bc -l) )); then
+ BADGE_COLOR="yellowgreen"
+ elif (( $(echo "$COVERAGE >= 60" | bc -l) )); then
+ BADGE_COLOR="yellow"
+ elif (( $(echo "$COVERAGE >= 50" | bc -l) )); then
+ BADGE_COLOR="orange"
+ else
+ BADGE_COLOR="red"
+ fi
+
+ echo "BADGE_COLOR=$BADGE_COLOR" >> $GITHUB_ENV
+
+ # Create a simple coverage summary for artifact
+ cat > coverage_summary.txt << EOF
+ UTF Strings Library - Code Coverage Report
+ ==========================================
+ Generated: $(date)
+ Build: Clang Debug with Coverage
+ Test Suite: All unit tests
+
+ Coverage: $COVERAGE%
+ Badge Color: $BADGE_COLOR
+
+ Detailed reports available in coverage_html/ directory
+ EOF
+
+ else
+ echo "COVERAGE_PERCENTAGE=0.0" >> $GITHUB_ENV
+ echo "BADGE_COLOR=red" >> $GITHUB_ENV
+ fi
+
+ - name: Upload Coverage Reports
+ uses: actions/upload-artifact@v4
+ if: always()
+ with:
+ name: coverage-reports-linux-clang
+ path: |
+ build/Coverage/coverage_report.txt
+ build/Coverage/coverage_summary.json
+ build/Coverage/coverage_summary.txt
+ build/Coverage/coverage_html/
+ build/Coverage/test_results_coverage.xml
+ retention-days: 30
+
+ - name: Upload Coverage to Codecov (Optional)
+ uses: codecov/codecov-action@v3
+ if: always()
+ with:
+ files: build/Coverage/coverage_summary.json
+ flags: unittests
+ name: utf-strings-coverage
+ fail_ci_if_error: false
+ verbose: true
+
+ - name: Comment Coverage on PR
+ if: github.event_name == 'pull_request'
+ uses: actions/github-script@v7
+ with:
+ github-token: ${{ secrets.GITHUB_TOKEN }}
+ script: |
+ const coverage = process.env.COVERAGE_PERCENTAGE;
+ const badgeColor = process.env.BADGE_COLOR;
+
+ const comment = `## Code Coverage Report
+
+ **Coverage:** ${coverage}%
+
+
+
+ Coverage Details
+
+ - **Build:** Clang Debug with Coverage Instrumentation
+ - **Test Suite:** All unit tests (65 tests across 22 test suites)
+ - **Generated:** ${new Date().toISOString()}
+
+ **Artifacts Generated:**
+ - Text report: \`coverage_report.txt\`
+ - HTML report: \`coverage_html/index.html\`
+ - JSON summary: \`coverage_summary.json\`
+ - ✅ Test results: \`test_results_coverage.xml\`
+
+
+
High-Performance UTF String Processing for C++23
+API Documentation
+New to the UTF Strings library? Check out the main documentation for getting started guides, tutorials, and examples.
+ +Generated: $(date)
+ Commit: ${{ github.sha }}
+ Branch: ${{ github.ref_name }}
$(cat doxygen.log)
+
+
+ EOF
+
+ - name: Upload documentation artifacts
+ uses: actions/upload-artifact@v4
+ with:
+ name: api-documentation
+ path: |
+ docs/api/
+ doxygen.log
+ retention-days: 30
+
+ - name: Comment documentation status on PR
+ if: github.event_name == 'pull_request'
+ uses: actions/github-script@v7
+ with:
+ github-token: ${{ secrets.GITHUB_TOKEN }}
+ script: |
+ const warnings = process.env.DOXYGEN_WARNINGS;
+ const errors = process.env.DOXYGEN_ERRORS;
+ const htmlFiles = process.env.HTML_FILES;
+ const headerFiles = process.env.HEADER_FILES;
+ const sourceFiles = process.env.SOURCE_FILES;
+ const classes = process.env.CLASSES;
+
+ const status = errors === '0' ? '✅ Success' : '❌ Failed';
+ // Compare numerically: as strings, '10' < '5' would be true
+ const warningsBadge = warnings === '0' ? '🟢' : Number(warnings) < 5 ? '🟡' : '🔴';
+
+ const comment = `## API Documentation Build Report
+
+ **Status:** ${status}
+
+ ### Statistics
+ - **Header Files:** ${headerFiles}
+ - **Source Files:** ${sourceFiles}
+ - **Classes/Structs:** ${classes}
+ - **Generated Pages:** ${htmlFiles}
+
+ ### Quality Metrics
+ - **Doxygen Warnings:** ${warningsBadge} ${warnings}
+ - **Doxygen Errors:** ${errors === '0' ? '✅' : '❌'} ${errors}
+
+
|
+ UTF Strings Library v1.3.0
+
+ High-performance UTF string processing library for C++23
+ |
+
| Name | Description |
| utf (namespace) | Root namespace for the UTF Strings library |
| utf::encodings (namespace) | UTF encoding type definitions |
| utf::encodings::Utf16 | UTF-16 encoding specification |
| utf::encodings::Utf32 | UTF-32 encoding specification |
| utf::encodings::Utf8 | UTF-8 encoding specification |
| utf::string (namespace) | |
| utf::string::CodePointIterator | Iterator for traversing UTF-encoded strings as code points |
| utf::string::SmallStringBuffer | Small buffer optimization for UTF strings |
| utf::string::String | Owning container for UTF-encoded strings with Small String Optimization |
| utf::string::StringView | Non-owning view of a UTF-encoded string |
| utf::CodePoint | |
| utf::UnicodeScalar | Strong type wrapper for Unicode scalar values |
| utf::version | Version information for the UTF Strings library |
Owning container for UTF-encoded strings with Small String Optimization. + More...
+ +#include <utf_strings.hpp>
+Public Types | |
| using | value_type = CodePoint< UtfType, E > |
| using | size_type = std::size_t |
| using | storage_type = typename UtfType::storage_type |
| using | iterator = CodePointIterator< UtfType, E > |
| using | const_iterator = iterator |
| using | view_type = StringView< UtfType, E > |
+Public Member Functions | |
| String ()=default | |
| Default constructor creates an empty string. | |
| String (view_type view) | |
| Construct from a view. | |
| String (const storage_type *data, size_type length) | |
| Construct from pointer and length. | |
| String (const storage_type *str) | |
| Construct from null-terminated string. | |
| template<typename Traits , typename Allocator > | |
| String (const std::basic_string< storage_type, Traits, Allocator > &str) | |
| Construct from std::basic_string. | |
| String (std::initializer_list< value_type > code_points) | |
| Construct from initializer list of code points. | |
| ValidEndianness< SrcUtfType, SrcEndian > &&!std ::same_as< SrcUtfType, UtfType > Endian SrcEndian ValidEndianness< SrcUtfType, SrcEndian > &&!std ::same_as< SrcUtfType, UtfType > Endian SrcEndian ValidEndianness< SrcUtfType, SrcEndian > bool | try_assign_from (const String< SrcUtfType, SrcEndian > &other) |
+Public Attributes | |
| template<typename SrcUtfType , Endian SrcEndian> | |
| ValidEndianness< SrcUtfType, SrcEndian > &&!std ::same_as< SrcUtfType, UtfType > | SrcEndian |
| Converting constructor from different encoding. | |
| ValidEndianness< SrcUtfType, SrcEndian > &&!std ::same_as< SrcUtfType, UtfType > Endian SrcEndian ValidEndianness< SrcUtfType, SrcEndian > &&!std ::same_as< SrcUtfType, UtfType > | SrcEndian |
Owning container for UTF-encoded strings with Small String Optimization.
+| UtfType | The UTF encoding type (Utf8, Utf16, or Utf32) |
| E | The endianness (Endian::None for UTF-8, BE or LE for UTF-16/32) |
Total size is 32 bytes. Inline capacities:
| using utf::string::String< UtfType, E >::const_iterator = iterator | +
| using utf::string::String< UtfType, E >::iterator = CodePointIterator<UtfType, E> | +
| using utf::string::String< UtfType, E >::size_type = std::size_t | +
| using utf::string::String< UtfType, E >::storage_type = typename UtfType::storage_type | +
| using utf::string::String< UtfType, E >::view_type = StringView<UtfType, E> | +
| Specifiers | Description |
| default | Default constructor creates an empty string. |
| inline | Construct from a view. |
| inline | Construct from pointer and length. |
| inline, explicit | Construct from null-terminated string. |
| inline, explicit | Construct from std::basic_string. |
| inline | Construct from initializer list of code points. |
+
|
+ +inline | +
| ValidEndianness<SrcUtfType, SrcEndian>&& !std ::same_as<SrcUtfType, UtfType> utf::string::String< UtfType, E >::SrcEndian | +
Converting constructor from different encoding.
+| SrcUtfType | Source UTF encoding type |
| SrcEndian | Source endianness |
| other | String in different encoding to convert from |
| std::invalid_argument | if source contains invalid code points |
| ValidEndianness<SrcUtfType, SrcEndian>&& !std ::same_as<SrcUtfType, UtfType> Endian SrcEndian ValidEndianness<SrcUtfType, SrcEndian>&& !std ::same_as<SrcUtfType, UtfType> utf::string::String< UtfType, E >::SrcEndian | +
#include "../include/utf/utf_strings.hpp"+Namespaces | |
| namespace | utf |
| Root namespace for the UTF Strings library. | |
| namespace | utf::string |
UTF-16 encoding specification. + More...
+ +#include <utf_codepoints.hpp>
+Public Types | |
| using | storage_type = uint16_t |
+Static Public Attributes | |
| static std::size_t | unit_size = 2 |
| static std::size_t | max_units = 2 |
UTF-16 encoding specification.
+| using utf::encodings::Utf16::storage_type = uint16_t | +
+
|
+ +static | +
+
|
+ +static | +
Unicode-related constants and limits. +More...
+Unicode-related constants and limits.
This is the complete list of members for utf::encodings::Utf32, including all inherited members.
+| max_units | utf::encodings::Utf32 | static |
| storage_type typedef | utf::encodings::Utf32 | |
| unit_size | utf::encodings::Utf32 | static |
This is the complete list of members for utf::string::CodePointIterator< UtfType, E >, including all inherited members.
Go to the source code of this file.
++Macros | |
| #define | UTF_STRINGS_API |
| #define UTF_STRINGS_API | +
This is the complete list of members for utf::string::StringView< UtfType, E >, including all inherited members.
Endianness-related types and constants. +More...
++Enumerations | |
| enum class | Type { None +, BE +, LE + } |
| Byte order specification. More... | |
+Variables | |
| Type | none = Type::None |
| Convenience alias for byte-oriented encoding. | |
| Type | big_endian = Type::BE |
| Convenience alias for big endian. | |
| Type | little_endian = Type::LE |
| Convenience alias for little endian. | |
| Type | network_byte_order = Type::BE |
| Convenience alias for network byte order (same as big endian) | |
Endianness-related types and constants.
+
+
|
+ +strong | +
Convenience alias for big endian.
+ +Convenience alias for little endian.
+ +Convenience alias for network byte order (same as big endian)
+ +
+
|
+ +inline | +
Convenience alias for byte-oriented encoding.
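The difference between the `BE` and `LE` byte orders listed above can be seen with a small, language-neutral sketch. It uses Python's built-in codecs rather than this library, and the helper name `encode_utf16` is hypothetical, chosen only for illustration.

```python
# Illustration only: uses Python's standard codecs, not the UTF Strings library.
# encode_utf16 is a hypothetical helper name.

def encode_utf16(text: str, byte_order: str) -> bytes:
    """Encode text as UTF-16 without a BOM in the requested byte order."""
    codecs = {"BE": "utf-16-be", "LE": "utf-16-le"}
    return text.encode(codecs[byte_order])


euro = "\u20ac"  # EURO SIGN, a single UTF-16 code unit 0x20AC
print(encode_utf16(euro, "BE").hex())  # 20ac (big endian / network byte order)
print(encode_utf16(euro, "LE").hex())  # ac20 (little endian)
```

The same code unit produces mirrored byte sequences, which is why UTF-16 and UTF-32 carry an explicit endianness while byte-oriented UTF-8 does not need one.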
#include <algorithm>#include <compare>#include <cstring>#include <iterator>#include <limits>#include <memory>#include <ranges>#include <stdexcept>#include <string>#include <string_view>#include <vector>#include "utf_codepoints.hpp"Go to the source code of this file.
++Classes | |
| class | utf::string::CodePointIterator< UtfType, E > |
| Iterator for traversing UTF-encoded strings as code points. More... | |
| class | utf::string::StringView< UtfType, E > |
| Non-owning view of a UTF-encoded string. More... | |
| class | utf::string::SmallStringBuffer< StorageType > |
| Small buffer optimization for UTF strings. More... | |
| class | utf::string::String< UtfType, E > |
| Owning container for UTF-encoded strings with Small String Optimization. More... | |
+Namespaces | |
| namespace | utf |
| Root namespace for the UTF Strings library. | |
| namespace | utf::string |
+Macros | |
| #define | UTF_STRING_HPP |
| #define | UTF_STRING_VERSION_MAJOR 1 |
| #define | UTF_STRING_VERSION_MINOR 3 |
| #define | UTF_STRING_VERSION_PATCH 0 |
+Typedefs | |
| using | utf::string::Utf8StringView = StringView< Utf8, Endian::None > |
| using | utf::string::Utf16BEStringView = StringView< Utf16, Endian::BE > |
| using | utf::string::Utf16LEStringView = StringView< Utf16, Endian::LE > |
| using | utf::string::Utf32BEStringView = StringView< Utf32, Endian::BE > |
| using | utf::string::Utf32LEStringView = StringView< Utf32, Endian::LE > |
| using | utf::string::Utf8String = String< Utf8, Endian::None > |
| using | utf::string::Utf16BEString = String< Utf16, Endian::BE > |
| using | utf::string::Utf16LEString = String< Utf16, Endian::LE > |
| using | utf::string::Utf32BEString = String< Utf32, Endian::BE > |
| using | utf::string::Utf32LEString = String< Utf32, Endian::LE > |
| #define UTF_STRING_HPP | +
| #define UTF_STRING_VERSION_MAJOR 1 | +
| #define UTF_STRING_VERSION_MINOR 3 | +
| #define UTF_STRING_VERSION_PATCH 0 | +
Small buffer optimization for UTF strings. + More...
+ +#include <utf_strings.hpp>
+Public Member Functions | |
| SmallStringBuffer () | |
| ~SmallStringBuffer () | |
| SmallStringBuffer (const SmallStringBuffer &other) | |
| SmallStringBuffer (SmallStringBuffer &&other) | |
| SmallStringBuffer & | operator= (const SmallStringBuffer &other) |
| SmallStringBuffer & | operator= (SmallStringBuffer &&other) |
| const StorageType * | data () const |
| StorageType * | data () |
| std::size_t | size () const |
| std::size_t | capacity () const |
| bool | is_inline () const |
| void | clear () |
| void | reserve (std::size_t new_capacity) |
| void | push_back (StorageType value) |
| void | append (const StorageType *src, std::size_t count) |
| void | swap (SmallStringBuffer &other) |
+Static Public Attributes | |
| static std::size_t | total_size = 32 |
| static std::size_t | metadata_size |
| static std::size_t | inline_capacity = (total_size - metadata_size) / sizeof(StorageType) |
Small buffer optimization for UTF strings.
+Strings up to 32 bytes total (including metadata) are stored inline on the stack. Actual data capacity is 32 - 2*sizeof(size_t) bytes.
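As a worked example of the capacity formula above, the following sketch computes `(total_size - metadata_size) / sizeof(StorageType)` for a few code-unit widths. It assumes that `metadata_size` equals `2 * sizeof(size_t)` (as the description implies), that `size_t` is 8 bytes (a typical 64-bit target), and that UTF-32 uses a 4-byte storage type; none of these values is confirmed by the listing itself, and the Python function is only a stand-in for the C++ constants.

```python
# Arithmetic sketch of the documented inline-capacity formula.
# Assumptions (not confirmed by the listing): metadata is 2 * sizeof(size_t),
# size_t is 8 bytes, and UTF-32 storage is 4 bytes wide.

def inline_capacity(storage_size: int, size_t_size: int = 8, total_size: int = 32) -> int:
    """Number of code units that fit inline for a given code-unit width."""
    metadata_size = 2 * size_t_size
    return (total_size - metadata_size) // storage_size


for name, width in (("uint8_t", 1), ("uint16_t", 2), ("uint32_t", 4)):
    print(name, inline_capacity(width))
# uint8_t 16
# uint16_t 8
# uint32_t 4
```

On a target with a 4-byte `size_t`, the same formula would leave 24 bytes of inline data instead of 16.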
Each of the 16 member function entries in the detailed documentation is marked inline.
| StorageType* utf::string::SmallStringBuffer< StorageType >::heap_data_ | +
+
|
+ +static | +
| StorageType utf::string::SmallStringBuffer< StorageType >::inline_data_[inline_capacity] | +
+
|
+ +static | +
+
|
+ +static | +
#include "../include/utf/utf_codepoints.hpp"+Namespaces | |
| namespace | utf |
| Root namespace for the UTF Strings library. | |
UTF-8 encoding specification. + More...
+ +#include <utf_codepoints.hpp>
+Public Types | |
| using | storage_type = uint8_t |
+Static Public Attributes | |
| static std::size_t | unit_size = 1 |
| static std::size_t | max_units = 4 |
UTF-8 encoding specification.
+| using utf::encodings::Utf8::storage_type = uint8_t | +
+
|
+ +static | +
+
|
+ +static | +
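The `unit_size` and `max_units` values documented for `Utf8` (1 byte, up to 4 units) and `Utf16` (2 bytes, up to 2 units) can be checked against real encodings. The sketch below uses Python's standard codecs rather than this library; `code_units` and `UNIT_SIZE` are hypothetical names used only for illustration.

```python
# Illustration only: counts code units per code point with Python's codecs.
# UNIT_SIZE and code_units are hypothetical names, not library API.

UNIT_SIZE = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}  # bytes per code unit


def code_units(ch: str, codec: str) -> int:
    """Return how many code units the given character needs in the codec."""
    return len(ch.encode(codec)) // UNIT_SIZE[codec]


for ch in ("A", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}", [code_units(ch, codec) for codec in UNIT_SIZE])
# U+0041 [1, 1, 1]
# U+20AC [3, 1, 1]
# U+1F600 [4, 2, 1]
```

A code point outside the Basic Multilingual Plane, such as U+1F600, reaches the documented maximum in both UTF-8 (4 units) and UTF-16 (a surrogate pair, 2 units).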
This is the complete list of members for utf::version, including all inherited members.
+| major | utf::version | static |
| minor | utf::version | static |
| number() | utf::version | inlinestatic |
| patch | utf::version | static |
| string() | utf::version | inlinestatic |
This is the complete list of members for utf::string::String< UtfType, E >, including all inherited members.
This is the complete list of members for utf::encodings::Utf8, including all inherited members.
+| max_units | utf::encodings::Utf8 | static |
| storage_type typedef | utf::encodings::Utf8 | |
| unit_size | utf::encodings::Utf8 | static |
UTF Strings library version information.
++Classes | |
| struct | utf::version |
| Version information for the UTF Strings library. More... | |
+Namespaces | |
| namespace | utf |
| Root namespace for the UTF Strings library. | |
+Macros | |
| #define | UTF_STRINGS_VERSION_HPP |
| #define | UTF_STRINGS_VERSION_MAJOR 0 |
| #define | UTF_STRINGS_VERSION_MINOR 0 |
| #define | UTF_STRINGS_VERSION_PATCH 2 |
| #define | UTF_STRINGS_VERSION_STRING "0.0.2" |
| #define | UTF_STRINGS_VERSION_NUMBER 2 |
UTF Strings library version information.
+| #define UTF_STRINGS_VERSION_HPP | +
| #define UTF_STRINGS_VERSION_MAJOR 0 | +
| #define UTF_STRINGS_VERSION_MINOR 0 | +
| #define UTF_STRINGS_VERSION_NUMBER 2 | +
| #define UTF_STRINGS_VERSION_PATCH 2 | +
| #define UTF_STRINGS_VERSION_STRING "0.0.2" | +
Root namespace for the UTF Strings library. +More...
++Namespaces | |
| namespace | encodings |
| UTF encoding type definitions. | |
| namespace | endianness |
| Endianness-related types and constants. | |
| namespace | limits |
| Unicode-related constants and limits. | |
| namespace | string |
+Classes | |
| struct | CodePoint |
| struct | UnicodeScalar |
| Strong type wrapper for Unicode scalar values. More... | |
| struct | version |
| Version information for the UTF Strings library. More... | |
+Typedefs | |
| using | Endian = endianness::Type |
| using | Utf8 = encodings::Utf8 |
| using | Utf16 = encodings::Utf16 |
| using | Utf32 = encodings::Utf32 |
+Enumerations | |
| enum class | ErrorCode { invalid_scalar, overlong_encoding, invalid_surrogate, out_of_range, truncated_sequence } |
| Error codes for UTF operations. More... | |
+Functions | |
| const char * | get_version () |
| Get the library version as a string. | |
| int | get_version_number () |
| Get the library version as an integer. | |
| bool | version_at_least (int major, int minor=0, int patch=0) |
| Check if the library version is at least the specified version. | |
| CodePoint ()=default | |
| Default constructor creates a null character (U+0000) | |
| CodePoint (uint32_t unicode_scalar) | |
| Construct from a Unicode scalar value. | |
| static std::optional< CodePoint > | from_scalar (uint32_t scalar) |
| Factory function for safe construction. | |
| std::size_t | count () const |
| Get the number of UTF-8 code units (bytes) | |
| std::span< const uint8_t > | units () const |
| Get a span view of the valid UTF-8 bytes. | |
| const uint8_t * | data () const |
| Get direct pointer to the UTF-8 data. | |
| std::optional< uint32_t > | to_scalar () const |
| Decode to Unicode scalar value. | |
| uint32_t | to_scalar_unchecked () const |
| Decode to Unicode scalar value without validation. | |
| bool | is_valid () const |
| Check if this represents a valid UTF-8 encoded code point. | |
| std::size_t | size () const |
| Get the size in bytes. | |
| bool | operator== (uint32_t scalar) const |
| Compare with a Unicode scalar value. | |
| auto | operator<=> (const CodePoint &) const =default |
| Three-way comparison operator. | |
| void | swap (CodePoint &a, CodePoint &b) |
| Swap two code points. | |
+Variables | |
| template<typename UtfType > | |
| ByteOriented = std::same_as<UtfType, Utf8> | |
| Concept for byte-oriented UTF encodings (UTF-8) | |
| template<typename UtfType > | |
| MultiByteOriented = std::same_as<UtfType, Utf16> || std::same_as<UtfType, Utf32> | |
| Concept for multi-byte UTF encodings (UTF-16, UTF-32) | |
| template<typename UtfType , Endian E> | |
| ValidEndianness | |
| Concept validating endianness for a given encoding. | |
| template<typename T > | |
| IsCodePoint | |
| Concept to check if a type is a valid CodePoint instantiation. | |
| template<Endian E> | |
| ByteOriented< Utf8 > && | E |
| UTF-8 code point representation. | |
| static Endian | endianness = E |
| std::array< uint8_t, 4 > | rune {} |
| UTF-8 encoded bytes. | |
Root namespace for the UTF Strings library.
+This namespace contains all UTF-related functionality including:
All library functionality is accessed through this namespace or its nested namespaces (such as utf::string for string types).
| using utf::Endian = endianness::Type |
| using utf::Utf16 = encodings::Utf16 |
| using utf::Utf32 = encodings::Utf32 |
| using utf::Utf8 = encodings::Utf8 |
+
|
+ +strong | +
Error codes for UTF operations.
+
+
|
+ +default | +
Default constructor creates a null character (U+0000)
+
+
|
+ +explicit | +
Construct from a Unicode scalar value.
+| unicode_scalar | The Unicode code point to encode (U+0000 to U+10FFFF) |
| std::size_t utf::count() const |
Get the number of code units, depending on the specialization: the number of UTF-8 code units (bytes), the number of UTF-16 code units, or the number of UTF-32 code units (always 1).
data() const: get a direct pointer to the UTF-8, UTF-16, or UTF-32 data, depending on the specialization.
+
+
|
+ +static | +
Factory function for safe construction.
+| scalar | The Unicode code point to encode |
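As a rough illustration of what a validating factory such as `from_scalar` might do for UTF-8, the sketch below encodes a scalar into at most four bytes and returns `std::nullopt` for surrogates and for values above U+10FFFF. The names `Utf8Bytes` and `encode_utf8` are hypothetical and are not part of the library; the library's actual implementation may differ.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical result type: the encoded bytes plus how many of them are used.
struct Utf8Bytes {
    std::array<std::uint8_t, 4> bytes{};
    std::size_t count = 0;
};

// Encode a Unicode scalar value as UTF-8, or return std::nullopt when the
// input is a surrogate (U+D800..U+DFFF) or lies above U+10FFFF.
constexpr std::optional<Utf8Bytes> encode_utf8(std::uint32_t scalar) {
    if (scalar > 0x10FFFF || (scalar >= 0xD800 && scalar <= 0xDFFF)) {
        return std::nullopt;
    }
    Utf8Bytes out{};
    if (scalar < 0x80) {            // 1 byte: 0xxxxxxx
        out.bytes[0] = static_cast<std::uint8_t>(scalar);
        out.count = 1;
    } else if (scalar < 0x800) {    // 2 bytes: 110xxxxx 10xxxxxx
        out.bytes[0] = static_cast<std::uint8_t>(0xC0 | (scalar >> 6));
        out.bytes[1] = static_cast<std::uint8_t>(0x80 | (scalar & 0x3F));
        out.count = 2;
    } else if (scalar < 0x10000) {  // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.bytes[0] = static_cast<std::uint8_t>(0xE0 | (scalar >> 12));
        out.bytes[1] = static_cast<std::uint8_t>(0x80 | ((scalar >> 6) & 0x3F));
        out.bytes[2] = static_cast<std::uint8_t>(0x80 | (scalar & 0x3F));
        out.count = 3;
    } else {                        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.bytes[0] = static_cast<std::uint8_t>(0xF0 | (scalar >> 18));
        out.bytes[1] = static_cast<std::uint8_t>(0x80 | ((scalar >> 12) & 0x3F));
        out.bytes[2] = static_cast<std::uint8_t>(0x80 | ((scalar >> 6) & 0x3F));
        out.bytes[3] = static_cast<std::uint8_t>(0x80 | (scalar & 0x3F));
        out.count = 4;
    }
    return out;
}
```

Returning `std::optional` keeps the failure path explicit at the call site, which matches the "safe construction" wording above; the explicit constructor remains available when the caller already knows the value is valid.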
Get the library version as a string.
+
+
|
+ +inline | +
Get the library version as an integer.
| bool utf::is_valid() const |
Check if this represents a valid encoded code point, depending on the specialization: a valid UTF-8 encoded code point, a valid UTF-16 encoded code point, or (for UTF-32) a valid Unicode code point.
+
|
+ +default | +
Three-way comparison operator.
| bool utf::operator==(uint32_t scalar) const |
Compare with a Unicode scalar value.
+| scalar | The scalar value to compare with |
| std::size_t utf::size() const |
Get the size in bytes.
+Get the size in bytes (always 4)
+Swap two code points.
| std::optional< uint32_t > utf::to_scalar() const |
Decode to Unicode scalar value.
| uint32_t utf::to_scalar_unchecked() const |
Decode to Unicode scalar value without validation.
+Get a span view of the valid UTF-8 bytes.
+Get a span view of the single UTF-32 unit.
+Get a span view of the valid UTF-16 units.
+
+
|
+ +inline | +
Check if the library version is at least the specified version.
+| major | Major version number |
| minor | Minor version number |
| patch | Patch version number |
| utf::ByteOriented = std::same_as<UtfType, Utf8> | +
Concept for byte-oriented UTF encodings (UTF-8)
+ +| MultiByteOriented< Utf32 > utf::E | +
UTF-8 code point representation.
+UTF-32 code point representation.
+UTF-16 code point representation.
+Stores a single Unicode code point encoded as UTF-8 (1-4 bytes). Total size: exactly 4 bytes (optimal alignment).
+UTF-8 is byte-oriented so endianness does not apply. Length is computed on-demand from the leading byte pattern.
+| E | Endianness (must be BE or LE, not None) |
Stores a single Unicode code point encoded as UTF-16 (1-2 units). Total size: exactly 4 bytes (optimal alignment). Handles both BMP characters (single unit) and supplementary characters (surrogate pairs). Length is computed on-demand from surrogate detection.
+| E | Endianness (must be BE or LE, not None) |
Stores a single Unicode code point as a single UTF-32 unit. Total size: exactly 4 bytes (optimal alignment). This is the simplest encoding where one unit always equals one code point.
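The "length computed on demand" idea in the descriptions above can be sketched as follows. This is a minimal illustration based on the standard UTF-8 lead-byte patterns and the UTF-16 high-surrogate range, not the library's own code; both function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes in a UTF-8 sequence, judged from its first byte alone.
// Returns 0 for bytes that cannot start a sequence (continuation bytes
// 0x80..0xBF and the lead values 0xF8..0xFF). This checks the bit pattern
// only; it does not reject overlong leads such as 0xC0 or 0xC1.
constexpr std::size_t utf8_length_from_lead(std::uint8_t lead) {
    if ((lead & 0x80) == 0x00) return 1;  // 0xxxxxxx
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Number of 16-bit units in a UTF-16 sequence, judged from its first unit:
// a high surrogate (0xD800..0xDBFF) announces a two-unit surrogate pair,
// anything else is treated as a single unit.
constexpr std::size_t utf16_length_from_first_unit(std::uint16_t unit) {
    return (unit >= 0xD800 && unit <= 0xDBFF) ? 2 : 1;
}
```

Deriving the length this way means no separate length field has to be stored, which is consistent with the fixed 4-byte footprint described for each representation.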
+Concept to check if a type is a valid CodePoint instantiation.
+ +Concept for multi-byte UTF encodings (UTF-16, UTF-32)
+ +| uint32_t utf::rune {} | +
UTF-8 encoded bytes.
+The UTF-32 encoded unit (stored in target endianness)
+UTF-16 encoded units (stored in target endianness)
+ +Concept validating endianness for a given encoding.
+UTF-8 must use Endian::None, UTF-16/32 must use BE or LE
Iterator for traversing UTF-encoded strings as code points. + More...
+ +#include <utf_strings.hpp>
+Public Types | |
| using | iterator_category = std::forward_iterator_tag |
| using | value_type = CodePoint< UtfType, E > |
| using | difference_type = std::ptrdiff_t |
| using | pointer = const value_type * |
| using | reference = value_type |
+Public Member Functions | |
| CodePointIterator ()=default | |
| CodePointIterator (const typename UtfType::storage_type *ptr, const typename UtfType::storage_type *end) | |
| reference | operator* () const |
| pointer | operator-> () const |
| CodePointIterator & | operator++ () |
| CodePointIterator | operator++ (int) |
| bool | operator== (const CodePointIterator &other) const |
| bool | operator!= (const CodePointIterator &other) const |
| const UtfType::storage_type * | position () const |
| Get the current position in the underlying buffer. | |
Iterator for traversing UTF-encoded strings as code points.
+| UtfType | The UTF encoding type (Utf8, Utf16, or Utf32) |
| E | The endianness (Endian::None for UTF-8, BE or LE for UTF-16/32) |
| using utf::string::CodePointIterator< UtfType, E >::difference_type = std::ptrdiff_t | +
| using utf::string::CodePointIterator< UtfType, E >::iterator_category = std::forward_iterator_tag | +
| using utf::string::CodePointIterator< UtfType, E >::pointer = const value_type* | +
| using utf::string::CodePointIterator< UtfType, E >::reference = value_type | +
| using utf::string::CodePointIterator< UtfType, E >::value_type = CodePoint<UtfType, E> | +
+
|
+ +inline | +
Get the current position in the underlying buffer.
#include <array>
#include <bit>
#include <concepts>
#include <cstdint>
#include <optional>
#include <span>
++Classes | |
| struct | utf::encodings::Utf8 |
| UTF-8 encoding specification. More... | |
| struct | utf::encodings::Utf16 |
| UTF-16 encoding specification. More... | |
| struct | utf::encodings::Utf32 |
| UTF-32 encoding specification. More... | |
| struct | utf::UnicodeScalar |
| Strong type wrapper for Unicode scalar values. More... | |
+Namespaces | |
| namespace | utf |
| Root namespace for the UTF Strings library. | |
| namespace | utf::limits |
| Unicode-related constants and limits. | |
| namespace | utf::endianness |
| Endianness-related types and constants. | |
| namespace | utf::encodings |
| UTF encoding type definitions. | |
+Macros | |
| #define | UTF_CODEPOINT_HPP |
| #define | UTF_CODEPOINT_VERSION_MAJOR 0 |
| #define | UTF_CODEPOINT_VERSION_MINOR 0 |
| #define | UTF_CODEPOINT_VERSION_PATCH 2 |
+Typedefs | |
| using | utf::Endian = endianness::Type |
| using | utf::Utf8 = encodings::Utf8 |
| using | utf::Utf16 = encodings::Utf16 |
| using | utf::Utf32 = encodings::Utf32 |
+Enumerations | |
| enum class | utf::ErrorCode { utf::invalid_scalar, utf::overlong_encoding, utf::invalid_surrogate, utf::out_of_range, utf::truncated_sequence } |
| Error codes for UTF operations. More... | |
| enum class | utf::endianness::Type { utf::endianness::None, utf::endianness::BE, utf::endianness::LE } |
| Byte order specification. More... | |
+Functions | |
| utf::CodePoint ()=default | |
| Default constructor creates a null character (U+0000) | |
| utf::CodePoint (uint32_t unicode_scalar) | |
| Construct from a Unicode scalar value. | |
| static std::optional< CodePoint > | utf::from_scalar (uint32_t scalar) |
| Factory function for safe construction. | |
| std::size_t | utf::count () const |
| Get the number of UTF-8 code units (bytes) | |
| std::span< const uint8_t > | utf::units () const |
| Get a span view of the valid UTF-8 bytes. | |
| const uint8_t * | utf::data () const |
| Get direct pointer to the UTF-8 data. | |
| std::optional< uint32_t > | utf::to_scalar () const |
| Decode to Unicode scalar value. | |
| uint32_t | utf::to_scalar_unchecked () const |
| Decode to Unicode scalar value without validation. | |
| bool | utf::is_valid () const |
| Check if this represents a valid UTF-8 encoded code point. | |
| std::size_t | utf::size () const |
| Get the size in bytes. | |
| bool | utf::operator== (uint32_t scalar) const |
| Compare with a Unicode scalar value. | |
| auto | utf::operator<=> (const CodePoint &) const =default |
| Three-way comparison operator. | |
| void | utf::swap (CodePoint &a, CodePoint &b) |
| Swap two code points. | |
+Variables | |
| Type | utf::endianness::none = Type::None |
| Convenience alias for byte-oriented encoding. | |
| Type | utf::endianness::big_endian = Type::BE |
| Convenience alias for big endian. | |
| Type | utf::endianness::little_endian = Type::LE |
| Convenience alias for little endian. | |
| Type | utf::endianness::network_byte_order = Type::BE |
| Convenience alias for network byte order (same as big endian) | |
| template<typename UtfType > | |
| utf::ByteOriented = std::same_as<UtfType, Utf8> | |
| Concept for byte-oriented UTF encodings (UTF-8) | |
| template<typename UtfType > | |
| utf::MultiByteOriented = std::same_as<UtfType, Utf16> || std::same_as<UtfType, Utf32> | |
| Concept for multi-byte UTF encodings (UTF-16, UTF-32) | |
| template<typename UtfType , Endian E> | |
| utf::ValidEndianness | |
| Concept validating endianness for a given encoding. | |
| template<typename T > | |
| utf::IsCodePoint | |
| Concept to check if a type is a valid CodePoint instantiation. | |
| template<Endian E> | |
| ByteOriented< Utf8 > && | utf::E |
| UTF-8 code point representation. | |
| static Endian | utf::endianness = E |
| std::array< uint8_t, 4 > | utf::rune {} |
| UTF-8 encoded bytes. | |
| #define UTF_CODEPOINT_HPP | +
| #define UTF_CODEPOINT_VERSION_MAJOR 0 | +
| #define UTF_CODEPOINT_VERSION_MINOR 0 | +
| #define UTF_CODEPOINT_VERSION_PATCH 2 | +
+Classes | |
| class | CodePointIterator |
| Iterator for traversing UTF-encoded strings as code points. More... | |
| class | SmallStringBuffer |
| Small buffer optimization for UTF strings. More... | |
| class | String |
| Owning container for UTF-encoded strings with Small String Optimization. More... | |
| class | StringView |
| Non-owning view of a UTF-encoded string. More... | |
+Typedefs | |
| using | Utf8StringView = StringView< Utf8, Endian::None > |
| using | Utf16BEStringView = StringView< Utf16, Endian::BE > |
| using | Utf16LEStringView = StringView< Utf16, Endian::LE > |
| using | Utf32BEStringView = StringView< Utf32, Endian::BE > |
| using | Utf32LEStringView = StringView< Utf32, Endian::LE > |
| using | Utf8String = String< Utf8, Endian::None > |
| using | Utf16BEString = String< Utf16, Endian::BE > |
| using | Utf16LEString = String< Utf16, Endian::LE > |
| using | Utf32BEString = String< Utf32, Endian::BE > |
| using | Utf32LEString = String< Utf32, Endian::LE > |
+Functions | |
| template<typename UtfType , Endian E> | |
| ValidEndianness< UtfType, E > String< UtfType, E > | operator+ (const String< UtfType, E > &lhs, const String< UtfType, E > &rhs) |
| Concatenate two strings of the same encoding. | |
| template<typename UtfType , Endian E> | |
| ValidEndianness< UtfType, E > String< UtfType, E > | operator+ (const String< UtfType, E > &lhs, StringView< UtfType, E > rhs) |
| Concatenate string with string view. | |
| template<typename UtfType , Endian E> | |
| ValidEndianness< UtfType, E > String< UtfType, E > | operator+ (StringView< UtfType, E > lhs, const String< UtfType, E > &rhs) |
| Concatenate string view with string. | |
| template<typename UtfType , Endian E> | |
| ValidEndianness< UtfType, E > String< UtfType, E > | operator+ (const String< UtfType, E > &lhs, const CodePoint< UtfType, E > &rhs) |
| Concatenate string with code point. | |
| template<typename UtfType , Endian E> | |
| ValidEndianness< UtfType, E > String< UtfType, E > | operator+ (const CodePoint< UtfType, E > &lhs, const String< UtfType, E > &rhs) |
| Concatenate code point with string. | |
| template<typename DestString , typename SrcUtfType , Endian SrcEndian> | |
| ValidEndianness< SrcUtfType, SrcEndian > std::optional< DestString > | convert_string (StringView< SrcUtfType, SrcEndian > source) |
| Convert a UTF string to a different encoding. | |
| template<typename DestString , typename SrcUtfType , Endian SrcEndian> | |
| ValidEndianness< SrcUtfType, SrcEndian > DestString | convert_string_unchecked (StringView< SrcUtfType, SrcEndian > source) |
| Convert a UTF string without validation (fast path) | |
| template<typename SrcUtfType , Endian SrcEndian> | |
| std::optional< Utf8String > | to_utf8_string (StringView< SrcUtfType, SrcEndian > source) |
| Convert any UTF string to UTF-8. | |
| template<typename SrcUtfType , Endian SrcEndian> | |
| std::optional< Utf16BEString > | to_utf16_be_string (StringView< SrcUtfType, SrcEndian > source) |
| Convert any UTF string to UTF-16 BE. | |
| template<typename SrcUtfType , Endian SrcEndian> | |
| std::optional< Utf16LEString > | to_utf16_le_string (StringView< SrcUtfType, SrcEndian > source) |
| Convert any UTF string to UTF-16 LE. | |
| template<typename SrcUtfType , Endian SrcEndian> | |
| std::optional< Utf32BEString > | to_utf32_be_string (StringView< SrcUtfType, SrcEndian > source) |
| Convert any UTF string to UTF-32 BE. | |
| template<typename SrcUtfType , Endian SrcEndian> | |
| std::optional< Utf32LEString > | to_utf32_le_string (StringView< SrcUtfType, SrcEndian > source) |
| Convert any UTF string to UTF-32 LE. | |
| std::optional< Utf8String > | utf8_string_from_bytes (const uint8_t *bytes, size_t byte_count) |
| Create UTF-8 string from byte array. | |
| std::optional< Utf8String > | utf8_string_from_bytes (const std::vector< uint8_t > &bytes) |
| Create UTF-8 string from byte vector. | |
| std::optional< Utf16BEString > | utf16_be_string_from_bytes (const uint8_t *bytes, size_t byte_count) |
| Create UTF-16 BE string from byte array. | |
| std::optional< Utf16BEString > | utf16_be_string_from_bytes (const std::vector< uint8_t > &bytes) |
| Create UTF-16 BE string from byte vector. | |
| std::optional< Utf16LEString > | utf16_le_string_from_bytes (const uint8_t *bytes, size_t byte_count) |
| Create UTF-16 LE string from byte array. | |
| std::optional< Utf16LEString > | utf16_le_string_from_bytes (const std::vector< uint8_t > &bytes) |
| Create UTF-16 LE string from byte vector. | |
| std::optional< Utf32BEString > | utf32_be_string_from_bytes (const uint8_t *bytes, size_t byte_count) |
| Create UTF-32 BE string from byte array. | |
| std::optional< Utf32BEString > | utf32_be_string_from_bytes (const std::vector< uint8_t > &bytes) |
| Create UTF-32 BE string from byte vector. | |
| std::optional< Utf32LEString > | utf32_le_string_from_bytes (const uint8_t *bytes, size_t byte_count) |
| Create UTF-32 LE string from byte array. | |
| std::optional< Utf32LEString > | utf32_le_string_from_bytes (const std::vector< uint8_t > &bytes) |
| Create UTF-32 LE string from byte vector. | |
| using utf::string::Utf16BEString = String<Utf16, Endian::BE> |
| using utf::string::Utf16BEStringView = StringView<Utf16, Endian::BE> |
| using utf::string::Utf16LEString = String<Utf16, Endian::LE> |
| using utf::string::Utf16LEStringView = StringView<Utf16, Endian::LE> |
| using utf::string::Utf32BEString = String<Utf32, Endian::BE> |
| using utf::string::Utf32BEStringView = StringView<Utf32, Endian::BE> |
| using utf::string::Utf32LEString = String<Utf32, Endian::LE> |
| using utf::string::Utf32LEStringView = StringView<Utf32, Endian::LE> |
| using utf::string::Utf8String = String<Utf8, Endian::None> |
| using utf::string::Utf8StringView = StringView<Utf8, Endian::None> |
| ValidEndianness< SrcUtfType, SrcEndian > std::optional< DestString > utf::string::convert_string(StringView< SrcUtfType, SrcEndian > source) |
Convert a UTF string to a different encoding.
+| DestString | The destination string type |
| SrcUtfType | The source UTF encoding type |
| SrcEndian | The source endianness |
| source | The source string view |
| ValidEndianness< SrcUtfType, SrcEndian > DestString utf::string::convert_string_unchecked(StringView< SrcUtfType, SrcEndian > source) |
Convert a UTF string without validation (fast path)
| ValidEndianness< UtfType, E > String< UtfType, E > utf::string::operator+(const CodePoint< UtfType, E > & lhs, const String< UtfType, E > & rhs) |
Concatenate code point with string.
| ValidEndianness< UtfType, E > String< UtfType, E > utf::string::operator+(const String< UtfType, E > & lhs, const CodePoint< UtfType, E > & rhs) |
Concatenate string with code point.
| ValidEndianness< UtfType, E > String< UtfType, E > utf::string::operator+(const String< UtfType, E > & lhs, const String< UtfType, E > & rhs) |
Concatenate two strings of the same encoding.
| ValidEndianness< UtfType, E > String< UtfType, E > utf::string::operator+(const String< UtfType, E > & lhs, StringView< UtfType, E > rhs) |
Concatenate string with string view.
| ValidEndianness< UtfType, E > String< UtfType, E > utf::string::operator+(StringView< UtfType, E > lhs, const String< UtfType, E > & rhs) |
Concatenate string view with string.
| std::optional< Utf16BEString > utf::string::to_utf16_be_string(StringView< SrcUtfType, SrcEndian > source) |
Convert any UTF string to UTF-16 BE.
| std::optional< Utf16LEString > utf::string::to_utf16_le_string(StringView< SrcUtfType, SrcEndian > source) |
Convert any UTF string to UTF-16 LE.
| std::optional< Utf32BEString > utf::string::to_utf32_be_string(StringView< SrcUtfType, SrcEndian > source) |
Convert any UTF string to UTF-32 BE.
| std::optional< Utf32LEString > utf::string::to_utf32_le_string(StringView< SrcUtfType, SrcEndian > source) |
Convert any UTF string to UTF-32 LE.
| std::optional< Utf8String > utf::string::to_utf8_string(StringView< SrcUtfType, SrcEndian > source) |
Convert any UTF string to UTF-8.
+ +
+
|
+ +inline | +
Create UTF-16 BE string from byte vector.
+| bytes | Vector containing UTF-16 BE encoded bytes |
+
|
+ +inline | +
Create UTF-16 BE string from byte array.
+| bytes | Pointer to UTF-16 BE encoded bytes |
| byte_count | Number of bytes (must be even) |
+
|
+ +inline | +
Create UTF-16 LE string from byte vector.
+| bytes | Vector containing UTF-16 LE encoded bytes |
+
|
+ +inline | +
Create UTF-16 LE string from byte array.
+| bytes | Pointer to UTF-16 LE encoded bytes |
| byte_count | Number of bytes (must be even) |
+
|
+ +inline | +
Create UTF-32 BE string from byte vector.
+| bytes | Vector containing UTF-32 BE encoded bytes |
+
|
+ +inline | +
Create UTF-32 BE string from byte array.
+| bytes | Pointer to UTF-32 BE encoded bytes |
| byte_count | Number of bytes (must be multiple of 4) |
+
|
+ +inline | +
Create UTF-32 LE string from byte vector.
+| bytes | Vector containing UTF-32 LE encoded bytes |
+
|
+ +inline | +
Create UTF-32 LE string from byte array.
+| bytes | Pointer to UTF-32 LE encoded bytes |
| byte_count | Number of bytes (must be multiple of 4) |
+
|
+ +inline | +
Create UTF-8 string from byte vector.
+| bytes | Vector containing UTF-8 encoded bytes |
+
|
+ +inline | +
Create UTF-8 string from byte array.
+| bytes | Pointer to UTF-8 encoded bytes |
| byte_count | Number of bytes |
Central UTF Strings library header - main API entry point. +More...
#include "utf/export.hpp"
#include "utf/utf_codepoints.hpp"
#include "utf/utf_strings.hpp"
#include "utf/version.hpp"
++Namespaces | |
| namespace | utf |
| Root namespace for the UTF Strings library. | |
+Macros | |
| #define | UTF_HPP |
| #define | UTF_VERSION_MAJOR 0 |
| #define | UTF_VERSION_MINOR 0 |
| #define | UTF_VERSION_PATCH 2 |
| #define | UTF_VERSION_STRING "0.0.2" |
| #define | UTF_VERSION_NUMBER 2 |
| #define | UTF_VERSION_AT_LEAST(major, minor, patch) (UTF_VERSION_NUMBER >= ((major) * 10000 + (minor) * 100 + (patch))) |
+Functions | |
| const char * | utf::get_version () |
| Get the library version as a string. | |
| int | utf::get_version_number () |
| Get the library version as an integer. | |
| bool | utf::version_at_least (int major, int minor=0, int patch=0) |
| Check if the library version is at least the specified version. | |
Central UTF Strings library header - main API entry point.
+This is the primary header for the UTF Strings library. It provides a unified namespace and includes all necessary components for working with UTF-8, UTF-16, and UTF-32 strings with explicit endianness control.
+Features:
Requirements:
Example Usage:
| #define UTF_HPP | +
| #define UTF_VERSION_AT_LEAST(major, minor, patch) (UTF_VERSION_NUMBER >= ((major) * 10000 + (minor) * 100 + (patch))) |
| #define UTF_VERSION_MAJOR 0 | +
| #define UTF_VERSION_MINOR 0 | +
| #define UTF_VERSION_NUMBER 2 | +
| #define UTF_VERSION_PATCH 2 | +
| #define UTF_VERSION_STRING "0.0.2" | +
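The packing used by `UTF_VERSION_NUMBER` and `UTF_VERSION_AT_LEAST` can be worked through with a small constexpr sketch. The helper names `version_number` and `at_least` are made up for this example; only the `major * 10000 + minor * 100 + patch` formula comes from the macro definitions above. Note that the scheme only orders versions correctly while the minor and patch components stay below 100.

```cpp
// Pack a version triple into one integer: MAJOR*10000 + MINOR*100 + PATCH.
constexpr int version_number(int major, int minor, int patch) {
    return major * 10000 + minor * 100 + patch;
}

// Hypothetical counterpart of UTF_VERSION_AT_LEAST for an arbitrary packed
// "current" version, so the comparison can be tried with different values.
constexpr bool at_least(int current, int major, int minor = 0, int patch = 0) {
    return current >= version_number(major, minor, patch);
}

static_assert(version_number(0, 0, 2) == 2);      // "0.0.2" packs to 2
static_assert(version_number(1, 3, 0) == 10300);  // 1.3.0 would pack to 10300
static_assert(at_least(version_number(0, 0, 2), 0, 0, 1));
static_assert(!at_least(version_number(0, 0, 2), 0, 1));
```

Because the packed value is a plain integer, the same comparison works both in preprocessor conditionals (through the macro) and in ordinary code (through a function such as `utf::version_at_least`).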
UTF-32 encoding specification. + More...
+ +#include <utf_codepoints.hpp>
+Public Types | |
| using | storage_type = uint32_t |
+Static Public Attributes | |
| static std::size_t | unit_size = 4 |
| static std::size_t | max_units = 1 |
UTF-32 encoding specification.
+| using utf::encodings::Utf32::storage_type = uint32_t | +
Version information for the UTF Strings library. + More...
+ +#include <version.hpp>
+Static Public Member Functions | |
| static const char * | string () |
| Get version as string in format "major.minor.patch". | |
| static int | number () |
| Get version as integer in format MAJOR*10000 + MINOR*100 + PATCH. | |
+Static Public Attributes | |
| static int | major = 0 |
| static int | minor = 0 |
| static int | patch = 2 |
Version information for the UTF Strings library.
+Get version as integer in format MAJOR*10000 + MINOR*100 + PATCH.
+ +Get version as string in format "major.minor.patch".
This is the complete list of members for utf::string::SmallStringBuffer< StorageType >, including all inherited members.
This is the complete list of members for utf::encodings::Utf16, including all inherited members.
+| max_units | utf::encodings::Utf16 | static |
| storage_type typedef | utf::encodings::Utf16 | |
| unit_size | utf::encodings::Utf16 | static |
This is the complete list of members for utf::UnicodeScalar, including all inherited members.
+| is_valid() const | utf::UnicodeScalar | inline |
| operator uint32_t() const | utf::UnicodeScalar | inline |
| UnicodeScalar(uint32_t v) | utf::UnicodeScalar | inlineexplicit |
| value | utf::UnicodeScalar |
Non-owning view of a UTF-encoded string. + More...
+ +#include <utf_strings.hpp>
+Public Types | |
| using | value_type = CodePoint< UtfType, E > |
| using | size_type = std::size_t |
| using | storage_type = typename UtfType::storage_type |
| using | iterator = CodePointIterator< UtfType, E > |
| using | const_iterator = iterator |
| using | string_type = String< UtfType, E > |
+Public Member Functions | |
| StringView ()=default | |
| Default constructor creates an empty view. | |
| StringView (const storage_type *data, size_type length) | |
| Construct from pointer and length (in storage units) | |
| StringView (const storage_type *data) | |
| Construct from null-terminated string. | |
| template<typename Traits , typename Allocator > | |
| StringView (const std::basic_string< storage_type, Traits, Allocator > &str) | |
| Construct from std::basic_string. | |
| template<typename Traits > | |
| StringView (std::basic_string_view< storage_type, Traits > sv) | |
| Construct from std::basic_string_view. | |
| const storage_type * | data () const |
| Get pointer to the underlying data. | |
| size_type | length () const |
| Get the length in storage units (not code points!) | |
| size_type | size () const |
| Get the size in storage units (alias for length()) | |
| size_type | size_bytes () const |
| Get the size in bytes. | |
| bool | empty () const |
| Check if the view is empty. | |
| iterator | begin () const |
| Get iterator to the beginning. | |
| iterator | end () const |
| Get iterator to the end. | |
| size_type | count_code_points () const |
| Count the number of code points in the string. | |
| bool | is_valid () const |
| Validate the entire string. | |
| std::basic_string_view< storage_type > | to_std_string_view () const |
| Convert to std::basic_string_view. | |
| StringView | substr (size_type pos, size_type count=std::string_view::npos) const |
| Create a substring view. | |
| bool | operator== (const StringView &other) const |
| Equality comparison. | |
| std::strong_ordering | operator<=> (const StringView &other) const |
| Three-way comparison. | |
Non-owning view of a UTF-encoded string.
+| UtfType | The UTF encoding type (Utf8, Utf16, or Utf32) |
| E | The endianness (Endian::None for UTF-8, BE or LE for UTF-16/32) |
| using utf::string::StringView< UtfType, E >::const_iterator = iterator | +
| using utf::string::StringView< UtfType, E >::iterator = CodePointIterator<UtfType, E> | +
| using utf::string::StringView< UtfType, E >::size_type = std::size_t | +
| using utf::string::StringView< UtfType, E >::storage_type = typename UtfType::storage_type | +
| Specifiers | Description |
|---|---|
| default | Default constructor creates an empty view. |
| inline | Construct from pointer and length (in storage units). |
| inline, explicit | Construct from null-terminated string. |
| inline | Construct from std::basic_string. |
| inline | Construct from std::basic_string_view. |
| inline | Get iterator to the beginning. |
| inline | Count the number of code points in the string. |
| inline | Get pointer to the underlying data. |
| inline | Check if the view is empty. |
| inline | Get iterator to the end. |
| inline | Validate the entire string. |
| inline | Get the length in storage units (not code points!). |
| inline | Three-way comparison. |
| inline | Equality comparison. |
| inline | Get the size in storage units (alias for length()). |
| inline | Get the size in bytes. |
| inline | Create a substring view. |
| inline | Convert to std::basic_string_view. |
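The distinction between length in storage units and the number of code points can be illustrated with a small standalone sketch for UTF-8. The helper name `count_utf8_code_points_sketch` is hypothetical, and the sketch assumes the input is already valid UTF-8; it is not the library's `count_code_points()` implementation, which may validate or decode differently.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper for illustration only (not part of the library).
// Counts UTF-8 code points by counting every byte that is NOT a
// continuation byte (bit pattern 10xxxxxx). Assumes valid UTF-8 input.
constexpr std::size_t count_utf8_code_points_sketch(std::string_view s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++n;  // lead byte or ASCII byte starts a new code point
        }
    }
    return n;
}
```

With the input `"caf\xC3\xA9"` ("café"), `std::string_view::size()` reports 5 storage units while the helper reports 4 code points, because "é" occupies two bytes in UTF-8.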
UTF Strings Library v1.3.0

High-performance UTF string processing library for C++23
Strong type wrapper for Unicode scalar values. Provides type safety to distinguish Unicode scalars from raw integers.

#include <utf_codepoints.hpp>

Public Member Functions

| Return type | Member | Description |
|---|---|---|
| | UnicodeScalar (uint32_t v) | Construct from a raw integer value. |
| bool | is_valid () const | Check if this represents a valid Unicode scalar value. |
| | operator uint32_t () const | Implicit conversion to uint32_t. |

Public Attributes

| Type | Name | Description |
|---|---|---|
| uint32_t | value | The Unicode scalar value. |
| Member | Specifiers | Description |
|---|---|---|
| UnicodeScalar (uint32_t v) | inline, explicit | Construct from a raw integer value. |
| is_valid () const | inline | Check if this represents a valid Unicode scalar value. |
| operator uint32_t () const | inline | Implicit conversion to uint32_t. |
| uint32_t utf::UnicodeScalar::value | | The Unicode scalar value. |
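As an illustration of what a validity check on a scalar wrapper typically involves, the sketch below follows the Unicode definition of a scalar value: any code point up to U+10FFFF that is not a surrogate (U+D800 to U+DFFF). The type `ScalarSketch` is a hypothetical stand-in that mirrors the documented shape (explicit constructor, `is_valid()`, implicit conversion, public `value`); the library's actual `UnicodeScalar` may be implemented differently.

```cpp
#include <cstdint>

// Hypothetical stand-in for illustration only; not the library's type.
struct ScalarSketch {
    std::uint32_t value;

    // Explicit, so a raw integer never silently becomes a scalar.
    constexpr explicit ScalarSketch(std::uint32_t v) : value(v) {}

    // A Unicode scalar value lies in [0, 0x10FFFF] and outside the
    // surrogate range [0xD800, 0xDFFF].
    constexpr bool is_valid() const {
        return value <= 0x10FFFFu && (value < 0xD800u || value > 0xDFFFu);
    }

    // Implicit conversion back to the raw integer.
    constexpr operator std::uint32_t() const { return value; }
};
```

The explicit constructor paired with an implicit conversion operator means raw integers must be wrapped deliberately, while a wrapped scalar can still be passed wherever a `uint32_t` is expected.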
| File in include | Includes file in include/utf |
|---|---|
| utf.hpp | export.hpp |
| utf.hpp | utf_codepoints.hpp |
| utf.hpp | utf_strings.hpp |
| utf.hpp | version.hpp |
| File in src | Includes file in include |
|---|---|
| utf_codepoints.cpp | utf / utf_codepoints.hpp |
| utf_strings.cpp | utf / utf_strings.hpp |
Directories

| Directory | Description |
|---|---|
| dev | |
| performance | |
Files

| File | Description |
|---|---|
| utf_codepoints.cpp | |
| utf_strings.cpp | |
Files

| File | Description |
|---|---|
| export.hpp | |
| utf_codepoints.hpp | |
| utf_strings.hpp | |
| version.hpp | UTF Strings library version information. |
| Path | Description |
|---|---|
| docs/ | |
| docs/dev/ | |
| docs/dev/bench/ | |
| docs/performance/ | |
| include/ | |
| include/utf/ | |
| include/utf/export.hpp | |
| include/utf/utf_codepoints.hpp | |
| include/utf/utf_strings.hpp | |
| include/utf/version.hpp | UTF Strings library version information |
| include/utf.hpp | Central UTF Strings library header - main API entry point |
| src/ | |
| src/utf_codepoints.cpp | |
| src/utf_strings.cpp | |
Modern C++23 UTF utilities (UTF-8/16/32) with explicit endianness policy and comprehensive testing.
Cross-Platform Support:
Key Features:
Integrated Security Scanning:
Linux/macOS:
Windows (cmd/PowerShell):
That's it! The bootstrap script will:
If you prefer manual control:
The build system supports fine-grained compiler control through external flags:

Available Configuration Flags:

- `COMPILER_TYPE`: GCC|CLANG|MSVC - Explicit compiler identification
- `USE_LTO`: ON|OFF - Link Time Optimization
- `USE_NATIVE_ARCH`: ON|OFF - Native CPU optimization (-march=native)
- `USE_MSVC_LTO`: ON|OFF - MSVC-specific LTO flags (/LTCG, /GL)
- `USE_LIBC_PLUS_PLUS`: ON|OFF - Use libc++ instead of libstdc++ (Clang only)
- `ENABLE_SHARED_LIBRARY`: ON|OFF - Build shared libraries

This project includes comprehensive context files to help AI assistants (ChatGPT, Claude, GitHub Copilot) provide accurate assistance:

- `.ai-context` - Complete project overview, standards, and guidelines
- `.copilot-instructions.md` - GitHub Copilot-specific coding patterns and conventions

When working with AI assistants on this project:

The AI context files ensure consistent, project-appropriate assistance across all AI platforms!
This project maintains high standards for production-ready C++ code:
Every change must meet:
See CONTRIBUTING.md for complete development guidelines.