77 changes: 75 additions & 2 deletions .ai-context
@@ -80,9 +80,9 @@ cd build/build && ./utf_strings-bench
### Key Files to Understand
- `CMakeLists.txt`: Main build configuration
- `conanfile.py`: Dependency management
- `include/utf/utf_strings.hpp`: Main library header
- `include/utf/utf_codepoints.hpp`: Main library header
- `include/utf/version.hpp`: Version information
- `src/utf_strings.cpp`: Core implementation
- `src/utf_codepoints.cpp`: Core implementation

### Common Tasks
1. **Adding new UTF conversion functions**: Update both header and implementation
@@ -99,6 +99,79 @@ cd build/build && ./utf_strings-bench
### Platform Support
- **Linux**: GCC 13+, Clang 18+ (primary development)
- **Windows**: MSVC 2022, Clang-CL (CI tested)

### 🚨 MANDATORY CODE REVIEW REQUIREMENTS

**CRITICAL**: All code must undergo comprehensive review before every push to origin using these parameters:

#### **Code Review Parameters**

1. **🔒 Security Analysis** (HIGHEST PRIORITY)
- [ ] Check for undefined behavior (UB) - memory safety violations
- [ ] Validate memory safety - no dangling pointers, use-after-free
- [ ] Look for buffer overflows and underflows
- [ ] Check for integer overflows/underflows
- [ ] Assess input validation (bounds checking, null checks)
- [ ] Verify proper error handling for security-critical paths
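
The bounds-checking and input-validation items above can be illustrated with a minimal sketch of a single-sequence UTF-8 decoder. The names `Decoded` and `decode_utf8_one` are hypothetical and are not taken from the library; the sketch only assumes the well-formedness rules of UTF-8 (no truncated sequences, no overlong forms, no surrogates, nothing above U+10FFFF).

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical result type: the decoded code point and the bytes it consumed.
struct Decoded {
    char32_t code_point;
    std::size_t length;
};

// Decode one UTF-8 sequence from the front of `input`.
// Returns std::nullopt for empty, truncated, or ill-formed input.
[[nodiscard]] inline std::optional<Decoded> decode_utf8_one(std::string_view input) noexcept {
    if (input.empty()) return std::nullopt;

    const auto byte = [&](std::size_t i) { return static_cast<std::uint8_t>(input[i]); };
    const std::uint8_t lead = byte(0);

    std::size_t length = 0;
    char32_t cp = 0;
    char32_t min = 0;  // smallest value allowed for this length (rejects overlong forms)
    if (lead < 0x80)                { length = 1; cp = lead;        min = 0x0; }
    else if ((lead & 0xE0) == 0xC0) { length = 2; cp = lead & 0x1F; min = 0x80; }
    else if ((lead & 0xF0) == 0xE0) { length = 3; cp = lead & 0x0F; min = 0x800; }
    else if ((lead & 0xF8) == 0xF0) { length = 4; cp = lead & 0x07; min = 0x10000; }
    else return std::nullopt;       // stray continuation byte or invalid lead byte

    if (input.size() < length) return std::nullopt;  // bounds check before any further read

    for (std::size_t i = 1; i < length; ++i) {
        const std::uint8_t b = byte(i);
        if ((b & 0xC0) != 0x80) return std::nullopt;  // not a continuation byte
        cp = (cp << 6) | (b & 0x3F);
    }

    if (cp < min) return std::nullopt;                      // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return std::nullopt;  // UTF-16 surrogate
    if (cp > 0x10FFFF) return std::nullopt;                 // beyond the Unicode range
    return Decoded{cp, length};
}
```

Every read of `input` happens only after its index has been checked against `input.size()`, which is the property the buffer-overflow item asks a reviewer to confirm.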

2. **⚡ Performance Analysis**
- [ ] Evaluate algorithmic complexity (Big-O analysis)
- [ ] Check for unnecessary allocations and copies
- [ ] Identify performance bottlenecks
- [ ] Verify move semantics used appropriately
- [ ] Check for redundant temporary objects
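
As one illustration of the allocation and copy items above, the sketch below sizes the output in a first pass and reserves it once before encoding. `utf8_length` and `encode_utf8` are hypothetical names, not the library's API, and the input is assumed to contain only valid Unicode scalar values.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Number of UTF-8 bytes needed for one code point (assumes a valid scalar value).
constexpr std::size_t utf8_length(char32_t cp) noexcept {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Hypothetical encoder: one sizing pass, one allocation, then one append pass.
[[nodiscard]] inline std::string encode_utf8(std::u32string_view input) {
    std::size_t total = 0;
    for (char32_t cp : input) total += utf8_length(cp);

    std::string out;
    out.reserve(total);  // single allocation instead of repeated growth
    for (char32_t cp : input) {
        switch (utf8_length(cp)) {
        case 1:
            out.push_back(static_cast<char>(cp));
            break;
        case 2:
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
            break;
        case 3:
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
            break;
        default:
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
            break;
        }
    }
    return out;  // returned by value: elided or moved, not deep-copied
}
```

Taking a `std::u32string_view` avoids copying the caller's buffer, and the two linear passes keep the overall cost at O(n).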

3. **πŸ› Correctness Issues**
- [ ] Identify bugs and logic errors
- [ ] Check edge cases (empty inputs, max values, boundary conditions)
- [ ] Validate error handling (exceptions, optional returns)
- [ ] Ensure proper initialization of all variables
- [ ] Check const correctness and immutability

4. **🚀 C++ Core Guidelines & Modern C++23**
- [ ] Validate RAII usage (Resource Acquisition Is Initialization)
- [ ] Check exception safety (basic/strong/no-throw guarantee)
- [ ] Verify proper use of the `[[nodiscard]]` attribute and the `noexcept` specifier
- [ ] Check for appropriate use of `std::optional`, `std::variant`
- [ ] Assess template metaprogramming best practices
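
A small sketch of how `[[nodiscard]]`, `noexcept`, and `std::optional` might combine in a UTF helper. `combine_surrogates` is a hypothetical function used only for illustration; it merges a UTF-16 surrogate pair into one code point and reports invalid pairs through an empty optional rather than an exception.

```cpp
#include <optional>

// Hypothetical helper: combine a UTF-16 high/low surrogate pair.
// [[nodiscard]] makes ignoring the result a compiler warning; noexcept documents
// that failure is reported through the return value, never by throwing.
[[nodiscard]] constexpr std::optional<char32_t>
combine_surrogates(char16_t high, char16_t low) noexcept {
    const bool high_ok = high >= 0xD800 && high <= 0xDBFF;
    const bool low_ok  = low  >= 0xDC00 && low  <= 0xDFFF;
    if (!high_ok || !low_ok) return std::nullopt;
    return static_cast<char32_t>(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00));
}
```

Because the function is `constexpr`, its behavior can also be pinned down at compile time with `static_assert`, which gives the reviewer a cheap correctness check.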

5. **πŸ—οΈ Design Issues**
- [ ] Evaluate API design consistency
- [ ] Check naming conventions consistency
- [ ] Assess abstraction levels and encapsulation
- [ ] Review for unnecessary complexity or over-engineering
- [ ] Verify dependency management

6. **📚 Documentation & Testing**
- [ ] Check for adequate inline comments
- [ ] Verify API documentation completeness
- [ ] Ensure comprehensive unit test coverage
- [ ] Check for edge case testing
- [ ] Validate error path testing

#### **Review Output Requirements**

**Categorize ALL findings by severity:**
- 🔴 **Critical** - Must fix before production (security, UB, crashes)
- 🟡 **Important** - Should fix (performance, correctness, maintainability)
- 🟢 **Nice to have** - Optional improvements (style, minor optimizations)

**Provide Score Card:**
Rate each category from A+ to F, then give an overall grade and a production-readiness assessment.

**For Critical Issues:**
- Provide exact code fixes
- Explain the issue clearly
- Show before/after code examples

#### **Key Focus Areas for UTF Strings Library:**
- **Production readiness** - Code must be deployable
- **Security** - Especially UB, overflow, memory safety in UTF processing
- **Performance optimization** - UTF processing efficiency is critical
- **Modern C++23 practices** - Leverage language features appropriately
- **Cross-platform compatibility** - Must work on Linux/Windows/macOS

**Standard**: Every change must have ZERO 🔴 Critical issues before merging.
- **Architectures**: x64 (primary), others untested

### Performance Considerations
63 changes: 59 additions & 4 deletions .copilot-instructions.md
@@ -143,9 +143,64 @@ When suggesting build changes:
- Add appropriate compiler flags for new features
- Update CI workflows if needed

## 🚨 MANDATORY PRE-PUSH CODE REVIEW

**ABSOLUTE REQUIREMENT**: Every code change must undergo comprehensive review using these parameters before ANY push to origin:

### **Security Review (CRITICAL - ZERO TOLERANCE)**
- [ ] **Memory Safety**: No dangling pointers, use-after-free, or memory leaks
- [ ] **Undefined Behavior**: No UB in any code path (especially UTF processing)
- [ ] **Buffer Safety**: No overflows/underflows in byte array operations
- [ ] **Integer Safety**: No overflows in size calculations or conversions
- [ ] **Input Validation**: All external input properly validated and bounds-checked
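
The integer-safety item can be made concrete with a sketch of an overflow-checked size calculation. `max_utf8_bytes` is a hypothetical helper, not part of the library; it assumes the worst case of four UTF-8 bytes per code point and refuses to return a size that would wrap around.

```cpp
#include <cstddef>
#include <limits>
#include <optional>

// Hypothetical helper: worst-case UTF-8 buffer size for `code_points` code points.
// Returns std::nullopt instead of silently wrapping when the product would overflow.
[[nodiscard]] constexpr std::optional<std::size_t>
max_utf8_bytes(std::size_t code_points) noexcept {
    constexpr std::size_t kMaxBytesPerCodePoint = 4;
    if (code_points > std::numeric_limits<std::size_t>::max() / kMaxBytesPerCodePoint) {
        return std::nullopt;  // code_points * 4 would not fit in std::size_t
    }
    return code_points * kMaxBytesPerCodePoint;
}
```

The check divides the maximum value rather than multiplying the input, so the overflow is detected before it can occur.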

### **Performance & Correctness Review**
- [ ] **Algorithm Efficiency**: Optimal Big-O complexity for UTF operations
- [ ] **Memory Efficiency**: No unnecessary allocations or copies
- [ ] **Edge Case Handling**: Empty strings, max values, boundary conditions
- [ ] **Error Handling**: Proper exception safety and error propagation
- [ ] **Move Semantics**: Efficient resource management

### **Modern C++23 Compliance**
- [ ] **Feature Usage**: Appropriate use of concepts, constexpr, std::expected
- [ ] **RAII Compliance**: Proper resource management
- [ ] **API Design**: Consistent with project patterns and std library
- [ ] **Template Design**: Proper constraints and error messages

### **Review Output Format**
**MUST categorize ALL findings:**
- 🔴 **Critical** - BLOCKS merge (security, UB, crashes)
- 🟡 **Important** - Should fix (performance, correctness)
- 🟢 **Optional** - Nice to have (style improvements)

**For 🔴 Critical issues - provide:**
1. Exact description of the problem
2. Security/safety implications
3. Complete fix with before/after code
4. Verification steps

### **UTF Strings Library Specific Focus**
- **UTF Processing Safety**: Validate all UTF sequence handling
- **Endianness Handling**: Correct byte order management
- **Performance Critical**: UTF conversion efficiency
- **Cross-Platform**: Works on Linux/Windows/macOS
- **Production Ready**: Zero critical issues before merge
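
For the endianness item above, one portable approach is to assemble each UTF-16 code unit from individual bytes with an explicit byte order instead of reinterpreting memory. The `ByteOrder` enum and `load_utf16_unit` function below are hypothetical names chosen for this sketch.

```cpp
#include <cstdint>

// Hypothetical byte-order tag for serialized UTF-16 data.
enum class ByteOrder { Big, Little };

// Build one UTF-16 code unit from two bytes in the stated order.
// Shifting individual bytes gives the same result on any host architecture,
// and avoids the alignment and aliasing problems of casting a byte pointer.
[[nodiscard]] constexpr char16_t
load_utf16_unit(std::uint8_t first, std::uint8_t second, ByteOrder order) noexcept {
    return order == ByteOrder::Big
        ? static_cast<char16_t>((first << 8) | second)
        : static_cast<char16_t>((second << 8) | first);
}
```

With this shape, the bytes `0x20 0xAC` read as big-endian and the bytes `0xAC 0x20` read as little-endian both produce U+20AC.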

### **Pre-Push Validation Checklist**
**MUST complete ALL before any `git push`:**
1. ✅ Complete security review (zero 🔴 Critical issues)
2. ✅ All builds pass (`./bootstrap_cmake.sh --compiler clang`)
3. ✅ All tests pass (100% success rate)
4. ✅ Benchmarks run without regression
5. ✅ Code formatting compliant
6. ✅ License headers present on all new files
7. ✅ Comprehensive unit tests for all new code

**Standard**: **ZERO TOLERANCE** for 🔴 Critical security or correctness issues.

## Current Focus Areas
- UTF validation algorithms
- Conversion between UTF-8/16/32
- Performance optimization
- UTF validation algorithms (with security focus)
- Conversion between UTF-8/16/32 (performance critical)
- Memory safety in byte array operations
- Cross-platform compatibility
- Security hardening
- Security hardening and review compliance
2 changes: 1 addition & 1 deletion .github/README.md
@@ -132,7 +132,7 @@ utf_strings-v{version}-{Platform}/
├── utf_strings-bench-release # Optimized benchmark executable
├── utf_strings-bench-debug # Debug benchmark executable
├── include/utf/ # Complete header files
│   ├── utf_strings.hpp
│   ├── utf_codepoints.hpp
│   └── version.hpp
├── *.a/*.lib # Static libraries
├── LICENSE # License file