Commit Graph

2 Commits

Author SHA1 Message Date
d-enk 68f827d9bf perf: optimize Substring to work directly with strings instead of converting to runes (#822)
* perf: optimize Substring to work directly with strings instead of converting to runes

- Rewrite Substring to iterate over string bytes directly, avoiding full []rune conversion
- Improve performance for long strings by only processing necessary portions
- Add comprehensive test cases for Unicode handling, invalid UTF-8, and edge cases
- Add BenchmarkSubstring to measure performance improvements
- Improve documentation with detailed parameter descriptions
- Handle invalid UTF-8 sequences by converting to []rune when needed

Bencstat:

                   │    old.txt    │               new.txt               │
                   │    sec/op     │    sec/op     vs base               │
Substring/{10_10}-4    558.85n ±  9%   39.75n ± 10%  -92.89% (p=0.000 n=8)
Substring/{50_50}-4    783.10n ±  6%   85.15n ±  5%  -89.13% (p=0.000 n=8)
Substring/{50_45}-4    773.30n ±  3%   126.5n ±  7%  -83.65% (p=0.000 n=8)
Substring/{-50_50}-4   794.00n ±  2%   177.6n ±  7%  -77.63% (p=0.000 n=8)
Substring/{-10_10}-4   542.85n ± 20%   41.82n ±  6%  -92.30% (p=0.000 n=8)
geomean               680.4n         79.52n        -88.31%

                   │  old.txt   │               new.txt                │
                   │    B/op    │   B/op    vs base                    │
Substring/{10_10}-4    432.0 ± 0%   0.0 ± 0%  -100.00% (p=0.000 n=8)
Substring/{50_50}-4    480.0 ± 0%   0.0 ± 0%  -100.00% (p=0.000 n=8)
Substring/{50_45}-4    464.0 ± 0%   0.0 ± 0%  -100.00% (p=0.000 n=8)
Substring/{-50_50}-4   480.0 ± 0%   0.0 ± 0%  -100.00% (p=0.000 n=8)
Substring/{-10_10}-4   432.0 ± 0%   0.0 ± 0%  -100.00% (p=0.000 n=8)

                   │  old.txt   │                new.txt                 │
                   │ allocs/op  │ allocs/op   vs base                    │
Substring/{10_10}-4    2.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=8)
Substring/{50_50}-4    2.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=8)
Substring/{50_45}-4    2.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=8)
Substring/{-50_50}-4   2.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=8)
Substring/{-10_10}-4   2.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=8)

* Enhance substring documentation with Unicode details

Returns a substring starting at the given offset with the specified length. Supports negative offsets; out-of-bounds are clamped. Operates on Unicode runes (characters) and is optimized for zero allocations.

---------

Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2026-02-27 22:19:20 +01:00
Samuel Berthe d55616266c doc: documentating every helpers of core package 2025-10-06 17:15:49 +02:00