d-enk
68f827d9bf
perf: optimize Substring to work directly with strings instead of converting to runes (#822)
...
* perf: optimize Substring to work directly with strings instead of converting to runes
- Rewrite Substring to iterate over string bytes directly, avoiding full []rune conversion
- Improve performance for long strings by only processing necessary portions
- Add comprehensive test cases for Unicode handling, invalid UTF-8, and edge cases
- Add BenchmarkSubstring to measure performance improvements
- Improve documentation with detailed parameter descriptions
- Handle invalid UTF-8 sequences by converting to []rune when needed
Benchstat:
                      │   old.txt    │               new.txt               │
                      │    sec/op    │    sec/op     vs base               │
Substring/{10_10}-4     558.85n ±  9%   39.75n ± 10%  -92.89% (p=0.000 n=8)
Substring/{50_50}-4     783.10n ±  6%   85.15n ±  5%  -89.13% (p=0.000 n=8)
Substring/{50_45}-4     773.30n ±  3%   126.5n ±  7%  -83.65% (p=0.000 n=8)
Substring/{-50_50}-4    794.00n ±  2%   177.6n ±  7%  -77.63% (p=0.000 n=8)
Substring/{-10_10}-4    542.85n ± 20%   41.82n ±  6%  -92.30% (p=0.000 n=8)
geomean                 680.4n          79.52n        -88.31%

                      │   old.txt    │               new.txt               │
                      │     B/op     │     B/op      vs base               │
Substring/{10_10}-4      432.0 ± 0%       0.0 ± 0%   -100.00% (p=0.000 n=8)
Substring/{50_50}-4      480.0 ± 0%       0.0 ± 0%   -100.00% (p=0.000 n=8)
Substring/{50_45}-4      464.0 ± 0%       0.0 ± 0%   -100.00% (p=0.000 n=8)
Substring/{-50_50}-4     480.0 ± 0%       0.0 ± 0%   -100.00% (p=0.000 n=8)
Substring/{-10_10}-4     432.0 ± 0%       0.0 ± 0%   -100.00% (p=0.000 n=8)

                      │   old.txt    │               new.txt               │
                      │  allocs/op   │  allocs/op    vs base               │
Substring/{10_10}-4      2.000 ± 0%     0.000 ± 0%   -100.00% (p=0.000 n=8)
Substring/{50_50}-4      2.000 ± 0%     0.000 ± 0%   -100.00% (p=0.000 n=8)
Substring/{50_45}-4      2.000 ± 0%     0.000 ± 0%   -100.00% (p=0.000 n=8)
Substring/{-50_50}-4     2.000 ± 0%     0.000 ± 0%   -100.00% (p=0.000 n=8)
Substring/{-10_10}-4     2.000 ± 0%     0.000 ± 0%   -100.00% (p=0.000 n=8)
* Enhance substring documentation with Unicode details
Returns a substring starting at the given offset with the specified length. Negative offsets are supported, and out-of-bounds values are clamped. Operates on Unicode runes (characters) and is optimized for zero allocations.
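
The behavior described above can be illustrated with a minimal sketch. This is not the library's implementation: `substringSketch` is a hypothetical name, negative offsets are assumed to count back from the end of the string, and invalid UTF-8 bytes are assumed to count as one rune each (which is how Go's `range` treats them). The point is only that walking the string with `range` and slicing by byte index avoids the `[]rune` conversion.

```go
package main

import "unicode/utf8"

// substringSketch is a hypothetical helper, not the library's Substring.
// offset and length are measured in runes, not bytes.
func substringSketch(s string, offset int, length uint) string {
	if offset < 0 {
		// Assumption: a negative offset counts back from the end,
		// and is clamped to the start of the string.
		offset += utf8.RuneCountInString(s)
		if offset < 0 {
			offset = 0
		}
	}
	if length == 0 {
		return ""
	}

	start, end := -1, len(s)
	seen := 0 // number of runes visited before the current one
	for i := range s {
		if seen == offset {
			start = i // byte index of the first rune to keep
		}
		if start >= 0 && uint(seen-offset) == length {
			end = i // byte index just past the last rune to keep
			break
		}
		seen++
	}
	if start < 0 {
		return "" // offset lies past the last rune
	}
	// Slicing a string shares the underlying bytes and does not allocate.
	return s[start:end]
}
```

Under these assumptions, `substringSketch("hello", 1, 3)` yields `"ell"` and `substringSketch("héllo", -2, 5)` yields `"lo"`; the loop stops as soon as the end of the requested window is found, so only the necessary portion of a long string is scanned for non-negative offsets.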
---------
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2026-02-27 22:19:20 +01:00
Samuel Berthe
a602a36075
test: adding missing test cases to ellipsis (#809)
2026-02-21 22:56:05 +01:00
Samuel Berthe
7f2504a902
💄
2026-02-21 19:32:45 +01:00
Varun Chawla
0b4623da1e
fix: make Ellipsis operate on runes instead of bytes to prevent Unicode truncation (#796)
...
* fix: make Ellipsis operate on runes instead of bytes to prevent Unicode truncation
The Ellipsis function previously used byte-based length counting (len(str))
and byte-based slicing (str[:length-3]), which could split multi-byte
Unicode characters in the middle, producing garbled output.
This changes the function to use []rune conversion so the length parameter
counts Unicode code points instead of bytes. Emoji, CJK ideographs, and
other multi-byte characters are now never split in the middle.
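
The difference between the two approaches can be shown with a small sketch. The helper names below (`byteTruncate`, `runeTruncate`, `splitsCharacter`) are hypothetical and exist only for illustration; they are not part of the library.

```go
package main

import "unicode/utf8"

// byteTruncate keeps the first n bytes of s, as str[:n] does.
// It can cut through the middle of a multi-byte character.
func byteTruncate(s string, n int) string {
	if n < 0 {
		n = 0
	}
	if n >= len(s) {
		return s
	}
	return s[:n]
}

// runeTruncate keeps the first n code points of s via a []rune conversion,
// so a character is either kept whole or dropped whole.
func runeTruncate(s string, n int) string {
	if n < 0 {
		n = 0
	}
	rs := []rune(s)
	if n >= len(rs) {
		return s
	}
	return string(rs[:n])
}

// splitsCharacter reports whether cutting a valid string after n bytes
// leaves an invalid UTF-8 sequence, i.e. a character was split.
func splitsCharacter(s string, n int) bool {
	return utf8.ValidString(s) && !utf8.ValidString(byteTruncate(s, n))
}
```

For example, `"héllo"` is six bytes long because `é` takes two bytes, so `byteTruncate("héllo", 2)` ends in half of `é`, while `runeTruncate("héllo", 2)` returns `"hé"`.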
Fixes #520
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: avoid rune slice allocation in Ellipsis
Use range-based iteration to count runes without allocating a []rune
slice, per reviewer suggestion. The early-return for length < 3 is
kept explicit for clarity.
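
A rough sketch of range-based counting, under stated assumptions: `ellipsisSketch` and its parameter `maxRunes` are hypothetical names, the whitespace trimming done by the real function is left out, and returning the bare marker when `maxRunes` is smaller than the marker is an assumption rather than confirmed behavior.

```go
package main

const ellipsis = "..."

// ellipsisSketch is a hypothetical helper, not the library's Ellipsis.
// It limits s to maxRunes code points, replacing the tail with "..."
// when s is longer, and never builds a []rune slice.
func ellipsisSketch(s string, maxRunes int) string {
	keep := maxRunes - len(ellipsis) // runes kept in front of the marker
	cut := -1                        // byte index where the kept prefix ends
	count := 0                       // runes seen so far

	for i := range s {
		if count == keep {
			cut = i
		}
		count++
		if count > maxRunes {
			if cut < 0 {
				// Assumption: no room for any prefix, return the marker alone.
				return ellipsis
			}
			return s[:cut] + ellipsis
		}
	}
	return s // s already fits within maxRunes
}
```

With this sketch, `ellipsisSketch("hello world", 8)` gives `"hello..."` and `ellipsisSketch("日本語テキスト", 5)` gives `"日本..."`. The only allocation is the concatenated result; the loop also stops as soon as the limit is exceeded, so a long input is not scanned to the end.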
* Simplify Ellipsis: remove early return for length < 3, reuse ellipsis const
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 19:29:10 +01:00
Samuel Berthe
fedd0b6d2d
doc: explain chunkstring inconsistency (#789)
...
* doc: explain chunkstring inconsistency
* doc: explain chunkstring inconsistency
2026-01-27 18:53:04 +01:00
d-enk
123d5c2531
refactor: remove some redundant checks (#771)
2026-01-12 20:42:12 +01:00
Nathan Baulch
43cef1f439
feat: new iter package (#672)
...
* lint: pin golangci-lint version
* lint: fix issues triggered by go1.23 upgrade
* feat: new iter package
* lint: fix linter issues
* fix: restore go1.18
* fix: rename package to "it"
* feat: assign multiple sequences of maps
* fix: panic in DropRight if n = 0
* docs: fix incorrect non-iter helper references
* feat: implement Invert helper
* feat: helpers for creating and checking empty sequences
* feat: implement Reverse helper
* feat: implement ReduceRight helper
* feat: implement Shuffle helper
* feat: implement Sample* helpers
* refactor: rename helpers with Seq convention
* feat: implement SeqToChannel2 helper
* feat: implement HasPrefix/HasSuffix helpers
* chore: port recent changes
* perf: only iterate collection once in Every
* refactor: reduce dupe code by reusing helpers internally
* perf: reuse internal Mode slice
* feat: implement Length helper
* chore: duplicate unit tests for *I helpers
* fix: omit duplicates in second Intersect list
* feat: intersect more than 2 sequences
* feat: implement Drain helper
* feat: implement Seq/Seq2 conversion helpers
* refactor: rename *Right* to *Last*
* chore: minor cleanup
* refactor: consistent predicate/transform parameter names
* perf: abort Slice/Subset once upper bound reached
* refactor: rename IsSortedByKey to IsSortedBy
* refactor: reuse more helpers internally
* feat: implement Cut* helpers
* feat: implement Trim* helpers
* perf: reduce allocations
* docs: describe iteration and allocation expectations
* Update .github/workflows/lint.yml
---------
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2025-10-02 19:23:16 +02:00
Nathan Baulch
1b92b5c7db
lint: enable 7 more linters (#686)
...
* lint: enable and fix perfsprint issues
* lint: enable and fix nolintlint issues
* lint: enable and fix godot issues
* lint: enable and fix thelper issues
* lint: enable and fix tparallel issues
* lint: enable and fix paralleltest issues
* lint: enable and fix predeclared issues
2025-09-25 13:18:25 +02:00
Samuel Berthe
268215359e
fix(string): fix division by zero (#684)
2025-09-25 04:21:56 +02:00
Nathan Baulch
7170719ec0
lint: unit test improvements (#674)
...
* lint: pin golangci-lint version
* lint: use is.Empty where possible
* lint: use is.ElementsMatch for unsorted slices
* lint: remove redundant is.Len assertions
* lint: use is.Zero to assert zero structs
* fix: misc assertion issues
* lint: more consistent test case pattern
* fix: reversed expect/actual assert values
* lint: use is.ErrorIs and is.EqualError for errors
* Update golangci-lint version in workflow
---------
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2025-09-24 21:02:52 +02:00
Nathan Baulch
b5e290abe0
fix: more consistent panic strings (#678)
...
* lint: pin golangci-lint version
* fix: more consistent panic strings
* Update golangci-lint version in workflow
Updated golangci-lint action version to v2.4.
---------
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2025-09-24 21:02:02 +02:00
Nathan Baulch
76b76a7adb
lint: Apply testifylint linter recommendations (#669)
2025-09-20 00:50:00 +02:00
Samuel Berthe
741cdfdb03
feat(ellipsis): trim after truncating
2024-08-19 00:14:36 +02:00
Samuel Berthe
40f630f33a
feat(ellipsis): trim before truncating
2024-08-18 16:36:51 +02:00
Samuel Berthe
de8e023551
fix: rename Elipse to Ellipsis
2024-08-18 16:33:20 +02:00
mr
1ca9c7b4e5
Update string.go (#496)
...
* Update string.go
more reasonable
* test
2024-07-17 19:08:46 +02:00
Samuel Berthe
e5e4f028e4
feat: adding Elipse (#470)
2024-06-27 15:42:03 +02:00
eiixy
266436bb40
feat: add string conversion functions (#466)
...
* feat: add string conversion functions
* fix: fix `Capitalize`, update tests
* fix: fix `Capitalize`, update tests
* update README.md
* update tests
* update `Capitalize`
* style: unify coding style
2024-06-27 12:56:08 +02:00
Samuel Berthe
9ec076e4f6
test: adding some tests to Substring (see #288)
2023-03-20 17:59:52 +01:00
Liu Shuang
de3bccf5d0
fix: substring support utf8 character (#327)
2023-03-20 15:14:00 +01:00
Corentin Clabaut
a3c90f1ac4
Add RandomString (#266)
...
* Add RandomString
* PR update
2022-11-15 23:12:57 +01:00
Samuel Berthe
31f3bc3a85
test: parallel tests everywhere (#228)
2022-10-02 21:38:26 +02:00
Corentin Clabaut
6126b6497c
Implement ChunkString (#188)
...
Implement ChunkString
2022-07-29 11:38:33 +02:00
Samuel Berthe
94d54a8f47
feat: adding runelength
2022-05-01 00:22:36 +02:00