Replace the single interleaved loop calling NthOrEmpty per element
with separate per-slice loops using direct index access. This
eliminates function call overhead (NthOrEmpty → sliceNth → bounds
check) and improves CPU cache locality by processing each input
slice contiguously.

The result slice is zero-initialized by make(), so elements past the
end of the shorter slice (when the inputs have different lengths) are
already zero; no explicit zero-filling is needed.
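
A minimal sketch of the per-slice loop pattern described above. The names `Tuple2` and `zip2` are hypothetical stand-ins chosen for illustration, not the library's actual definitions; the sketch only assumes that the result is allocated once with make() and that each input is copied in its own pass.

```go
package main

import "fmt"

// Tuple2 is a hypothetical pair type used only for this sketch.
type Tuple2[A, B any] struct {
	A A
	B B
}

// zip2 pairs elements of a and b by index. Slots past the end of the
// shorter slice keep the zero value that make() already wrote.
func zip2[A, B any](a []A, b []B) []Tuple2[A, B] {
	size := len(a)
	if len(b) > size {
		size = len(b)
	}
	result := make([]Tuple2[A, B], size) // zero-initialized

	// One contiguous pass per input slice, using direct index access
	// instead of a helper call per element.
	for i := range a {
		result[i].A = a[i]
	}
	for i := range b {
		result[i].B = b[i]
	}
	return result
}

func main() {
	// The second slice is shorter, so the B field of the last two
	// tuples stays at its zero value (the empty string).
	fmt.Println(zip2([]int{1, 2, 3}, []string{"x"}))
}
```

Because every index in `range a` and `range b` is below `size`, the writes need no explicit length checks in the source.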

Benchstat (Apple M3, 6 runs, -cpu=1):

                     │    before    │               after               │
                     │    sec/op    │    sec/op     vs base             │
Zip2_Equal/n_10        65.89n ± 22%   52.35n ± 13%  -20.54% (p=0.002)
Zip2_Equal/n_100       440.6n ± 21%   382.3n ± 13%  -13.22% (p=0.004)
Zip2_Equal/n_1000      4.232µ ± 82%   3.173µ ± 12%  -25.02% (p=0.002)
Zip2_Unequal/n_10      69.35n ± 65%   46.16n ±  1%  -33.43% (p=0.002)
Zip2_Unequal/n_100     461.8n ±101%   293.1n ± 17%  -36.53% (p=0.002)
Zip2_Unequal/n_1000    3.623µ ± 26%   2.301µ ± 17%  -36.49% (p=0.002)
geomean                492.4n         354.3n        -28.05%

- Replace Nth() with NthOrEmpty() to avoid unnecessary error allocation on miss
- Change make(0, size) + append to make(size) + direct assignment
- Use uint for loop indices
- Remove temporary variables in ZipBy functions

This reduces overhead from append operations and structure creation,
similar to the Zip5Copy2 optimization pattern.

Reduces code size: +174/-242 lines
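
The changes listed above can be sketched roughly as follows. `nthOrEmpty` and `zipBy2` are hypothetical stand-ins written for this example; the real helpers may differ in signature and implementation.

```go
package main

import "fmt"

// nthOrEmpty is a hypothetical stand-in for NthOrEmpty: it returns the
// zero value instead of constructing an error when i is out of range.
func nthOrEmpty[T any](s []T, i uint) T {
	if i < uint(len(s)) {
		return s[i]
	}
	var zero T
	return zero
}

// zipBy2 combines a and b element-wise through transform.
func zipBy2[A, B, R any](a []A, b []B, transform func(A, B) R) []R {
	size := uint(len(a))
	if n := uint(len(b)); n > size {
		size = n
	}

	// make(size) plus direct assignment, rather than make(0, size)
	// plus append.
	result := make([]R, size)
	for i := uint(0); i < size; i++ {
		// Arguments are passed straight through, with no temporaries.
		result[i] = transform(nthOrEmpty(a, i), nthOrEmpty(b, i))
	}
	return result
}

func main() {
	sums := zipBy2([]int{1, 2, 3}, []int{10, 20}, func(x, y int) int {
		return x + y
	})
	fmt.Println(sums) // prints "[11 22 3]"
}
```

Allocating the result at its final length lets each iteration write by index, so the loop body does not go through append's length and capacity bookkeeping.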
* Fix linting
* Use is.ElementsMatch
This ignores the ordering of the final intersection, which matters
especially when testing against old versions of Go that do not guarantee
an order when iterating over maps.
* lint: fix inconsistent callback function parameter names
* lint: rename "iteratee" to "transform" for *Map helpers
* lint: rename "project" parameters to "transform"
* lint: rename "cb" parameters to "callback"
* lint: rename "iteratee" to "callback" for ForEach helpers
---------
Co-authored-by: Franky W. <frankywahl@users.noreply.github.com>
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>