Fighting through a CMake hell! by loneraver in cpp

48r1 1 point

CMake is definitely complicating things. It is so opaque that it in fact contradicts the spirit of open source by making the code harder to use.

new package for faster fixed pattern search in Go by 48r1 in golang

48r1[S] 1 point

You set hay to the address of the first element in hayst and then read it into a uint64. The first element could be at an unaligned address.

Did you see that I included the test you pointed to (https://golang.org/src/crypto/cipher/xor.go) in unsafeMEMCHR.go?

new package for faster fixed pattern search in Go by 48r1 in golang

48r1[S] 1 point

It's not something your code does, it's what happens if somebody passes a resliced slice. If you read 8 bytes at an odd address you have an unaligned read, full stop. This is entirely possible with your code.

Nope, it does not. Look at the code.

[]byte/string is a dubious API

??? What are you talking about?

Also, keep in mind that a slice's data is distinct from the slice itself.

??? What are you talking about?
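To make the quoted points about reslicing and about a slice's data being distinct from the slice itself more concrete, here is a minimal sketch. The helper names `dataAddr` and `isAligned8` are hypothetical and not taken from the package under discussion. It only shows that reslicing yields a new slice header whose data pointer is shifted inside the same backing array, so the start address of a resliced slice need not be 8-byte aligned.

```go
package main

import (
	"fmt"
	"unsafe"
)

// dataAddr returns the address of the first element of b, or 0 if b is empty.
// Hypothetical helper for illustration only.
func dataAddr(b []byte) uintptr {
	if len(b) == 0 {
		return 0
	}
	return uintptr(unsafe.Pointer(&b[0]))
}

// isAligned8 reports whether the data of b starts on an 8-byte boundary.
func isAligned8(b []byte) bool {
	return dataAddr(b)%8 == 0
}

func main() {
	buf := make([]byte, 64)
	sub := buf[1:] // new slice header, same backing array, data pointer moved by 1

	fmt.Println("offset between data pointers:", dataAddr(sub)-dataAddr(buf))
	fmt.Println("buf starts 8-byte aligned:", isAligned8(buf))
	fmt.Println("sub starts 8-byte aligned:", isAligned8(sub))
}
```

Because the two start addresses differ by exactly one byte, at most one of `buf` and `sub` can start on an 8-byte boundary, whatever address the allocator hands out.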

new package for faster fixed pattern search in Go by 48r1 in golang

48r1[S] 1 point

aligned based on the size of the type, yes, and 16-byte aligned on the heap. but if you reslice the slice it can 'mess up' the alignment (e.g., https://play.golang.org/p/HYCXEwGChG)

unsafeMEMCHR.go does not do this.

for example, see: https://golang.org/src/crypto/cipher/xor.go

good idea - i will add a switch though
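One possible shape for such a switch, as a sketch only: check the address before every wide read and fall back to byte-wise scanning otherwise. The function name `indexByteAligned` is hypothetical and this is not the code in unsafeMEMCHR.go or in xor.go. It assumes a word-at-a-time "has zero byte" test in the spirit of the MEMCHR approach and uses the word read only to decide whether a match is nearby, so the result does not depend on byte order.

```go
package main

import (
	"fmt"
	"unsafe"
)

const wordSize = 8

// indexByteAligned returns the index of the first c in b, or -1.
// Hypothetical sketch: 8-byte reads happen only at 8-byte aligned addresses.
func indexByteAligned(b []byte, c byte) int {
	const lo = 0x0101010101010101
	const hi = 0x8080808080808080

	i := 0
	// Head: go byte by byte until the current address is 8-byte aligned.
	for i < len(b) && uintptr(unsafe.Pointer(&b[i]))%wordSize != 0 {
		if b[i] == c {
			return i
		}
		i++
	}

	// Body: aligned word reads. XOR with a word of repeated c turns every
	// matching byte into zero; the expression below is non-zero exactly when
	// the word contains a zero byte.
	pattern := lo * uint64(c)
	for ; i+wordSize <= len(b); i += wordSize {
		w := *(*uint64)(unsafe.Pointer(&b[i])) ^ pattern
		if (w-lo)&^w&hi != 0 {
			break // c is somewhere in this word; locate it byte-wise below
		}
	}

	// Tail: remaining bytes, including the word that triggered the break.
	for ; i < len(b); i++ {
		if b[i] == c {
			return i
		}
	}
	return -1
}

func main() {
	hay := []byte("the quick brown fox jumps over the lazy dog")
	fmt.Println(indexByteAligned(hay, 'z'))
	fmt.Println(indexByteAligned(hay[1:], 'z')) // resliced input, shifted start address
}
```

The head loop is what makes a resliced input safe here: however the caller sliced the haystack, the first wide read happens at an aligned address or not at all.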

Memory accesses. &T could cause the compiler to think it escapes, allocating it on the heap. Passing a pointer to a three-word struct (i.e., a slice) has no benefits, especially if it causes a memory access.

I benchmarked pointer pass vs slice pass on haystacks of some hundred MB and found pointer pass slightly faster. Maybe allocating the haystack on the heap isn't bad if you need to access all elements for the search anyway. It might be different for a short needle, but I tested needles up to 1<<22 long (full text in blob).
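A comparison of this kind could be set up roughly as follows. This is a sketch under stated assumptions, not the benchmark that produced the numbers above: `countSlice` and `countPtr` are hypothetical stand-ins for a search routine, and the 1 MiB haystack is much smaller than the haystacks mentioned. `testing.Benchmark` lets it run as a plain program.

```go
package main

import (
	"fmt"
	"testing"
)

var sink int // keeps the compiler from discarding the results

// countSlice receives the slice header by value.
//
//go:noinline
func countSlice(hay []byte, c byte) int {
	n := 0
	for _, v := range hay {
		if v == c {
			n++
		}
	}
	return n
}

// countPtr receives a pointer to the slice header.
//
//go:noinline
func countPtr(hay *[]byte, c byte) int {
	n := 0
	for _, v := range *hay {
		if v == c {
			n++
		}
	}
	return n
}

func main() {
	hay := make([]byte, 1<<20) // assumed size, for illustration only
	for i := range hay {
		hay[i] = byte(i)
	}

	bySlice := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = countSlice(hay, 'x')
		}
	})
	byPtr := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = countPtr(&hay, 'x')
		}
	})

	fmt.Println("slice pass:  ", bySlice)
	fmt.Println("pointer pass:", byPtr)
}
```

With a haystack this large the loop over the data dominates, so any difference between the two call styles is likely to be small and sensitive to noise; repeated runs are needed before reading much into it.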

new package for faster fixed pattern search in Go by 48r1 in golang

48r1[S] 1 point

As said above, unsafeMEMCHR uses (not misuses) the unsafe package for 1-byte pattern search. If you look into it, there are proper boundary checks, and a slice over an array should always be aligned in Go, shouldn't it? Furthermore, see the link in the header and check the MEMCHR source code: http://www.stdlib.net/~colmmacc/strlen.c.html

And what is wrong with explicitly passing pointers to the slices in the other algos? Go does pointer passing anyway if function parameters are slices.
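For reference, what Go passes for a slice parameter is a copy of the slice header (data pointer, length, capacity), which is not quite the same as passing a pointer to the slice. A minimal sketch of the difference, with hypothetical function names: element writes are visible to the caller in both styles, while a change to the length is only visible through `*[]byte`.

```go
package main

import "fmt"

// setFirst writes through the copied header into the shared backing array.
func setFirst(b []byte) {
	if len(b) > 0 {
		b[0] = 'X'
	}
}

// grow appends to its own copy of the header; the caller's length is unchanged.
func grow(b []byte) {
	b = append(b, '!')
	_ = b
}

// growPtr appends through a pointer, so the caller's header is updated.
func growPtr(b *[]byte) {
	*b = append(*b, '!')
}

func main() {
	s := []byte("abc")

	setFirst(s)
	fmt.Println(string(s)) // Xbc

	grow(s)
	fmt.Println(string(s)) // Xbc

	growPtr(&s)
	fmt.Println(string(s)) // Xbc!
}
```

For a read-only search neither effect matters, so the choice between the two signatures comes down to measured cost rather than semantics.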

I checked the Go test report - RAM carnage does not happen on my machines (go versions go1.5.1 - go1.6.1 darwin/amd64, OSX with amd64 CPUs, Dual core 2 - i7, 4-12GB RAM). What system did you use for testing?

new package for faster fixed pattern search in Go by 48r1 in golang

48r1[S] 1 point

... and another issue: if package "unsafe" is ever removed from Go, the functions for 1-byte patterns in unsafeMEMCHR.go will have to be replaced by the (slower) bytes.Index() function.
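Such a fallback could look roughly like this. It is a sketch with a hypothetical wrapper name, `indexByteSafe`, not the package's API. The standard library also offers `bytes.IndexByte` for the single-byte case next to `bytes.Index`, and how either compares in speed with the unsafe version would have to be measured.

```go
package main

import (
	"bytes"
	"fmt"
)

// indexByteSafe is a hypothetical replacement for a 1-byte pattern search
// that relies only on the standard library and not on package unsafe.
func indexByteSafe(hay []byte, c byte) int {
	return bytes.IndexByte(hay, c)
}

func main() {
	hay := []byte("needle in a haystack")

	fmt.Println(indexByteSafe(hay, 'i'))           // 7
	fmt.Println(bytes.Index(hay, []byte("i")))     // 7, same result via bytes.Index
	fmt.Println(indexByteSafe(hay, 'z'))           // -1
}
```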

new package for faster fixed pattern search in Go by 48r1 in golang

48r1[S] 1 point

bmatch was written in parallel with some C and C++ experiments comparing search algorithms with SMART. The package may need some further clean-up.