Conversation
|
I'm not sure why older GHCs are unable to infer the types for the tests I've added, since the types should all be trivially known (Text and Char). |
|
Thanks @axman6! I suggest we start with |
|
Yeah, I've been working on rewriting the C to avoid going via memmem, and removing twoway_memmem would significantly reduce the amount of code to maintain. I would guess there are faster memmem implementations out there, hopefully under permissive licenses too. I'll get the changes working and push them today. |
|
I have a suspicion that Anyways, let's separate concerns. From my perspective the first task is to add |
|
I'll try to find some time to write a Haskell-only version, and then we can think about making a faster C one later. I wonder if it's worth having both, and only moving to the C call when there's enough data to justify it. |
Implements codepointOffset with code from the FreeBSD project.
I'm planning to explore a vectorised implementation of the search for 2, 3 and 4 char codepoints, but will leave that out of the first iteration.
This may be relevant to #369, by eliminating the need to decode codepoints via Haskell.
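For context, here is a minimal scalar sketch of what a codepoint offset lookup can look like in C. It is not the FreeBSD-derived code in this PR: the name codepoint_offset_sketch, its signature, and the assumption that it returns the byte offset of the n-th codepoint in valid UTF-8 are all illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the PR's implementation): returns the byte
 * offset of the n-th codepoint (0-based) in buf, assuming buf holds valid
 * UTF-8. Returns len when the buffer has n or fewer codepoints. */
static size_t codepoint_offset_sketch(const uint8_t *buf, size_t len, size_t n)
{
    size_t i = 0;
    while (i < len && n > 0) {
        i++; /* step over the leading byte of the current codepoint */
        /* skip continuation bytes, which have the bit pattern 10xxxxxx */
        while (i < len && (buf[i] & 0xC0) == 0x80)
            i++;
        n--;
    }
    return i;
}
```

A vectorised version would presumably process many bytes per step instead of one, but the byte classification (leading versus continuation) would stay the same.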