Hacker News

Author of the post here, hi! Funny to see this resurface. I have made a number of changes to the transducers library since this blog post (see: https://wiki.call-cc.org/eggref/5/transducers).

> A vector-reduce form would be trivial but icky, and I chose not to do it to not have to have the continuation safety discussion.

I am not sure what "continuation safety" refers to in this context, but I wanted a library that would give me a good out-of-the-box experience and support commonly used Scheme types. I have not yet added folders/collectors/transducers specific to some types (like anything related to streams or SRFI-69), but I think a broad swath of types and patterns is currently covered.

I think in particular my gripes regarding vectors were that collectors such as `collect-vector`, `collect-u8vector`, etc. were not implemented. There is a chance to break out of these collectors using continuations, but that's not really a good argument against having them (I hope this is not what you're referring to!).

> Anyway, if I read things correctly the complaint that srfi-171 has delete dupes and delete neighbor dupes forgets that transducers are not always used to or from a data structure. They are oblivious to context. That is why both are necessary.

I think this is exactly my argument: they are oblivious to context and actually do the wrong thing by default. I've seen this happen in Rust with users preferring `dedup` or `dedup_by` (from the Itertools crate) rather than just constructing a HashSet or BTreeSet. It almost always is used as a shortcut to save on a data structure, and time and again I've seen it break workflows because it requires that the chain of items is first sorted.

I think this is particularly damning for a library that means to be general purpose. If users want to implement this themselves and maintain it within their own code-bases, they're certainly welcome to; however, I don't personally think making this kind of deduping "easy" helps folks in the general sense. You'd be better off collecting into a set or bag of some kind, and then transducing a second time.
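To make the failure mode concrete, here is a minimal sketch in Python (not Rust or Scheme). It assumes neighbor-only deduplication behaves like Rust's `Vec::dedup`, which removes consecutive repeated elements; the function names `dedup_neighbors` and `dedup_all` are hypothetical and not taken from any of the libraries discussed.

```python
from itertools import groupby


def dedup_neighbors(items):
    """Drop only *consecutive* duplicates (hypothetical helper)."""
    # groupby collects runs of equal adjacent items; keep one key per run.
    return [key for key, _run in groupby(items)]


def dedup_all(items):
    """Drop every repeated item, keeping first occurrences in order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


unsorted = [3, 1, 3, 2, 1]
print(dedup_neighbors(unsorted))          # non-adjacent duplicates survive
print(dedup_neighbors(sorted(unsorted)))  # correct only once the input is sorted
print(dedup_all(unsorted))                # set-backed version needs no sort
```

On unsorted input the neighbor-only version silently returns the list unchanged, which is the kind of surprise described above.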

> From what I can see the only differences are ordering of clauses to make the transduce form generic and naming conventions. His library shadows a bunch of bindings in a non-compatible way. The transduce form is still not generic but moves the list-, vector-, generator- part of transduce into a "folder". Which is fine. But a generic dispatch would be nicer.

Shadowing bindings in a "non-compatible" way can be bad, but it also helps make programs cleaner. If you're using transducers across your codebase, you almost certainly aren't also using e.g. SRFI-1's filter.

As for generic dispatch: I agree wholeheartedly. I wish we had something like Clojure protocols that didn't suck. I've looked into ways to (ab)use variant records for this sort of thing, but you run into an open/closed problem on extending the API. This is really something that needs to be solved at the language level and something like COOPS / GOOPS incurs a degree of both conceptual and actual performance overhead that makes them somewhat unsatisfying :(
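For illustration only, here is a rough sketch of open, type-based dispatch using Python's `functools.singledispatch` as a stand-in for something protocol-like. Neither the transducers egg nor SRFI-171 works this way; `fold`, `transduce`, `mapping`, `add`, and `Countdown` are all hypothetical names invented for this example.

```python
from functools import reduce, singledispatch


@singledispatch
def fold(collection, step, init):
    """Generic 'folder': reduce `collection` with `step`, starting at `init`."""
    raise TypeError(f"no folder registered for {type(collection).__name__}")


@fold.register(list)
@fold.register(tuple)
def _fold_sequence(collection, step, init):
    return reduce(step, collection, init)


@fold.register(dict)
def _fold_dict(collection, step, init):
    # Fold over (key, value) pairs.
    return reduce(step, collection.items(), init)


def transduce(xform, reducer, init, collection):
    """Single entry point: the folder is chosen by the collection's type."""
    return fold(collection, xform(reducer), init)


def mapping(f):
    """Minimal map transducer: pass each item through `f` before reducing."""
    return lambda reducer: lambda acc, item: reducer(acc, f(item))


def add(acc, x):
    return acc + x


class Countdown:
    """A type defined 'elsewhere' that the original API knew nothing about."""

    def __init__(self, start):
        self.start = start


@fold.register(Countdown)
def _fold_countdown(collection, step, init):
    # Extending the API needs no change to `transduce` or `fold` themselves.
    return reduce(step, range(collection.start, 0, -1), init)


print(transduce(mapping(lambda x: x * 2), add, 0, [1, 2, 3]))
print(transduce(mapping(lambda x: x), add, 0, Countdown(3)))
```

The registry stays open to new types without touching existing code, which is the property variant records make awkward; the cost is a dynamic lookup on every call.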

And also: thank you for SRFI-171. I disagree with some of the design decisions but had it not been written I probably wouldn't have even considered transducers as something worth having.



Small unrelated bug report: in your "Book review: Bernoulli's Fallacy" article, there is a link to: https://www.thatgeoguy.ca/blog/%7B%7B%20site.baseurl%20%7D%7...

I don't have much time to read through all this now but I'll check later, looks like great write-ups!


Thanks for the heads up! Seems my excerpt URLs had some errors!


> It almost always is used as a shortcut to save on a data structure, and time and again I've seen it break workflows because it requires that the chain of items is first sorted.

I am not sure I understand. I almost never use transducers to create data structures. I use them as a way to create general processing steps. The standard example is how they are used in Clojure's channels. In such a context you need both dedup and dedup-neighbors.
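A rough sketch of that stream case, with Python generators standing in for transducers attached to a channel. This is an analogy under stated assumptions, not how SRFI-171 or Clojure implement it; `dedup_neighbors_step`, `dedup_step`, and `sensor_readings` are hypothetical names.

```python
from itertools import count, islice


def dedup_neighbors_step(stream):
    """Yield items, skipping any item equal to the one just before it."""
    sentinel = object()  # marks "nothing seen yet"
    previous = sentinel
    for item in stream:
        if previous is sentinel or item != previous:
            yield item
        previous = item


def dedup_step(stream):
    """Yield each distinct item once; memory grows with the distinct items seen."""
    seen = set()
    for item in stream:
        if item not in seen:
            seen.add(item)
            yield item


def sensor_readings():
    """Hypothetical endless source producing 0, 0, 1, 1, 2, 2, ..."""
    for n in count():
        yield n // 2


# Both steps run lazily, so they can sit on a source that never ends.
print(list(islice(dedup_neighbors_step(sensor_readings()), 5)))
print(list(dedup_neighbors_step([1, 1, 2, 1, 1])))
print(list(dedup_step([1, 1, 2, 1, 1])))
```

The neighbor-only step keeps constant state and never needs to see the whole input, while the full dedup step must remember everything it has emitted, so on an unbounded stream the two are not interchangeable.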

To be frank, I don't really care much for the *-transduce functions. I think a general-purpose looping facility is almost always a better choice. For those things I use https://git.sr.ht/~bjoli/goof-loop which is always going to be faster than transducers unless you have a very, very smart compiler (or perhaps a tracing JIT).

I think transducers need to be integrated into the standard library to make sense, so that you can, for example, pass them to a port constructor.

Anyway, your library looks much more complete, and pretty similar to the SRFI. The differences are mostly cosmetic.



