Async Pipeline Pattern - Eager to work. I wrote a web crawler prototype and rediscovered the async pipeline pattern. It was interesting to see how this story explored the idea more thoroughly.
A comment on Lobste.rs offered a different view on how the approach could be improved:
One thing to keep in mind is that FuturesUnordered is really just an executor like the one used to spawn tokio tasks, which can introduce its own issues when used with tokio. I would use tokio’s JoinSet in this context, rather than FuturesUnordered. In real code the image processing is presumably compute heavy and saving the image is presumably blocking file IO, so I’d actually use tokio’s spawn_blocking to move both of those steps onto a background thread for compute work.
Given that you know you are spawning exactly 3 tasks, you could instead use join or select here to concurrently perform each of those operations within a single task and avoid spawning. This will also make error handling and panics automatically flow upward without you having to remember to wait on the task handles. This does mean the work cannot be performed in parallel, though all of the real work would probably be on a JoinSet or background threads.
Probably where I’d end up if I were refactoring this is really just two JoinSets (plus the channel coming into the task with URLs): I’d have one spawning async tasks for image download and one spawning blocking tasks for processing and saving images to disk. These would all be processed in a select loop, which could be made better IMO with a merge primitive that still doesn’t exist.
— withoutboats (link)
Finally, the animations were very useful in helping understand what the code was doing concretely. The author used similar visualizations extensively in another blog post.