Groovy parallel processing, aka GPars
- Code-level helpers – Constructs that can be applied to small parts of your code base, such as an individual algorithm or data structure, without any major changes to the overall project architecture:
  - Parallel Collections;
  - Asynchronous Processing;
  - Fork/Join (Divide/Conquer);
- Architecture/Design-level concepts – Constructs that need to be taken into account when designing the project structure:
  - Actors -> a Scala-inspired approach to organizing concurrent activities;
  - Communicating Sequential Processes (CSP) -> a formal language for describing patterns of interaction in concurrent systems;
  - Dataflow -> avoids live-locks and race conditions, and makes deadlocks deterministic;
  - Data Parallelism;
- Shared Mutable State Protection – More than 95% of the current use of shared mutable state can be avoided using proper abstractions. Good abstractions are still necessary for the remaining 5% of use cases, i.e. when shared mutable state cannot be avoided:
  - Agents -> they behave like actors, but accept code (functions) as messages.
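To make the last item concrete, here is a minimal sketch of an Agent guarding a counter. It assumes GPars 1.2.1 can be resolved through `@Grab`; the version and the names `counter`, `workers` and `total` are choices made for this illustration only. Each thread sends a closure as a message, and the agent applies those closures one at a time to its internal state.

```groovy
// Assumption: GPars is fetched via Grape; drop the @Grab line if it is already on the classpath.
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.agent.Agent

// The agent owns the state; callers never touch it directly.
def counter = new Agent<Integer>(0)

// Ten threads each send one "increment" function to the agent.
def workers = (1..10).collect {
    Thread.start {
        counter << { current -> updateValue(current + 1) }
    }
}
workers*.join()

// val waits until the messages queued so far have been processed.
def total = counter.val
println "total = $total"
```

Because the agent processes its messages sequentially, no explicit lock or `synchronized` block is needed around the increment.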
When to use what?
- You have a collection that needs to be iterated or processed with one of the many Groovy collection methods such as each(), collect() or find(), and each element can be processed independently of the others: parallel collections;
- You have a long-running calculation which can safely run in the background: asynchronous invocation support. Asynchronous functions can be composed, so you can quickly parallelize complex functional calculations without having to flag the independent calculations explicitly;
- You need to parallelize an algorithm and can identify a set of tasks with their mutual dependencies. The tasks typically do not need to share data; instead, some tasks may need to wait for other tasks to finish before starting: dataflow tasks. You create internally sequential tasks, each of which runs concurrently with the others;
- You can’t avoid using shared mutable state in your logic. Multiple threads will be accessing shared data and potentially modifying it, and a traditional approach based on locks and synchronized blocks feels too risky: use Agents to wrap your data and serialize all access to it;
- You’re building a system with high concurrency demands. Tweaking a data structure here or a task there won’t cut it; you need to build the architecture from the ground up with concurrency in mind. Message passing could be the way to go. Your choices might include:
  - Groovy CSP, which gives you highly deterministic and composable models for concurrent processes. The model is organized around calculations or processes that run concurrently and communicate through synchronous channels;
  - Dataflow operators, if you’re trying to solve a complex data-processing problem and want to build a dataflow network. The concept is organized around event-driven transformations wired into pipelines using asynchronous channels;
  - Actors and Active Objects, which will shine if you need to build a general-purpose, highly concurrent and scalable architecture following the object-oriented paradigm.
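Two of the items above, composable asynchronous functions and dataflow tasks, can be sketched in a few lines. This is only an illustration, assuming GPars 1.2.1 resolved through `@Grab`; the variable names (`x`, `y`, `z`, `plus`, `times`) and the numbers are invented for the example.

```groovy
// Assumption: GPars is fetched via Grape; drop the @Grab line if it is already on the classpath.
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.GParsPool
import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.task

// Dataflow tasks: each task is sequential inside, and the tasks run concurrently.
// Reading .val blocks until some other task has bound the variable.
def x = new DataflowVariable()
def y = new DataflowVariable()
def z = new DataflowVariable()

task { z << x.val + y.val }   // waits for both x and y
task { x << 10 }
task { y << 5 }

def sum = z.val
println "sum = $sum"

// Composable asynchronous functions: asyncFun() wraps a closure so that it
// accepts and returns promises, letting independent calls run in parallel.
def composed
GParsPool.withPool {
    def plus = { a, b -> a + b }.asyncFun()
    def times = { a, b -> a * b }.asyncFun()
    // The two multiplications do not depend on each other.
    composed = plus(times(2, 3), times(4, 5)).get()
}
println "composed = $composed"
```

In both halves the dependencies are expressed through the data itself: a task or function simply starts once its inputs are available, so the order in which the tasks are declared does not matter.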
A little taste:
Checking in parallel whether a set of sites contain specific words.
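A minimal sketch of that idea with parallel collections, assuming GPars 1.2.1 resolved through `@Grab`. To keep it self-contained, the page contents come from a hypothetical in-memory map (`pages`) and a hypothetical `fetchPage` closure; with real sites you would replace its body with an actual HTTP fetch such as `url.toURL().text`.

```groovy
// Assumption: GPars is fetched via Grape; drop the @Grab line if it is already on the classpath.
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.GParsPool

// Hypothetical stand-in for the real sites and their content.
def pages = [
    'http://example.com/a': 'Groovy makes concurrency pleasant',
    'http://example.com/b': 'Nothing to see here',
    'http://example.com/c': 'GPars builds on Groovy'
]
// Hypothetical fetcher; swap in { String url -> url.toURL().text } for real sites.
def fetchPage = { String url -> pages[url] }

def words = ['groovy', 'gpars']
def matches = []

GParsPool.withPool {
    // findAllParallel checks the sites concurrently on the pool's threads.
    matches = pages.keySet().toList().findAllParallel { url ->
        def text = fetchPage(url).toLowerCase()
        words.any { word -> text.contains(word) }
    }
}

println "Sites mentioning ${words}: ${matches.sort()}"
```

Each site is checked independently of the others, which is exactly the situation where the parallel variants of the collection methods fit.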
This content came out of my mind-map study of Groovy and parallel processing. The idea today is just to give an overview of GPars and its possibilities.
My plan is to publish a separate post for each category; otherwise this one would become far too long. So, coming soon:
- Data Parallelism (parallel collections, map/reduce, parallel arrays, asynchronous invocation, composable asynchronous functions, fork/join and parallel speculation);
- CSP (processes, channels, compositions and alternatives);
- Dataflow.