Worker Pools¶

A worker pool implements a strategy for parallelizing a map() operation. That is, given a set of elements and a function taking a single input, a worker pool returns the result of evaluating that function on each input element in turn. In Glimpse, the function is usually a model’s BuildLayer method, and the elements are generally the model’s input state for different images.

When not using a compute cluster, the best worker pool to use is generally the one returned by MakePool(). For example

>>> pool = glimpse.pools.MakePool()
>>> pool.map(hex, [1, 2, 3])
['0x1', '0x2', '0x3']

Single-Host Pools¶

The most common parallelization scheme is the MulticorePool, which spreads evaluation of elements across multiple cores of a single host. Additionally, a fall-back scheme is provided by the SinglecorePool, which uses the builtin map() function. This can be useful for debugging, and when the complexity of multicore communication is unwanted.

Caution

Not all functions can be used in a parallel fashion. In case of mystifying errors, check the documentation for MulticorePool. Additionally, try using SinglecorePool to identify whether the error is due to parallelization.

Multi-Host Pools¶

Multi-host worker pools, or cluster pools, are more advanced than single-host pools, and require some additional configuration. These algorithms spread work across available cores on multiple machines connected over the network. The most stable cluster pool is the ipython cluster, which can be accessed as:

>>> pool = glimpse.pools.GetClusterPackage("ipython").MakePool(config_file)
>>> pool.map(hex, [1, 2, 3])
['0x1', '0x2', '0x3']

Note

When implementing a new cluster pool package, be sure to include the MakePool() function, which should return a new client connection.