Using the Foreach Package in R: .combine, Iterators, and Other Functions


The foreach package gives R a looping construct for executing code repeatedly. Unlike a for loop, a foreach loop returns a value: by default, a list holding the result of each iteration. The main attraction of foreach is that the same loop can be run in parallel, but it is also convenient for ordinary sequential work.

.combine & foreach in R

By default, foreach returns its results as a list. The .combine argument changes this: with .combine='c' the results are concatenated into a vector, and when each iteration itself returns a vector, .combine='cbind' or .combine='rbind' assembles those vectors into a matrix. Any function that merges two results can be supplied as the combiner, and further examples of its use are provided.
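The effect of .combine can be sketched as follows. This is a minimal illustration, assuming only that the foreach package is installed; the variable names are chosen here for the example.

```r
library(foreach)

# Default: results come back as a list, one element per iteration.
as_list <- foreach(i = 1:3) %do% sqrt(i)

# .combine = 'c' concatenates the results into a numeric vector.
as_vector <- foreach(i = 1:3, .combine = 'c') %do% sqrt(i)

# When each iteration returns a vector, 'cbind' binds the vectors
# column-wise into a matrix (here 2 rows by 3 columns).
as_matrix <- foreach(i = 1:3, .combine = 'cbind') %do% c(i, i^2)

# Any two-argument function can act as the combiner, e.g. '+'.
total <- foreach(i = 1:3, .combine = '+') %do% i
```

Using 'rbind' instead of 'cbind' would stack the same vectors as rows.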

Iterators Using foreach in R

An iterator is an abstract source of data, and foreach can draw its loop values from many kinds of source. The foreach function automatically creates an iterator from a vector, list, matrix, or data frame. Iterators can also be built explicitly, for example over data read from a file or returned by a database query. Because an iterator produces values only as they are requested, the data do not all have to exist before the loop starts, which is especially useful when working with large datasets.
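A short sketch of iterators in use, assuming the iterators package (on which foreach depends) is installed. The matrix m and the variable names are made up for this example.

```r
library(foreach)
library(iterators)

m <- matrix(1:6, nrow = 2)

# iter() with by = 'col' hands the loop one column of m at a time.
col_sums <- foreach(col = iter(m, by = 'col'), .combine = 'c') %do% sum(col)

# icount() produces 1, 2, 3, ... on demand instead of building
# the whole sequence up front.
squares <- foreach(i = icount(4), .combine = 'c') %do% i^2

# An iterator can also be advanced by hand with nextElem().
it <- iter(c(10, 20, 30))
first <- nextElem(it)
```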

Parallel Execution Using foreach in R

The main use of foreach is to express operations that can be carried out in parallel. Parallel execution adds overhead, so it makes little sense for tasks that already finish in a minute or less; running many tiny tasks in parallel can actually take longer than running them sequentially.

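When a loop runs on parallel workers, foreach does not require special handling for most variables: those referenced in the loop body need only be defined once in the calling environment, and foreach makes them available to the workers. Functions defined in a package are the exception. The package must be loaded on each worker, which is done by naming it in the .packages argument.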
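The following sketch illustrates both points. It assumes doParallel as the parallel backend, which is only one of several possible choices, and falls back to sequential execution if that package is not installed. The tools package stands in for whatever package a real loop would need, and the variable suffix is invented for the example.

```r
library(foreach)

# Assumption: use doParallel if it is installed, otherwise run the
# same %dopar% loop sequentially via registerDoSEQ().
use_parallel <- requireNamespace("doParallel", quietly = TRUE)
if (use_parallel) {
  cl <- parallel::makeCluster(2)
  doParallel::registerDoParallel(cl)
} else {
  registerDoSEQ()
}

suffix <- ".csv"  # defined once, in the calling environment

res <- foreach(i = 1:3, .combine = 'c', .packages = 'tools') %dopar% {
  # 'suffix' is referenced in the body, so foreach supplies it to the
  # workers; file_path_sans_ext() comes from 'tools', loaded on each
  # worker because of the .packages argument.
  file_path_sans_ext(paste0("part", i, suffix))
}

if (use_parallel) parallel::stopCluster(cl)
```

Switching between sequential and parallel execution only changes the registration step; the loop itself stays the same.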

R's apply function can also be made to execute in parallel. Internally, apply essentially comes down to two for loops, so a parallel version only requires those loops to be rewritten with foreach. An example piece of code is provided and annotated.
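As a rough sketch of the idea, and not the annotated code referred to above, a column-wise apply can be written as an outer foreach over chunks of columns with an inner sequential foreach over the columns in each chunk. The function name par_col_apply and its chunksize default are hypothetical, and registerDoSEQ() is used as a placeholder backend so the example runs anywhere; a parallel backend would be registered in its place.

```r
library(foreach)
library(iterators)

registerDoSEQ()  # placeholder; register a parallel backend here instead

# Hypothetical helper: apply FUN to every column of matrix m.
par_col_apply <- function(m, FUN, chunksize = 2) {
  # Outer loop: each task receives a block of 'chunksize' columns.
  foreach(block = iter(m, by = 'col', chunksize = chunksize),
          .combine = 'c', .packages = 'foreach') %dopar% {
    # Inner loop: process the columns of the block one by one.
    foreach(j = seq_len(ncol(block)), .combine = 'c') %do% FUN(block[, j])
  }
}

m <- matrix(1:12, nrow = 3)
result <- par_col_apply(m, sum)
```

Grouping columns into chunks keeps each task large enough to be worth sending to a worker.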

List Comprehensions Using foreach in R

A foreach loop closely resembles a list comprehension in the Python programming language, especially when it is combined with a filtering condition. It can, for example, filter out the negative values produced by an iterator before they are processed. The same technique can be used to write a sort function. That is an interesting possibility, but it is not recommended over R's built-in sort function.
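A minimal sketch of both uses, relying on the when() filter and the %:% operator from the foreach package. The input vector x is made up, and qsort is written purely for illustration.

```r
library(foreach)

x <- c(4, -1, 9, -16, 25)

# Keep only non-negative values, then take their square roots;
# comparable to [sqrt(a) for a in x if a >= 0] in Python.
roots <- foreach(a = x, .combine = 'c') %:% when(a >= 0) %do% sqrt(a)

# Illustrative quicksort built from two filtered loops.
qsort <- function(x) {
  n <- length(x)
  if (n == 0) {
    x
  } else {
    p <- sample(n, 1)  # pick a random pivot position
    smaller <- foreach(y = x[-p], .combine = c) %:% when(y <= x[p]) %do% y
    larger  <- foreach(y = x[-p], .combine = c) %:% when(y >  x[p]) %do% y
    c(qsort(smaller), x[p], qsort(larger))
  }
}

sorted <- qsort(x)
```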

Conclusion

The foreach package helps programmers at each step of writing parallel code in the R programming language. It addresses the three essential steps: splitting the problem into pieces, executing the pieces in parallel, and combining the results. Some simple code demonstrating each of these steps is provided in this paper, but more complex operations are possible. We will continue improving the package.