Python, C++: Map, Reduce, Filter functions

 

Higher-order-functions

  • A higher-order function is a function that does at least one of the following:
    • takes one or more functions as arguments (i.e. a procedural parameter, which is a parameter of a procedure/function that is itself a procedure/function),
    • returns a function as its result.
  • All other functions are first-order functions.
  • In mathematics higher-order functions are also termed operators or functionals.
  • Examples:
    • The differential operator in calculus is a common example, since it maps a function to its derivative, which is also a function.
    • Map function, found in many functional programming languages, is one example of a higher-order function.
      • It takes as arguments a function f and a collection of elements,
      • and as the result, returns a new collection with f applied to each element from the collection.
    • Sorting functions, which take a comparison function as a parameter, allowing the programmer to separate the sorting algorithm from the comparisons of the items being sorted.
      • The C standard function qsort is an example of higher-order-functions.
    • Filter
    • Fold/Reduce
    • Callback
    • Tree Traversal
  • Python Example:
    >>> def twice(f):
    ...     def result(x):
    ...         return f(f(x))
    ...     return result
    
    >>> plus_three = lambda i: i + 3
    
    >>> g = twice(plus_three)
          
    >>> g(7)
    13
    
  • C++11 Example:
    #include <iostream>
    #include <functional>
    
    auto twice = [](const std::function<int(int)>& f)
    {
        return [&f](int x) {
            return f(f(x));
        };
    };
    
    auto plus_three = [](int i)
    {
        return i + 3;
    };
    
    int main()
    {
        auto g = twice(plus_three);
    
        std::cout << g(7) << '\n'; // 13
    }
    
  • With generic lambdas provided by C++14:
    #include <iostream>
    
    auto twice = [](const auto& f)
    {
        return [&f](int x) {
            return f(f(x));
        };
    };
    
    auto plus_three = [](int i)
    {
        return i + 3;
    };
    
    int main()
    {
        auto g = twice(plus_three);
    
        std::cout << g(7) << '\n'; // 13
    }
    

Map (higher-order-functions)

  • Map is a higher-order function that
    • applies a given function to each element of a functor, e.g. a list,
    • returns a list of results in the same order.
  • It is often called apply-to-all when considered in functional form.
  • Illustration:
    Map functionality

  • Note: This map function should not be confused with similarly-titled abstract data type composed of (key,value) pairs.

    Map operation in C++

    • Transform range: #include <algorithm>
      • Applies an operation/function sequentially to the elements of one or two ranges (~list/vector) and stores the result in the range/list that begins at result (~list/vector).
    • Map with 1 list/iterator:
      • std::transform(begin, end, result, func)
      • Applies func to each of the elements in the range [begin,end) and stores the value returned by each operation in the range that begins at result.
    • Map with 2 lists/iterators:
      • std::transform(begin1, end1, begin2, result, func)
      • Calls func using each of the elements in the range [begin1,end1) as first argument, and the respective argument in the range that begins at begin2 as second argument.
      • The value returned by each call is stored in the range that begins at result.

        Example:

Map operation in Python

  • Map with 1 list/iterator:
    • map(func, list)
  • Map with 2 lists/iterators:
    • map(func, list1, list2)
  • Map with n lists/iterators:
    • map(func, list1, list2, ...)
  • Returns: a list in Python2.
  • Returns: an iterator in Python3.
  • zip() and map() (Python3.x) stops after the shortest list ends.
  • whereas map() (Python2.x) and itertools.zip_longest() (Python3.x) extends the shorter lists with None items.

Built-in map() function

  • It return an iterator that applies function to every item of iterable, yielding the results.
  • If additional iterable arguments are passed, function must take that many arguments and function is applied to the items from all iterables in parallel.
  • With multiple iterables, the iterator stops when the shortest iterable is exhausted.
  • lambda Anonymous Function is usually used with map().
  • It’s not mandatory to use lambda function with map(), we can pass any user defined function also.
    Example:
      >>> List = [4,3,2,1]
      >>> SquaredList = list(map(lambda x: x**2, List))
      >>> 
      >>> print('List:', List, 'SquaredList:', SquaredList)
      ('List:', [4, 3, 2, 1], 'SquaredList:', [16, 9, 4, 1])
      >>> 
      >>> 
      >>> Tuple = 1,2,3,4
      >>> MultiplyListTupleElementally = list(map(lambda x,y: x*y, List, Tuple))
      >>> 
      >>> print('Multiply List & Tuple Elementally: ', MultiplyListTupleElementally, 'List: ', List, 'Tuple: ', Tuple)
      ('Multiply List & Tuple Elementally: ', [4, 6, 6, 4], 'List: ', [4, 3, 2, 1], 'Tuple: ', (1, 2, 3, 4))
    

itertools.starmap(function, iterable)

  • Make an iterator that computes the function using arguments obtained from the iterable.
  • Used instead of map() when argument parameters are already grouped in tuples from a single iterable (the data has been “pre-zipped”).
  • The difference between map() and starmap() parallels the distinction between function(a,b) and function(*c).
  • Roughly equivalent to:

      def starmap(function, iterable):
          # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
          for args in iterable:
              yield function(*args)
    

Parallel Map function: Pool.map()

  • In Python, one can create a pool of processes which will carry out tasks submitted to it with the Pool class.
  • A process pool object which controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and callbacks and has a parallel map implementation.
  • This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks.
  • The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.
  • Note that it may cause high memory usage for very long iterables. Consider using imap() or imap_unordered() with explicit chunksize option for better efficiency.
  • Parallel Processing: Map

Reduce

  • In functional programming, fold (also termed reduce, accumulate, aggregate, compress, or inject) refers to a family of higher-order functions that analyze a recursive data structure and through use of a given combining operation, recombine the results of recursively processing its constituent parts, building up a return value.

    Reduce/Accumulate function in C++

    • std::accumulate: [#include <numeric>]
      • Accumulate values in range
      • std::accumulate (InputIterator first, InputIterator last, T init, BinaryOperation binary_op);
      • Returns the result of accumulating all the values in the range [first,last) to init.
      • init: Initial value for the accumulator.
      • binary_op:
        • Binary operation taking an element of type T as first argument and an element in the range as second, and which returns a value that can be assigned to type T.
        • This can either be a function pointer or a function object.
        • The operation shall not modify the elements passed as its arguments.
        • The default operation is to add the elements up, but a different operation can be specified as binary_op.

          Example

Reduce/Accumulate function in Python

  • The functools module is for higher-order functions: functions that act on or return other functions.
  • functools.reduce(function, iterable[, initializer]):
    • Syntax:
      • functools.reduce(func, list, initval)
      • functools.reduce(lambda x,y: func(y,x), reversed(list), initval)
      • functools.reduce(func, list)
      • functools.reduce(lambda x,y: func(y,x), reversed(list))
    • Apply function of two arguments cumulatively to the items of iterable (~lists/vectors), from left to right, so as to reduce the iterable to a single value.
    • For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5).
    • The left argument, x, is the accumulated value and the right argument, y, is the update value from the iterable.
    • If the optional initializer is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty.
    • If initializer is not given and iterable contains only one item, the first item is returned.
    • functools.reduce() applies same function to items in a sequence.
    • Uses result of 1st operation as 1st parameter input to 2nd operation.
    • Returns an item, not a list.

Example

```python
  >>> from functools import reduce
  >>> List = [4,3,2,1]
  >>> allElementAddition = reduce(lambda x,y: x+y, List)
  >>> print('allElementAddition: ', allElementAddition, 'List: ', List)

  Output:
  >>> 'allElementAddition: ', 10, 'List: ', [4, 3, 2, 1]
```

Filter

  • In functional programming, filter is a higher-order function that processes a data structure (usually a list) in some order to produce a new data structure containing exactly those elements of the original data structure for which a given predicate (~filter operation) returns the boolean value true.
  • Illustration:
    filter-steps

    Filter operation in C++

    std::remove_copy_if

    • #include <algorithm>
    • std::remove_copy_if(begin, end, result, prednot)
    • Copy range removing values.
    • Copies the elements in the range [begin,end) to the range beginning at result, except those elements for which pred returns true.
    • The resulting range is shorter than [first,last) by as many elements as matches, which are “removed”.
    • pred:
      • Unary function that accepts an element in the range as argument, and returns a value convertible to bool.
      • The value returned indicates whether the element is to be removed from the copy (if true, it is not copied).
      • The function shall not modify its argument.
      • This can either be a function pointer or a function object.
    • result:
      • Output iterator to the initial position of the range where the resulting sequence is stored.
Example:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// remove_copy_if example
#include <iostream>     // std::cout
#include <algorithm>    // std::remove_copy_if
#include <vector>       // std::vector

bool IsOdd (int i) { return ((i%2)==1); }

int main () {
  int myints[] = {1,2,3,4,5,6,7,8,9};
  std::vector<int> myvector (9);

  std::remove_copy_if (myints,myints+9,myvector.begin(),IsOdd);

  std::cout << "myvector contains:";
  for (std::vector<int>::iterator it=myvector.begin(); it!=myvector.end(); ++it)
    std::cout << ' ' << *it;
  std::cout << '\n';

  return 0;
}


Output:
  administrator@PTL011669:std_remove_copy_if$ ./std_remove_copy_if 
  myvector contains: 2 4 6 8 0 0 0 0 0
  administrator@PTL011669:std_remove_copy_if$ 

std::replace_copy_if

  • #include <algorithm>
  • OutputIterator replace_copy_if (InputIterator first, InputIterator last, OutputIterator result, UnaryPredicate pred, const T& new_value);
  • Copy range replacing value.
  • Copies the elements in the range [first,last) to the range beginning at result, replacing those for which pred returns true by new_value.
Example:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
// replace_copy_if example
#include <iostream>     // std::cout
#include <algorithm>    // std::replace_copy_if
#include <vector>       // std::vector

bool IsOdd (int i) { return ((i%2)==1); }

int main () {
  std::vector<int> foo,bar;

  // set some values:
  for (int i=1; i<10; i++) foo.push_back(i);          // 1 2 3 4 5 6 7 8 9

  bar.resize(foo.size());   // allocate space
  std::replace_copy_if (foo.begin(), foo.end(), bar.begin(), IsOdd, 0);
                                                        // 0 2 0 4 0 6 0 8 0

  std::cout << "bar contains:";
  for (std::vector<int>::iterator it=bar.begin(); it!=bar.end(); ++it)
    std::cout << ' ' << *it;
  std::cout << '\n';

  return 0;
}


Output:
  administrator@PTL011669:std_remove_copy_if$ ./std_remove_copy_if 
  myvector contains: 2 4 6 8 0 0 0 0 0
  administrator@PTL011669:std_remove_copy_if$ 

std::copy_if

  • #include <algorithm>
  • std::copy_if(begin, end, result, pred) [C++11]
  • Copy certain elements of range.
  • Copies the elements in the range [begin,end) for which pred returns true to the range beginning at result.
  • pred
    • Unary function that accepts an element in the range as argument, and returns a value convertible to bool.
    • The value returned indicates whether the element is to be copied (if true, it is copied).
    • The function shall not modify any of its arguments.
    • This can either be a function pointer or a function object.
Example:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// copy_if example
#include <iostream>     // std::cout
#include <algorithm>    // std::copy_if, std::distance
#include <vector>       // std::vector

int main () {
  std::vector<int> foo = {25,15,5,-5,-15};
  std::vector<int> bar (foo.size());

  // copy only positive numbers:
  auto it = std::copy_if (foo.begin(), foo.end(), bar.begin(), [](int i){return !(i<0);} );
  bar.resize(std::distance(bar.begin(),it));  // shrink container to new size

  std::cout << "bar contains:";
  for (int& x: bar) std::cout << ' ' << x;
  std::cout << '\n';

  return 0;
}

Filter function in Python

  • Syntax:
    • filter(<function>, iterable)
  • Construct an iterator from those elements of iterable for which function returns true.
  • Iterable may be either a sequence, a container which supports iteration, or an iterator.
  • If function is ‘None’, the identity function is assumed, that is, all elements of iterable that are false are removed.
  • See itertools.filterfalse() for the complementary function that returns elements of iterable for which function returns false.

Example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
>>> 
>>> 
>>> List = [0, 1,2,3,4,5]
>>> filteredList = list(filter(lambda x: x>2, List))
>>> print('Filtered List(>2): ', filteredList, 'List: ', List)
('Filtered List(>2): ', [3, 4, 5], 'List: ', [0, 1, 2, 3, 4, 5])
>>> 
>>> 
>>> print('--- <function> is None ----')
--- <function> is None ----
>>> defaultFilterList = list(filter(None, List))
>>> print('defaultFilterList: ', defaultFilterList, 'List: ', List)
('defaultFilterList: ', [1, 2, 3, 4, 5], 'List: ', [0, 1, 2, 3, 4, 5])
>>> 
>>> 
>>> numberList = [-4,-3,-2,0,1,3,5,7]
>>> filterNegativeNumbersLambdaFunc = lambda x: x>0
>>> positiveList = list(filter(filterNegativeNumbersLambdaFunc, numberList))
>>> print('Negative Numbers Filtered: ', positiveList, 'numberList: ', numberList)
('Negative Numbers Filtered: ', [1, 3, 5, 7], 'numberList: ', [-4, -3, -2, 0, 1, 3, 5, 7])
>>> 
>>> 
>>> numberList = [-4,-3,-2,0,1,3,5,7]
>>> def filterNegativeNumbersFunc(number):
...     newList = []
...     if number>0:
...         newList.append(number)
...     return newList
... 
>>> 
>>> positiveList = list(filter(filterNegativeNumbersFunc, numberList))
>>> print('Negative Numbers Filtered: ', positiveList, 'numberList: ', numberList)
('Negative Numbers Filtered: ', [1, 3, 5, 7], 'numberList: ', [-4, -3, -2, 0, 1, 3, 5, 7])
>>> 
>>> 
References: