API Documentation

datasets

This module contains the DataSample, MultiModalArray, MultiModalSparseArray, MultiModalSparseInfo and MultiModalData classes. The DataSample class encapsulates a sample's components (the nbL and nbEx numbers). The MultiModalArray class inherits from numpy.ndarray and holds a 2-D data array with the shape (n_samples, n_view_i * n_features_i).

0

1

2

3

xxxxxxxx

xxxx

xxxx

xxxx

xxxxxxxx

xxxx

xxxx

xxxx

xxxxxxxx

xxxx

xxxx

xxxx

xxxxxxxx

xxxx

xxxx

xxxx

xxxxxxxx

xxxx

xxxx

xxxx

xxxxxxxx

xxxx

xxxx

xxxx

xxxxxxxx

xxxx

xxxx

xxxx

xxxxxxxx

xxxx

xxxx

xxxx

xxxxxxxx

xxxx

xxxx

xxxx

MultiModalSparseArray inherits from a scipy sparse matrix, with the shape (n_samples, n_view_i * n_features_i).

class multimodal.datasets.data_sample.DataSample(data=None, **kwargs)

A DataSample instance

Example

>>> from multimodal.datasets.base import load_dict
>>> from multimodal.tests.datasets.get_dataset_path import get_dataset_path
>>> from multimodal.datasets.data_sample import DataSample
>>> file = 'input_x_dic.pkl'
>>> data = load_dict(get_dataset_path(file))
>>> print(data.__class__)
<class 'dict'>
>>> s = DataSample(data)
>>> type(s.data)
<class 'multimodal.datasets.data_sample.MultiModalArray'>

Parameters
data : dict
kwargs : other arguments
Attributes
data : {array like} MultiModalArray

MultiModalArray

clear() -> None. Remove all items from D.
copy() -> a shallow copy of D
property data

MultiModalArray

fromkeys(iterable, value=None, /)

Create a new dictionary with keys from iterable and values set to value.

get(key, default=None, /)

Return the value for key if key is in the dictionary, else default.

items() -> a set-like object providing a view on D's items
keys() -> a set-like object providing a view on D's keys
pop(key, default=<unrepresentable>, /)

If the key is not found, return the default if given; otherwise, raise a KeyError.

popitem(/)

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

setdefault(key, default=None, /)

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.
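The pop, popitem and setdefault entries above are the standard dict behaviours. A minimal sketch follows, using a plain dict as a stand-in; the keys and values are hypothetical, and it is only assumed that DataSample does not override these methods.

```python
# A plain dict stands in for the mapping; keys are hypothetical view indices.
d = {0: "view-a", 1: "view-b", 2: "view-c"}

missing = d.pop(5, None)        # key absent: the default is returned, no KeyError
last = d.popitem()              # LIFO: the most recently inserted pair comes out
kept = d.setdefault(0, "new")   # key present: the existing value is returned
added = d.setdefault(7, "new")  # key absent: inserted with the default value

print(missing, last, kept, added)
print(d)
```

Calling d.pop(5) without a default would raise KeyError instead.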

update([E, ]**F) -> None. Update D from dict/iterable E and F.

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].
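The three update cases can be checked with a plain dict. This is only an illustration of the built-in dict semantics described above, with made-up keys.

```python
d = {"a": 1}

d.update({"b": 2})                 # E has .keys(): D[k] = E[k] for each key
d.update([("c", 3), ("a", 10)])    # E lacks .keys(): iterated as (key, value) pairs
d.update({"b": 20}, b=200, e=5)    # keyword arguments F are applied after E

print(d)
```

Because F is applied last, the keyword value for "b" wins over the one supplied in E.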

values() -> an object providing a view on D's values
class multimodal.datasets.data_sample.MultiModalArray(data, views_ind=None)

MultiModalArray inherits from numpy.ndarray.

Parameters
data : can be
  • a dictionary of multi-view arrays, one per view, each with shape = (n_samples, n_features):

    {0: array([[]]),

    1: array([[]]), …}

  • a numpy array-like of per-view arrays, each with shape = (n_samples, n_features):

    [[[…]],

    [[…]], …]

  • an {array like} with shape (n_samples, n_views * n_features), with ‘views_ind’ different from ‘None’, for multi-view input samples.

views_ind : array-like (default = None)

Parameter specifying how to extract the data views from X. If None, [0, n_features//2, n_features] is constructed (2 views).

  • views_ind is a 1-D array of sorted integers; the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]].
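The slicing rule for views_ind can be sketched with plain NumPy. This does not call MultiModalArray itself; the two-view input and the helper get_view are hypothetical names used only to illustrate how the limits map to column slices.

```python
import numpy as np

# Hypothetical two-view input: 5 samples, 3 features in view 0 and 2 in view 1.
views = {0: np.arange(15).reshape(5, 3), 1: np.arange(10).reshape(5, 2)}

# Concatenate the views column-wise into one (n_samples, total_features) array.
X = np.hstack([views[k] for k in sorted(views)])

# Slice limits: cumulative feature counts, starting at 0.
widths = [views[k].shape[1] for k in sorted(views)]
views_ind = np.concatenate(([0], np.cumsum(widths)))

def get_view(X, views_ind, n):
    """Return view n as X[:, views_ind[n]:views_ind[n + 1]]."""
    return X[:, views_ind[n]:views_ind[n + 1]]

# Default limits described above when views_ind is None: two equal-ish halves.
n_features = X.shape[1]
default_ind = [0, n_features // 2, n_features]

print(views_ind.tolist(), default_ind)
print(get_view(X, views_ind, 1))
```

With explicit limits the original views are recovered exactly, whereas the default split only matches the true views when there are two views of equal width.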

Attributes
views_ind : list of the views’ indices (may be None)
n_views : int, number of views
shapes_int : list of int, number of features for each view
Example
>>> from multimodal.datasets.base import load_dict
>>> from multimodal.tests.datasets.get_dataset_path import get_dataset_path
>>> from multimodal.datasets.data_sample import MultiModalArray
>>> file = 'input_x_dic.pkl'
>>> data = load_dict(get_dataset_path(file))
>>> print(data.__class__)
<class 'dict'>
>>> multiviews = MultiModalArray(data)
>>> multiviews.shape
(120, 240)
>>> multiviews.shapes_int
[120, 120]
>>> multiviews.n_views
2
T

The transposed array.

Same as self.transpose().

See also

transpose

Examples

>>> x = np.array([[1.,2.],[3.,4.]])
>>> x
array([[ 1.,  2.],
       [ 3.,  4.]])
>>> x.T
array([[ 1.,  3.],
       [ 2.,  4.]])
>>> x = np.array([1.,2.,3.,4.])
>>> x
array([ 1.,  2.,  3.,  4.])
>>> x.T
array([ 1.,  2.,  3.,  4.])
all(axis=None, out=None, keepdims=False, *, where=True)

Returns True if all elements evaluate to True.

Refer to numpy.all for full documentation.

See also

numpy.all

equivalent function

any(axis=None, out=None, keepdims=False, *, where=True)

Returns True if any of the elements of a evaluate to True.

Refer to numpy.any for full documentation.

See also

numpy.any

equivalent function
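A short sketch of how the axis and keepdims arguments change the result of all and any, using a small made-up boolean array:

```python
import numpy as np

a = np.array([[True, False],
              [True, True]])

all_total = a.all()                       # one bool over every element
all_cols = a.all(axis=0)                  # one bool per column
any_rows = a.any(axis=1, keepdims=True)   # one bool per row, kept as a column

print(all_total, all_cols, any_rows.shape)
```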

argmax(axis=None, out=None)

Return indices of the maximum values along the given axis.

Refer to numpy.argmax for full documentation.

See also

numpy.argmax

equivalent function

argmin(axis=None, out=None)

Return indices of the minimum values along the given axis.

Refer to numpy.argmin for detailed documentation.

See also

numpy.argmin

equivalent function

argpartition(kth, axis=-1, kind='introselect', order=None)

Returns the indices that would partition this array.

Refer to numpy.argpartition for full documentation.

New in version 1.8.0.

See also

numpy.argpartition

equivalent function

argsort(axis=-1, kind=None, order=None)

Returns the indices that would sort this array.

Refer to numpy.argsort for full documentation.

See also

numpy.argsort

equivalent function
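The difference between argsort and argpartition may be easier to see on a small array. The values below are arbitrary; the sketch only relies on the guarantee that argpartition puts the k-th element in its sorted position.

```python
import numpy as np

a = np.array([30, 10, 40, 20])

order = a.argsort()           # indices that sort the whole array
sorted_a = a[order]

k = 1
part = a.argpartition(k)      # only position k is guaranteed to be in place
kth_value = a[part[k]]        # the k-th smallest value

print(order, sorted_a, kth_value)
```

The order of the other indices returned by argpartition is undefined, so only part[k] should be relied on.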

astype(dtype, order='K', casting='unsafe', subok=True, copy=True)

Copy of the array, cast to a specified type.

Parameters
dtype : str or dtype

Typecode or data-type to which the array is cast.

order : {‘C’, ‘F’, ‘A’, ‘K’}, optional

Controls the memory layout order of the result. ‘C’ means C order, ‘F’ means Fortran order, ‘A’ means ‘F’ order if all the arrays are Fortran contiguous, ‘C’ order otherwise, and ‘K’ means as close to the order the array elements appear in memory as possible. Default is ‘K’.

casting : {‘no’, ‘equiv’, ‘safe’, ‘same_kind’, ‘unsafe’}, optional

Controls what kind of data casting may occur. Defaults to ‘unsafe’ for backwards compatibility.

  • ‘no’ means the data types should not be cast at all.

  • ‘equiv’ means only byte-order changes are allowed.

  • ‘safe’ means only casts which can preserve values are allowed.

  • ‘same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed.

  • ‘unsafe’ means any data conversions may be done.

subok : bool, optional

If True, then sub-classes will be passed-through (default), otherwise the returned array will be forced to be a base-class array.

copy : bool, optional

By default, astype always returns a newly allocated array. If this is set to False, and the dtype, order, and subok requirements are satisfied, the input array is returned instead of a copy.

Returns
arr_t : ndarray

Unless copy is False and the other conditions for returning the input array are satisfied (see description for copy input parameter), arr_t is a new array of the same shape as the input array, with dtype, order given by dtype, order.

Raises
ComplexWarning

When casting from complex to float or int. To avoid this, one should use a.real.astype(t).

Notes

Changed in version 1.17.0: Casting between a simple data type and a structured one is possible only for “unsafe” casting. Casting to multiple fields is allowed, but casting from multiple fields is not.

Changed in version 1.9.0: Casting from numeric to string types in ‘safe’ casting mode requires that the string dtype length is long enough to store the max integer/float value converted.

Examples

>>> x = np.array([1, 2, 2.5])
>>> x
array([1. ,  2. ,  2.5])
>>> x.astype(int)
array([1, 2, 2])
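The casting and copy parameters can be illustrated with a minimal sketch; the array values are arbitrary.

```python
import numpy as np

x = np.array([1.0, 2.0, 2.5])

# 'safe' casting refuses float -> int because values could be lost.
try:
    x.astype(np.int64, casting="safe")
    safe_cast_failed = False
except TypeError:
    safe_cast_failed = True

# With copy=False and an unchanged dtype, the input array itself is returned.
same = x.astype(x.dtype, copy=False)
fresh = x.astype(x.dtype)      # default copy=True always allocates a new array

print(safe_cast_failed, same is x, fresh is x)
```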
base

Base object if memory is from some other object.

Examples

The base of an array that owns its memory is None:

>>> x = np.array([1,2,3,4])
>>> x.base is None
True

Slicing creates a view, whose memory is shared with x:

>>> y = x[2:]
>>> y.base is x
True
byteswap(inplace=False)

Swap the bytes of the array elements.

Toggle between little-endian and big-endian data representation by returning a byteswapped array, optionally swapped in-place. Arrays of byte-strings are not swapped. The real and imaginary parts of a complex number are swapped individually.

Parameters
inplace : bool, optional

If True, swap bytes in-place, default is False.

Returns
out : ndarray

The byteswapped array. If inplace is True, this is a view to self.

Examples

>>> A = np.array([1, 256, 8755], dtype=np.int16)
>>> list(map(hex, A))
['0x1', '0x100', '0x2233']
>>> A.byteswap(inplace=True)
array([  256,     1, 13090], dtype=int16)
>>> list(map(hex, A))
['0x100', '0x1', '0x3322']

Arrays of byte-strings are not swapped

>>> A = np.array([b'ceg', b'fac'])
>>> A.byteswap()
array([b'ceg', b'fac'], dtype='|S3')
A.newbyteorder().byteswap() produces an array with the same values but a different representation in memory:

>>> A = np.array([1, 2, 3])
>>> A.view(np.uint8)
array([1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0,
       0, 0], dtype=uint8)
>>> A.newbyteorder().byteswap(inplace=True)
array([1, 2, 3])
>>> A.view(np.uint8)
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0,
       0, 3], dtype=uint8)
choose(choices, out=None, mode='raise')

Use an index array to construct a new array from a set of choices.

Refer to numpy.choose for full documentation.

See also

numpy.choose

equivalent function

clip(min=None, max=None, out=None, **kwargs)

Return an array whose values are limited to [min, max]. One of max or min must be given.

Refer to numpy.clip for full documentation.

See also

numpy.clip

equivalent function
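As a quick illustration of the rule that at least one of min or max must be given, the sketch below clips an arbitrary array on both sides and on one side only.

```python
import numpy as np

a = np.array([-5, 0, 5, 10])

both = a.clip(0, 6)          # values limited to [0, 6]
lower_only = a.clip(min=0)   # only a lower bound
upper_only = a.clip(max=4)   # only an upper bound

print(both, lower_only, upper_only)
```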

compress(condition, axis=None, out=None)

Return selected slices of this array along given axis.

Refer to numpy.compress for full documentation.

See also

numpy.compress

equivalent function

conj()

Complex-conjugate all elements.

Refer to numpy.conjugate for full documentation.

See also

numpy.conjugate

equivalent function

conjugate()

Return the complex conjugate, element-wise.

Refer to numpy.conjugate for full documentation.

See also

numpy.conjugate

equivalent function

copy(order='C')

Return a copy of the array.

Parameters
order : {‘C’, ‘F’, ‘A’, ‘K’}, optional

Controls the memory layout of the copy. ‘C’ means C-order, ‘F’ means F-order, ‘A’ means ‘F’ if a is Fortran contiguous, ‘C’ otherwise. ‘K’ means match the layout of a as closely as possible. (Note that this function and numpy.copy() are very similar but have different default values for their order= arguments, and this function always passes sub-classes through.)

See also

numpy.copy

Similar function with different default behavior

numpy.copyto

Notes

This function is the preferred method for creating an array copy. The function numpy.copy() is similar, but it defaults to using order ‘K’, and will not pass sub-classes through by default.

Examples

>>> x = np.array([[1,2,3],[4,5,6]], order='F')
>>> y = x.copy()
>>> x.fill(0)
>>> x
array([[0, 0, 0],
       [0, 0, 0]])
>>> y
array([[1, 2, 3],
       [4, 5, 6]])
>>> y.flags['C_CONTIGUOUS']
True
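The different default order of the method and the free function, mentioned in the Notes above, can be seen by copying a Fortran-ordered array both ways:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]], order="F")

method_copy = x.copy()        # ndarray.copy defaults to order='C'
function_copy = np.copy(x)    # numpy.copy defaults to order='K' (keeps the layout)

print(method_copy.flags["C_CONTIGUOUS"], function_copy.flags["F_CONTIGUOUS"])
```

Both copies hold the same values; only the memory layout differs.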
ctypes

An object to simplify the interaction of the array with the ctypes module.

This attribute creates an object that makes it easier to use arrays when calling shared libraries with the ctypes module. The returned object has, among others, data, shape, and strides attributes (see Notes below) which themselves return ctypes objects that can be used as arguments to a shared library.

Parameters
None
Returns
c : Python object

Possessing attributes data, shape, strides, etc.

See also

numpy.ctypeslib

Notes

Below are the public attributes of this object which were documented in “Guide to NumPy” (we have omitted undocumented public attributes, as well as documented private attributes):

_ctypes.data

A pointer to the memory area of the array as a Python integer. This memory area may contain data that is not aligned, or not in correct byte-order. The memory area may not even be writeable. The array flags and data-type of this array should be respected when passing this attribute to arbitrary C-code to avoid trouble that can include Python crashing. User Beware! The value of this attribute is exactly the same as self.__array_interface__['data'][0].

Note that unlike data_as, a reference will not be kept to the array: code like ctypes.c_void_p((a + b).ctypes.data) will result in a pointer to a deallocated array, and should be spelt (a + b).ctypes.data_as(ctypes.c_void_p)

_ctypes.shape

A ctypes array of length self.ndim where the basetype is the C-integer corresponding to dtype('p') on this platform. This base-type could be ctypes.c_int, ctypes.c_long, or ctypes.c_longlong depending on the platform. The c_intp type is defined accordingly in numpy.ctypeslib. The ctypes array contains the shape of the underlying array.

Type

(c_intp*self.ndim)

_ctypes.strides

A ctypes array of length self.ndim where the basetype is the same as for the shape attribute. This ctypes array contains the strides information from the underlying array. This strides information is important for showing how many bytes must be jumped to get to the next element in the array.

Type

(c_intp*self.ndim)

_ctypes.data_as(obj)

Return the data pointer cast to a particular c-types object. For example, calling self._as_parameter_ is equivalent to self.data_as(ctypes.c_void_p). Perhaps you want to use the data as a pointer to a ctypes array of floating-point data: self.data_as(ctypes.POINTER(ctypes.c_double)).

The returned pointer will keep a reference to the array.

_ctypes.shape_as(obj)

Return the shape tuple as an array of some other c-types type. For example: self.shape_as(ctypes.c_short).

_ctypes.strides_as(obj)

Return the strides tuple as an array of some other c-types type. For example: self.strides_as(ctypes.c_longlong).

If the ctypes module is not available, then the ctypes attribute of array objects still returns something useful, but ctypes objects are not returned and errors may be raised instead. In particular, the object will still have the _as_parameter_ attribute which will return an integer equal to the data attribute.

Examples

>>> import ctypes
>>> x = np.array([[0, 1], [2, 3]], dtype=np.int32)
>>> x
array([[0, 1],
       [2, 3]], dtype=int32)
>>> x.ctypes.data
31962608 # may vary
>>> x.ctypes.data_as(ctypes.POINTER(ctypes.c_uint32))
<__main__.LP_c_uint object at 0x7ff2fc1fc200> # may vary
>>> x.ctypes.data_as(ctypes.POINTER(ctypes.c_uint32)).contents
c_uint(0)
>>> x.ctypes.data_as(ctypes.POINTER(ctypes.c_uint64)).contents
c_ulong(4294967296)
>>> x.ctypes.shape
<numpy.core._internal.c_long_Array_2 object at 0x7ff2fc1fce60> # may vary
>>> x.ctypes.strides
<numpy.core._internal.c_long_Array_2 object at 0x7ff2fc1ff320> # may vary
cumprod(axis=None, dtype=None, out=None)

Return the cumulative product of the elements along the given axis.

Refer to numpy.cumprod for full documentation.

See also

numpy.cumprod

equivalent function

cumsum(axis=None, dtype=None, out=None)

Return the cumulative sum of the elements along the given axis.

Refer to numpy.cumsum for full documentation.

See also

numpy.cumsum

equivalent function
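A small sketch of how the axis argument affects cumsum and cumprod; with axis=None the array is flattened first.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

flat_sum = a.cumsum()          # running sum over the flattened array
row_sum = a.cumsum(axis=1)     # running sum along each row
col_prod = a.cumprod(axis=0)   # running product down each column

print(flat_sum)
print(row_sum)
print(col_prod)
```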

data

Python buffer object pointing to the start of the array’s data.

diagonal(offset=0, axis1=0, axis2=1)

Return specified diagonals. In NumPy 1.9 the returned array is a read-only view instead of a copy as in previous NumPy versions. In a future version the read-only restriction will be removed.

Refer to numpy.diagonal() for full documentation.

See also

numpy.diagonal

equivalent function

dot(b, out=None)

Dot product of two arrays.

Refer to numpy.dot for full documentation.

See also

numpy.dot

equivalent function

Examples

>>> a = np.eye(2)
>>> b = np.ones((2, 2)) * 2
>>> a.dot(b)
array([[2.,  2.],
       [2.,  2.]])

This array method can be conveniently chained:

>>> a.dot(b).dot(b)
array([[8.,  8.],
       [8.,  8.]])
dtype

Data-type of the array’s elements.

Parameters
None
Returns
d : numpy dtype object

See also

numpy.dtype

Examples

>>> x
array([[0, 1],
       [2, 3]])
>>> x.dtype
dtype('int32')
>>> type(x.dtype)
<type 'numpy.dtype'>
dump(file)

Dump a pickle of the array to the specified file. The array can be read back with pickle.load or numpy.load.

Parameters
file : str or Path

A string naming the dump file.

Changed in version 1.17.0: pathlib.Path objects are now accepted.

dumps()

Returns the pickle of the array as a string. pickle.loads or numpy.loads will convert the string back to an array.

Parameters
None
fill(value)

Fill the array with a scalar value.

Parameters
value : scalar

All elements of a will be assigned this value.

Examples

>>> a = np.array([1, 2])
>>> a.fill(0)
>>> a
array([0, 0])
>>> a = np.empty(2)
>>> a.fill(1)
>>> a
array([1.,  1.])
flags

Information about the memory layout of the array.

Notes

The flags object can be accessed dictionary-like (as in a.flags['WRITEABLE']), or by using lowercased attribute names (as in a.flags.writeable). Short flag names are only supported in dictionary access.

Only the WRITEBACKIFCOPY, UPDATEIFCOPY, WRITEABLE, and ALIGNED flags can be changed by the user, via direct assignment to the attribute or dictionary entry, or by calling ndarray.setflags.

The array flags cannot be set arbitrarily:

  • UPDATEIFCOPY can only be set False.

  • WRITEBACKIFCOPY can only be set False.

  • ALIGNED can only be set True if the data is truly aligned.

  • WRITEABLE can only be set True if the array owns its own memory or the ultimate owner of the memory exposes a writeable buffer interface or is a string.

Arrays can be both C-style and Fortran-style contiguous simultaneously. This is clear for 1-dimensional arrays, but can also be true for higher dimensional arrays.

Even for contiguous arrays a stride for a given dimension arr.strides[dim] may be arbitrary if arr.shape[dim] == 1 or the array has no elements. It does not generally hold that self.strides[-1] == self.itemsize for C-style contiguous arrays or self.strides[0] == self.itemsize for Fortran-style contiguous arrays is true.

Attributes
C_CONTIGUOUS (C)

The data is in a single, C-style contiguous segment.

F_CONTIGUOUS (F)

The data is in a single, Fortran-style contiguous segment.

OWNDATA (O)

The array owns the memory it uses or borrows it from another object.

WRITEABLE (W)

The data area can be written to. Setting this to False locks the data, making it read-only. A view (slice, etc.) inherits WRITEABLE from its base array at creation time, but a view of a writeable array may be subsequently locked while the base array remains writeable. (The opposite is not true, in that a view of a locked array may not be made writeable. However, currently, locking a base object does not lock any views that already reference it, so under that circumstance it is possible to alter the contents of a locked array via a previously created writeable view onto it.) Attempting to change a non-writeable array raises a ValueError exception.

ALIGNED (A)

The data and all elements are aligned appropriately for the hardware.

WRITEBACKIFCOPY (X)

This array is a copy of some other array. The C-API function PyArray_ResolveWritebackIfCopy must be called before deallocating so that the base array will be updated with the contents of this array.

UPDATEIFCOPY (U)

(Deprecated, use WRITEBACKIFCOPY) This array is a copy of some other array. When this array is deallocated, the base array will be updated with the contents of this array.

FNC

F_CONTIGUOUS and not C_CONTIGUOUS.

FORC

F_CONTIGUOUS or C_CONTIGUOUS (one-segment test).

BEHAVED (B)

ALIGNED and WRITEABLE.

CARRAY (CA)

BEHAVED and C_CONTIGUOUS.

FARRAY (FA)

BEHAVED and F_CONTIGUOUS and not C_CONTIGUOUS.
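A few of these flags can be inspected on small arrays. The sketch below uses arbitrary shapes, and it assumes a recent NumPy release, where writing to a locked array surfaces as ValueError.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
strided = a[:, ::2]            # every other column: a view with gaps in memory

a_is_c = a.flags["C_CONTIGUOUS"]
t_is_f = a.T.flags["F_CONTIGUOUS"]
strided_is_c = strided.flags["C_CONTIGUOUS"]
strided_owns = strided.flags["OWNDATA"]

# A 1-D array is both C-style and Fortran-style contiguous.
one_d = np.arange(4)
both = one_d.flags["C_CONTIGUOUS"] and one_d.flags["F_CONTIGUOUS"]

# Locking the array makes in-place writes fail.
a.flags.writeable = False
try:
    a[0, 0] = 99
    write_blocked = False
except ValueError:
    write_blocked = True

print(a_is_c, t_is_f, strided_is_c, strided_owns, both, write_blocked)
```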

flat

A 1-D iterator over the array.

This is a numpy.flatiter instance, which acts similarly to, but is not a subclass of, Python’s built-in iterator object.

See also

flatten

Return a copy of the array collapsed into one dimension.

flatiter

Examples

>>> x = np.arange(1, 7).reshape(2, 3)
>>> x
array([[1, 2, 3],
       [4, 5, 6]])
>>> x.flat[3]
4
>>> x.T
array([[1, 4],
       [2, 5],
       [3, 6]])
>>> x.T.flat[3]
5
>>> type(x.flat)
<class 'numpy.flatiter'>

An assignment example:

>>> x.flat = 3; x
array([[3, 3, 3],
       [3, 3, 3]])
>>> x.flat[[1,4]] = 1; x
array([[3, 1, 3],
       [3, 1, 3]])
flatten(order='C')

Return a copy of the array collapsed into one dimension.

Parameters
order : {‘C’, ‘F’, ‘A’, ‘K’}, optional

‘C’ means to flatten in row-major (C-style) order. ‘F’ means to flatten in column-major (Fortran- style) order. ‘A’ means to flatten in column-major order if a is Fortran contiguous in memory, row-major order otherwise. ‘K’ means to flatten a in the order the elements occur in memory. The default is ‘C’.

Returns
y : ndarray

A copy of the input array, flattened to one dimension.

See also

ravel

Return a flattened array.

flat

A 1-D flat iterator over the array.

Examples

>>> a = np.array([[1,2], [3,4]])
>>> a.flatten()
array([1, 2, 3, 4])
>>> a.flatten('F')
array([1, 3, 2, 4])
getfield(dtype, offset=0)

Returns a field of the given array as a certain type.

A field is a view of the array data with a given data-type. The values in the view are determined by the given type and the offset into the current array in bytes. The offset needs to be such that the view dtype fits in the array dtype; for example an array of dtype complex128 has 16-byte elements. If taking a view with a 32-bit integer (4 bytes), the offset needs to be between 0 and 12 bytes.

Parameters
dtype : str or dtype

The data type of the view. The dtype size of the view can not be larger than that of the array itself.

offset : int

Number of bytes to skip before beginning the element view.

Examples

>>> x = np.diag([1.+1.j]*2)
>>> x[1, 1] = 2 + 4.j
>>> x
array([[1.+1.j,  0.+0.j],
       [0.+0.j,  2.+4.j]])
>>> x.getfield(np.float64)
array([[1.,  0.],
       [0.,  2.]])

By choosing an offset of 8 bytes we can select the imaginary part of the array for our view:

>>> x.getfield(np.float64, offset=8)
array([[1.,  0.],
       [0.,  4.]])
imag

The imaginary part of the array.

Examples

>>> x = np.sqrt([1+0j, 0+1j])
>>> x.imag
array([ 0.        ,  0.70710678])
>>> x.imag.dtype
dtype('float64')
item(*args)

Copy an element of an array to a standard Python scalar and return it.

Parameters
*args : Arguments (variable number and type)
  • none: in this case, the method only works for arrays with one element (a.size == 1), which element is copied into a standard Python scalar object and returned.

  • int_type: this argument is interpreted as a flat index into the array, specifying which element to copy and return.

  • tuple of int_types: functions as does a single int_type argument, except that the argument is interpreted as an nd-index into the array.

Returns
z : Standard Python scalar object

A copy of the specified element of the array as a suitable Python scalar

Notes

When the data type of a is longdouble or clongdouble, item() returns a scalar array object because there is no available Python scalar that would not lose information. Void arrays return a buffer object for item(), unless fields are defined, in which case a tuple is returned.

item is very similar to a[args], except, instead of an array scalar, a standard Python scalar is returned. This can be useful for speeding up access to elements of the array and doing arithmetic on elements of the array using Python’s optimized math.

Examples

>>> np.random.seed(123)
>>> x = np.random.randint(9, size=(3, 3))
>>> x
array([[2, 2, 6],
       [1, 3, 6],
       [1, 0, 1]])
>>> x.item(3)
1
>>> x.item(7)
0
>>> x.item((0, 1))
2
>>> x.item((2, 2))
1
itemset(*args)

Insert scalar into an array (scalar is cast to array’s dtype, if possible)

There must be at least 1 argument, and define the last argument as item. Then, a.itemset(*args) is equivalent to but faster than a[args] = item. The item should be a scalar value and args must select a single item in the array a.

Parameters
*args : Arguments

If one argument: a scalar, only used in case a is of size 1. If two arguments: the last argument is the value to be set and must be a scalar, the first argument specifies a single array element location. It is either an int or a tuple.

Notes

Compared to indexing syntax, itemset provides some speed increase for placing a scalar into a particular location in an ndarray, if you must do this. However, generally this is discouraged: among other problems, it complicates the appearance of the code. Also, when using itemset (and item) inside a loop, be sure to assign the methods to a local variable to avoid the attribute look-up at each loop iteration.

Examples

>>> np.random.seed(123)
>>> x = np.random.randint(9, size=(3, 3))
>>> x
array([[2, 2, 6],
       [1, 3, 6],
       [1, 0, 1]])
>>> x.itemset(4, 0)
>>> x.itemset((2, 2), 9)
>>> x
array([[2, 2, 6],
       [1, 0, 6],
       [1, 0, 9]])
itemsize

Length of one array element in bytes.

Examples

>>> x = np.array([1,2,3], dtype=np.float64)
>>> x.itemsize
8
>>> x = np.array([1,2,3], dtype=np.complex128)
>>> x.itemsize
16
max(axis=None, out=None, keepdims=False, initial=<no value>, where=True)

Return the maximum along a given axis.

Refer to numpy.amax for full documentation.

See also

numpy.amax

equivalent function

mean(axis=None, dtype=None, out=None, keepdims=False, *, where=True)

Returns the average of the array elements along given axis.

Refer to numpy.mean for full documentation.

See also

numpy.mean

equivalent function

min(axis=None, out=None, keepdims=False, initial=<no value>, where=True)

Return the minimum along a given axis.

Refer to numpy.amin for full documentation.

See also

numpy.amin

equivalent function
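The axis and keepdims arguments shared by max, mean and min can be illustrated on a small made-up array:

```python
import numpy as np

a = np.array([[1.0, 5.0],
              [3.0, 7.0]])

col_max = a.max(axis=0)                    # maximum of each column
row_mean = a.mean(axis=1, keepdims=True)   # mean of each row, kept as a column
overall_min = a.min()                      # axis=None: minimum over all elements

print(col_max, row_mean.shape, overall_min)
```

Keeping the reduced axis with keepdims=True lets the result broadcast against the original array, for example in a - row_mean.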

nbytes

Total bytes consumed by the elements of the array.

Notes

Does not include memory consumed by non-element attributes of the array object.

Examples

>>> x = np.zeros((3,5,2), dtype=np.complex128)
>>> x.nbytes
480
>>> np.prod(x.shape) * x.itemsize
480
ndim

Number of array dimensions.

Examples

>>> x = np.array([1, 2, 3])
>>> x.ndim
1
>>> y = np.zeros((2, 3, 4))
>>> y.ndim
3
newbyteorder(new_order='S', /)

Return the array with the same data viewed with a different byte order.

Equivalent to:

arr.view(arr.dtype.newbyteorder(new_order))

Changes are also made in all fields and sub-arrays of the array data type.

Parameters
new_order : string, optional

Byte order to force; a value from the byte order specifications below. new_order codes can be any of:

  • ‘S’ - swap dtype from current to opposite endian

  • {‘<’, ‘little’} - little endian

  • {‘>’, ‘big’} - big endian

  • ‘=’ - native order, equivalent to sys.byteorder

  • {‘|’, ‘I’} - ignore (no change to byte order)

The default value (‘S’) results in swapping the current byte order.

Returns
new_arr : array

New array object with the dtype reflecting given change to the byte order.

nonzero()

Return the indices of the elements that are non-zero.

Refer to numpy.nonzero for full documentation.

See also

numpy.nonzero

equivalent function

partition(kth, axis=-1, kind='introselect', order=None)

Rearranges the elements in the array in such a way that the value of the element in kth position is in the position it would be in a sorted array. All elements smaller than the kth element are moved before this element and all equal or greater are moved behind it. The ordering of the elements in the two partitions is undefined.

New in version 1.8.0.

Parameters
kth : int or sequence of ints

Element index to partition by. The kth element value will be in its final sorted position and all smaller elements will be moved before it and all equal or greater elements behind it. The order of all elements in the partitions is undefined. If provided with a sequence of kth it will partition all elements indexed by kth of them into their sorted position at once.

axis : int, optional

Axis along which to sort. Default is -1, which means sort along the last axis.

kind : {‘introselect’}, optional

Selection algorithm. Default is ‘introselect’.

order : str or list of str, optional

When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need to be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

See also

numpy.partition

Return a partitioned copy of an array.

argpartition

Indirect partition.

sort

Full sort.

Notes

See np.partition for notes on the different algorithms.

Examples

>>> a = np.array([3, 4, 2, 1])
>>> a.partition(3)
>>> a
array([2, 1, 3, 4])
>>> a.partition((1, 3))
>>> a
array([1, 2, 3, 4])
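One common use of partitioning is picking the k smallest values without a full sort. The sketch below uses the free function numpy.partition, which returns a copy, so that the input is left untouched; the values are arbitrary.

```python
import numpy as np

a = np.array([7, 2, 9, 4, 1, 8])
k = 3

part = np.partition(a, k)     # copy; a.partition(k) would reorder a in place
smallest = np.sort(part[:k])  # the k smallest values, sorted for display

print(part[k], smallest)
```

Only the element at index k is guaranteed to be in its sorted position; the k values before it are the smallest ones, in undefined order.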
prod(axis=None, dtype=None, out=None, keepdims=False, initial=1, where=True)

Return the product of the array elements over the given axis.

Refer to numpy.prod for full documentation.

See also

numpy.prod

equivalent function

ptp(axis=None, out=None, keepdims=False)

Peak to peak (maximum - minimum) value along a given axis.

Refer to numpy.ptp for full documentation.

See also

numpy.ptp

equivalent function

put(indices, values, mode='raise')

Set a.flat[n] = values[n] for all n in indices.

Refer to numpy.put for full documentation.

See also

numpy.put

equivalent function

ravel([order])

Return a flattened array.

Refer to numpy.ravel for full documentation.

See also

numpy.ravel

equivalent function

ndarray.flat

a flat iterator on the array.

real

The real part of the array.

See also

numpy.real

equivalent function

Examples

>>> x = np.sqrt([1+0j, 0+1j])
>>> x.real
array([ 1.        ,  0.70710678])
>>> x.real.dtype
dtype('float64')
repeat(repeats, axis=None)

Repeat elements of an array.

Refer to numpy.repeat for full documentation.

See also

numpy.repeat

equivalent function

reshape(shape, order='C')

Returns an array containing the same data with a new shape.

Refer to numpy.reshape for full documentation.

See also

numpy.reshape

equivalent function

Notes

Unlike the free function numpy.reshape, this method on ndarray allows the elements of the shape parameter to be passed in as separate arguments. For example, a.reshape(10, 11) is equivalent to a.reshape((10, 11)).
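The equivalence between the two calling styles can be checked directly. The use of -1 to let NumPy infer one dimension is standard numpy.reshape behaviour and is not specific to this method.

```python
import numpy as np

a = np.arange(6)

as_args = a.reshape(2, 3)          # shape passed as separate arguments
as_tuple = a.reshape((2, 3))       # shape passed as a tuple
as_function = np.reshape(a, (2, 3))  # the free function needs a tuple here
inferred = a.reshape(-1, 2)        # -1 lets NumPy infer that dimension

print(as_args.shape, inferred.shape)
```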

resize(new_shape, refcheck=True)

Change shape and size of array in-place.

Parameters
new_shape : tuple of ints, or n ints

Shape of resized array.

refcheck : bool, optional

If False, reference count will not be checked. Default is True.

Returns
None
Raises
ValueError

If a does not own its own data or references or views to it exist, and the data memory must be changed. PyPy only: will always raise if the data memory must be changed, since there is no reliable way to determine if references or views to it exist.

SystemError

If the order keyword argument is specified. This behaviour is a bug in NumPy.

See also

resize

Return a new array with the specified shape.

Notes

This reallocates space for the data area if necessary.

Only contiguous arrays (data elements consecutive in memory) can be resized.

The purpose of the reference count check is to make sure you do not use this array as a buffer for another Python object and then reallocate the memory. However, reference counts can increase in other ways so if you are sure that you have not shared the memory for this array with another Python object, then you may safely set refcheck to False.

Examples

Shrinking an array: array is flattened (in the order that the data are stored in memory), resized, and reshaped:

>>> a = np.array([[0, 1], [2, 3]], order='C')
>>> a.resize((2, 1))
>>> a
array([[0],
       [1]])
>>> a = np.array([[0, 1], [2, 3]], order='F')
>>> a.resize((2, 1))
>>> a
array([[0],
       [2]])

Enlarging an array: as above, but missing entries are filled with zeros:

>>> b = np.array([[0, 1], [2, 3]])
>>> b.resize(2, 3) # new_shape parameter doesn't have to be a tuple
>>> b
array([[0, 1, 2],
       [3, 0, 0]])

Referencing an array prevents resizing…

>>> c = a
>>> a.resize((1, 1))
Traceback (most recent call last):
...
ValueError: cannot resize an array that references or is referenced ...

Unless refcheck is False:

>>> a.resize((1, 1), refcheck=False)
>>> a
array([[0]])
>>> c
array([[0]])
round(decimals=0, out=None)

Return a with each element rounded to the given number of decimals.

Refer to numpy.around for full documentation.

See also

numpy.around

equivalent function
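A short illustrative sketch of round follows; the input values are assumptions chosen for the example. It also shows that values lying exactly halfway between two candidates are rounded to the nearest even value.

```python
import numpy as np

a = np.array([1.234, 5.678])
rounded = a.round(1)  # keep one decimal place

# Exact halves are rounded to the nearest even value.
halves = np.array([0.5, 1.5, 2.5]).round()

print(rounded.tolist())  # [1.2, 5.7]
print(halves.tolist())   # [0.0, 2.0, 2.0]
```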

searchsorted(v, side='left', sorter=None)

Find indices where elements of v should be inserted in a to maintain order.

For full documentation, see numpy.searchsorted

See also

numpy.searchsorted

equivalent function
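The effect of the side parameter can be sketched as follows, using an already sorted array a invented for illustration:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])  # must already be sorted

left = a.searchsorted(3)                 # insert before the existing 3
right = a.searchsorted(3, side='right')  # insert after the existing 3
many = a.searchsorted([-10, 10, 2, 3])   # several values at once

print(left, right)    # 2 3
print(many.tolist())  # [0, 5, 1, 2]
```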

setfield(val, dtype, offset=0)

Put a value into a specified place in a field defined by a data-type.

Place val into a’s field defined by dtype and beginning offset bytes into the field.

Parameters
valobject

Value to be placed in field.

dtypedtype object

Data-type of the field in which to place val.

offsetint, optional

The number of bytes into the field at which to place val.

Returns
None

See also

getfield

Examples

>>> x = np.eye(3)
>>> x.getfield(np.float64)
array([[1.,  0.,  0.],
       [0.,  1.,  0.],
       [0.,  0.,  1.]])
>>> x.setfield(3, np.int32)
>>> x.getfield(np.int32)
array([[3, 3, 3],
       [3, 3, 3],
       [3, 3, 3]], dtype=int32)
>>> x
array([[1.0e+000, 1.5e-323, 1.5e-323],
       [1.5e-323, 1.0e+000, 1.5e-323],
       [1.5e-323, 1.5e-323, 1.0e+000]])
>>> x.setfield(np.eye(3), np.int32)
>>> x
array([[1.,  0.,  0.],
       [0.,  1.,  0.],
       [0.,  0.,  1.]])
setflags(write=None, align=None, uic=None)

Set array flags WRITEABLE, ALIGNED, (WRITEBACKIFCOPY and UPDATEIFCOPY), respectively.

These Boolean-valued flags affect how numpy interprets the memory area used by a (see Notes below). The ALIGNED flag can only be set to True if the data is actually aligned according to the type. The WRITEBACKIFCOPY and (deprecated) UPDATEIFCOPY flags can never be set to True. The flag WRITEABLE can only be set to True if the array owns its own memory, or the ultimate owner of the memory exposes a writeable buffer interface, or is a string. (The exception for string is made so that unpickling can be done without copying memory.)

Parameters
writebool, optional

Describes whether or not a can be written to.

alignbool, optional

Describes whether or not a is aligned properly for its type.

uicbool, optional

Describes whether or not a is a copy of another “base” array.

Notes

Array flags provide information about how the memory area used for the array is to be interpreted. There are 7 Boolean flags in use, only four of which can be changed by the user: WRITEBACKIFCOPY, UPDATEIFCOPY, WRITEABLE, and ALIGNED.

WRITEABLE (W) the data area can be written to;

ALIGNED (A) the data and strides are aligned appropriately for the hardware (as determined by the compiler);

UPDATEIFCOPY (U) (deprecated), replaced by WRITEBACKIFCOPY;

WRITEBACKIFCOPY (X) this array is a copy of some other array (referenced by .base). When the C-API function PyArray_ResolveWritebackIfCopy is called, the base array will be updated with the contents of this array.

All flags can be accessed using the single (upper case) letter as well as the full name.

Examples

>>> y = np.array([[3, 1, 7],
...               [2, 0, 0],
...               [8, 5, 9]])
>>> y
array([[3, 1, 7],
       [2, 0, 0],
       [8, 5, 9]])
>>> y.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False
>>> y.setflags(write=0, align=0)
>>> y.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : False
  ALIGNED : False
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False
>>> y.setflags(uic=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot set WRITEBACKIFCOPY flag to True
shape

Tuple of array dimensions.

The shape property is usually used to get the current shape of an array, but may also be used to reshape the array in-place by assigning a tuple of array dimensions to it. As with numpy.reshape, one of the new shape dimensions can be -1, in which case its value is inferred from the size of the array and the remaining dimensions. Reshaping an array in-place will fail if a copy is required.

See also

numpy.reshape

similar function

ndarray.reshape

similar method

Examples

>>> x = np.array([1, 2, 3, 4])
>>> x.shape
(4,)
>>> y = np.zeros((2, 3, 4))
>>> y.shape
(2, 3, 4)
>>> y.shape = (3, 8)
>>> y
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])
>>> y.shape = (3, 6)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: total size of new array must be unchanged
>>> np.zeros((4,2))[::2].shape = (-1,)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: Incompatible shape for in-place modification. Use
`.reshape()` to make a copy with the desired shape.
size

Number of elements in the array.

Equal to np.prod(a.shape), i.e., the product of the array’s dimensions.

Notes

a.size returns a standard arbitrary precision Python integer. This may not be the case with other methods of obtaining the same value (like the suggested np.prod(a.shape), which returns an instance of np.int_), and may be relevant if the value is used further in calculations that may overflow a fixed size integer type.

Examples

>>> x = np.zeros((3, 5, 2), dtype=np.complex128)
>>> x.size
30
>>> np.prod(x.shape)
30
sort(axis=-1, kind=None, order=None)

Sort an array in-place. Refer to numpy.sort for full documentation.

Parameters
axisint, optional

Axis along which to sort. Default is -1, which means sort along the last axis.

kind{‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, optional

Sorting algorithm. The default is ‘quicksort’. Note that both ‘stable’ and ‘mergesort’ use timsort under the covers and, in general, the actual implementation will vary with datatype. The ‘mergesort’ option is retained for backwards compatibility.

Changed in version 1.15.0: The ‘stable’ option was added.

orderstr or list of str, optional

When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

See also

numpy.sort

Return a sorted copy of an array.

numpy.argsort

Indirect sort.

numpy.lexsort

Indirect stable sort on multiple keys.

numpy.searchsorted

Find elements in sorted array.

numpy.partition

Partial sort.

Notes

See numpy.sort for notes on the different sorting algorithms.

Examples

>>> a = np.array([[1,4], [3,1]])
>>> a.sort(axis=1)
>>> a
array([[1, 4],
       [1, 3]])
>>> a.sort(axis=0)
>>> a
array([[1, 3],
       [1, 4]])

Use the order keyword to specify a field to use when sorting a structured array:

>>> a = np.array([('a', 2), ('c', 1)], dtype=[('x', 'S1'), ('y', int)])
>>> a.sort(order='y')
>>> a
array([(b'c', 1), (b'a', 2)],
      dtype=[('x', 'S1'), ('y', '<i8')])
squeeze(axis=None)

Remove axes of length one from a.

Refer to numpy.squeeze for full documentation.

See also

numpy.squeeze

equivalent function
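For illustration only (the array shape is an arbitrary assumption), squeeze with and without the axis argument behaves like this:

```python
import numpy as np

x = np.zeros((1, 3, 1))

all_removed = x.squeeze()        # drops every length-one axis
first_only = x.squeeze(axis=0)   # drops only the selected axis

print(all_removed.shape)  # (3,)
print(first_only.shape)   # (3, 1)
```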

std(axis=None, dtype=None, out=None, ddof=0, keepdims=False, *, where=True)

Returns the standard deviation of the array elements along given axis.

Refer to numpy.std for full documentation.

See also

numpy.std

equivalent function
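The ddof parameter changes the divisor from N to N - ddof. A minimal sketch, with sample values chosen for illustration:

```python
import math
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])

population_std = a.std()      # divisor N
sample_std = a.std(ddof=1)    # divisor N - 1

# The squared deviations from the mean (2.5) sum to 5.0.
print(math.isclose(population_std, math.sqrt(5.0 / 4)))  # True
print(math.isclose(sample_std, math.sqrt(5.0 / 3)))      # True
```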

strides

Tuple of bytes to step in each dimension when traversing an array.

The byte offset of element (i[0], i[1], ..., i[n]) in an array a is:

offset = sum(np.array(i) * a.strides)

A more detailed explanation of strides can be found in the “ndarray.rst” file in the NumPy reference guide.

See also

numpy.lib.stride_tricks.as_strided

Notes

Imagine an array of 32-bit integers (each 4 bytes):

x = np.array([[0, 1, 2, 3, 4],
              [5, 6, 7, 8, 9]], dtype=np.int32)

This array is stored in memory as 40 bytes, one after the other (known as a contiguous block of memory). The strides of an array tell us how many bytes we have to skip in memory to move to the next position along a certain axis. For example, we have to skip 4 bytes (1 value) to move to the next column, but 20 bytes (5 values) to get to the same position in the next row. As such, the strides for the array x will be (20, 4).

Examples

>>> y = np.reshape(np.arange(2*3*4), (2,3,4))
>>> y
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],
       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
>>> y.strides
(48, 16, 4)
>>> y[1,1,1]
17
>>> offset=sum(y.strides * np.array((1,1,1)))
>>> offset // y.itemsize
17
>>> x = np.reshape(np.arange(5*6*7*8), (5,6,7,8)).transpose(2,3,1,0)
>>> x.strides
(32, 4, 224, 1344)
>>> i = np.array([3,5,2,2])
>>> offset = sum(i * x.strides)
>>> x[3,5,2,2]
813
>>> offset // x.itemsize
813
sum(axis=None, dtype=None, out=None, keepdims=False, initial=0, where=True)

Return the sum of the array elements over the given axis.

Refer to numpy.sum for full documentation.

See also

numpy.sum

equivalent function
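A brief sketch of the axis and keepdims parameters; the 2 x 2 array is an assumption made for the example:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

total = a.sum()                       # all elements
per_column = a.sum(axis=0)            # collapse the rows
per_row = a.sum(axis=1, keepdims=True)  # keep the reduced axis with length 1

print(int(total))           # 10
print(per_column.tolist())  # [4, 6]
print(per_row.tolist())     # [[3], [7]]
```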

swapaxes(axis1, axis2)

Return a view of the array with axis1 and axis2 interchanged.

Refer to numpy.swapaxes for full documentation.

See also

numpy.swapaxes

equivalent function
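Because swapaxes returns a view, writing through the result also changes the original array. A small sketch under that assumption, with an arbitrary shape:

```python
import numpy as np

x = np.zeros((2, 3, 4))
y = x.swapaxes(0, 2)  # first and last axes interchanged

print(y.shape)  # (4, 3, 2)

# y is a view: assigning to y[3, 0, 1] writes to x[1, 0, 3].
y[3, 0, 1] = 7.0
print(x[1, 0, 3])  # 7.0
```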

take(indices, axis=None, out=None, mode='raise')

Return an array formed from the elements of a at the given indices.

Refer to numpy.take for full documentation.

See also

numpy.take

equivalent function
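The following sketch illustrates take on a flat array, along an axis, and with mode='clip'; all values are invented for the example:

```python
import numpy as np

a = np.array([4, 3, 5, 7, 6, 8])
picked = a.take([0, 1, 4])            # elements at positions 0, 1 and 4

m = np.array([[1, 2], [3, 4]])
second_col = m.take([1], axis=1)      # column 1, kept as a 2-D result

# mode='clip' maps an out-of-range index to the last valid one.
clipped = a.take([10], mode='clip')

print(picked.tolist())      # [4, 3, 6]
print(second_col.tolist())  # [[2], [4]]
print(clipped.tolist())     # [8]
```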

tobytes(order='C')

Construct Python bytes containing the raw data bytes in the array.

Constructs Python bytes showing a copy of the raw contents of data memory. The bytes object is produced in C-order by default. This behavior is controlled by the order parameter.

New in version 1.9.0.

Parameters
order{‘C’, ‘F’, ‘A’}, optional

Controls the memory layout of the bytes object. ‘C’ means C-order, ‘F’ means F-order, ‘A’ (short for Any) means ‘F’ if a is Fortran contiguous, ‘C’ otherwise. Default is ‘C’.

Returns
sbytes

Python bytes exhibiting a copy of a’s raw data.

Examples

>>> x = np.array([[0, 1], [2, 3]], dtype='<u2')
>>> x.tobytes()
b'\x00\x00\x01\x00\x02\x00\x03\x00'
>>> x.tobytes('C') == x.tobytes()
True
>>> x.tobytes('F')
b'\x00\x00\x02\x00\x01\x00\x03\x00'
tofile(fid, sep='', format='%s')

Write array to a file as text or binary (default).

Data is always written in ‘C’ order, independent of the order of a. The data produced by this method can be recovered using the function fromfile().

Parameters
fidfile or str or Path

An open file object, or a string containing a filename.

Changed in version 1.17.0: pathlib.Path objects are now accepted.

sepstr

Separator between array items for text output. If “” (empty), a binary file is written, equivalent to file.write(a.tobytes()).

formatstr

Format string for text file output. Each entry in the array is formatted to text by first converting it to the closest Python type, and then using “format” % item.

Notes

This is a convenience function for quick storage of array data. Information on endianness and precision is lost, so this method is not a good choice for files intended to archive data or transport data between machines with different endianness. Some of these problems can be overcome by outputting the data as text files, at the expense of speed and file size.

When fid is a file object, array contents are directly written to the file, bypassing the file object’s write method. As a result, tofile cannot be used with file objects supporting compression (e.g., GzipFile) or file-like objects that do not support fileno() (e.g., BytesIO).
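The binary round trip described above can be sketched as follows. The file name and the temporary directory are arbitrary choices for illustration. The dtype and the shape are supplied again when reading, because the file holds only the raw values.

```python
import os
import tempfile

import numpy as np

a = np.arange(6, dtype=np.float32).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.bin")  # hypothetical file name
    a.tofile(path)                     # sep='' writes raw binary
    flat = np.fromfile(path, dtype=np.float32)  # read back as 1-D

restored = flat.reshape(a.shape)       # the shape is not stored in the file
print(np.array_equal(a, restored))     # True
```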

tolist()

Return the array as an a.ndim-levels deep nested list of Python scalars.

Return a copy of the array data as a (nested) Python list. Data items are converted to the nearest compatible builtin Python type, via the ~numpy.ndarray.item function.

If a.ndim is 0, then since the depth of the nested list is 0, it will not be a list at all, but a simple Python scalar.

Parameters
none
Returns
yobject, or list of object, or list of list of object, or …

The possibly nested list of array elements.

Notes

The array may be recreated via a = np.array(a.tolist()), although this may sometimes lose precision.

Examples

For a 1D array, a.tolist() is almost the same as list(a), except that tolist changes numpy scalars to Python scalars:

>>> a = np.uint32([1, 2])
>>> a_list = list(a)
>>> a_list
[1, 2]
>>> type(a_list[0])
<class 'numpy.uint32'>
>>> a_tolist = a.tolist()
>>> a_tolist
[1, 2]
>>> type(a_tolist[0])
<class 'int'>

Additionally, for a 2D array, tolist applies recursively:

>>> a = np.array([[1, 2], [3, 4]])
>>> list(a)
[array([1, 2]), array([3, 4])]
>>> a.tolist()
[[1, 2], [3, 4]]

The base case for this recursion is a 0D array:

>>> a = np.array(1)
>>> list(a)
Traceback (most recent call last):
  ...
TypeError: iteration over a 0-d array
>>> a.tolist()
1
tostring(order='C')

A compatibility alias for tobytes, with exactly the same behavior.

Despite its name, it returns bytes not strs.

Deprecated since version 1.19.0.

trace(offset=0, axis1=0, axis2=1, dtype=None, out=None)

Return the sum along diagonals of the array.

Refer to numpy.trace for full documentation.

See also

numpy.trace

equivalent function
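A minimal sketch of the offset parameter, using a 3 x 3 array chosen for illustration:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

main = a.trace()            # 0 + 4 + 8
upper = a.trace(offset=1)   # 1 + 5
lower = a.trace(offset=-1)  # 3 + 7

print(int(main), int(upper), int(lower))  # 12 6 10
```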

transpose(*axes)

Returns a view of the array with axes transposed.

For a 1-D array this has no effect, as a transposed vector is simply the same vector. To convert a 1-D array into a 2-D column vector, an additional dimension must be added. np.atleast_2d(a).T achieves this, as does a[:, np.newaxis]. For a 2-D array, this is a standard matrix transpose. For an n-D array, if axes are given, their order indicates how the axes are permuted (see Examples). If axes are not provided and a.shape = (i[0], i[1], ... i[n-2], i[n-1]), then a.transpose().shape = (i[n-1], i[n-2], ... i[1], i[0]).

Parameters
axesNone, tuple of ints, or n ints
  • None or no argument: reverses the order of the axes.

  • tuple of ints: i in the j-th place in the tuple means a’s i-th axis becomes a.transpose()’s j-th axis.

  • n ints: same as an n-tuple of the same ints (this form is intended simply as a “convenience” alternative to the tuple form)

Returns
outndarray

View of a, with axes suitably permuted.

See also

transpose

Equivalent function

ndarray.T

Array property returning the array transposed.

ndarray.reshape

Give a new shape to an array without changing its data.

Examples

>>> a = np.array([[1, 2], [3, 4]])
>>> a
array([[1, 2],
       [3, 4]])
>>> a.transpose()
array([[1, 3],
       [2, 4]])
>>> a.transpose((1, 0))
array([[1, 3],
       [2, 4]])
>>> a.transpose(1, 0)
array([[1, 3],
       [2, 4]])
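The 1-D and n-D cases mentioned in the description can be sketched as follows; the array names and shapes are assumptions made for the example:

```python
import numpy as np

v = np.array([1, 2, 3])

unchanged = v.transpose()        # a 1-D array keeps its shape
col_a = v[:, np.newaxis]         # explicit column vector
col_b = np.atleast_2d(v).T       # same result via a 2-D row

# axes=(1, 2, 0): the result's axis j is the original axis axes[j].
t = np.zeros((2, 3, 4)).transpose(1, 2, 0)

print(unchanged.shape, col_a.shape, col_b.shape)  # (3,) (3, 1) (3, 1)
print(t.shape)                                    # (3, 4, 2)
```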
var(axis=None, dtype=None, out=None, ddof=0, keepdims=False, *, where=True)

Returns the variance of the array elements, along given axis.

Refer to numpy.var for full documentation.

See also

numpy.var

equivalent function
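An illustrative sketch of var over the whole array and along an axis; the values are chosen for the example:

```python
import math
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])

overall = a.var()                      # variance of all four values
per_column = a.var(axis=0)             # one value per column
per_row = a.var(axis=1, keepdims=True)  # reduced axis kept with length 1

print(math.isclose(overall, 1.25))  # True
print(per_column.tolist())          # [1.0, 1.0]
print(per_row.tolist())             # [[0.25], [0.25]]
```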

view([dtype][, type])

New view of array with the same data.

Note

Passing None for dtype is different from omitting the parameter, since the former invokes dtype(None) which is an alias for dtype('float_').

Parameters
dtypedata-type or ndarray sub-class, optional

Data-type descriptor of the returned view, e.g., float32 or int16. Omitting it results in the view having the same data-type as a. This argument can also be specified as an ndarray sub-class, which then specifies the type of the returned object (this is equivalent to setting the type parameter).

typePython type, optional

Type of the returned view, e.g., ndarray or matrix. Again, omission of the parameter results in type preservation.

Notes

a.view() is used in two different ways:

a.view(some_dtype) or a.view(dtype=some_dtype) constructs a view of the array’s memory with a different data-type. This can cause a reinterpretation of the bytes of memory.

a.view(ndarray_subclass) or a.view(type=ndarray_subclass) just returns an instance of ndarray_subclass that looks at the same array (same shape, dtype, etc.) This does not cause a reinterpretation of the memory.

For a.view(some_dtype), if some_dtype has a different number of bytes per entry than the previous dtype (for example, converting a regular array to a structured array), then the behavior of the view cannot be predicted just from the superficial appearance of a (shown by print(a)). It also depends on exactly how a is stored in memory. Therefore if a is C-ordered versus fortran-ordered, versus defined as a slice or transpose, etc., the view may give different results.

Examples

>>> x = np.array([(1, 2)], dtype=[('a', np.int8), ('b', np.int8)])

Viewing array data using a different type and dtype:

>>> y = x.view(dtype=np.int16, type=np.matrix)
>>> y
matrix([[513]], dtype=int16)
>>> print(type(y))
<class 'numpy.matrix'>

Creating a view on a structured array so it can be used in calculations

>>> x = np.array([(1, 2),(3,4)], dtype=[('a', np.int8), ('b', np.int8)])
>>> xv = x.view(dtype=np.int8).reshape(-1,2)
>>> xv
array([[1, 2],
       [3, 4]], dtype=int8)
>>> xv.mean(0)
array([2.,  3.])

Making changes to the view changes the underlying array

>>> xv[0,1] = 20
>>> x
array([(1, 20), (3,  4)], dtype=[('a', 'i1'), ('b', 'i1')])

Using a view to convert an array to a recarray:

>>> z = x.view(np.recarray)
>>> z.a
array([1, 3], dtype=int8)

Views share data:

>>> x[0] = (9, 10)
>>> z[0]
(9, 10)

Views that change the dtype size (bytes per entry) should normally be avoided on arrays defined by slices, transposes, fortran-ordering, etc.:

>>> x = np.array([[1,2,3],[4,5,6]], dtype=np.int16)
>>> y = x[:, 0:2]
>>> y
array([[1, 2],
       [4, 5]], dtype=int16)
>>> y.view(dtype=[('width', np.int16), ('length', np.int16)])
Traceback (most recent call last):
    ...
ValueError: To change to a dtype of a different size, the array must be C-contiguous
>>> z = y.copy()
>>> z.view(dtype=[('width', np.int16), ('length', np.int16)])
array([[(1, 2)],
       [(4, 5)]], dtype=[('width', '<i2'), ('length', '<i2')])
class multimodal.datasets.data_sample.MultiModalSparseArray(*arg, **kwargs)

MultiModalSparseArray inherits from a scipy sparse matrix

Parameters
datacan be
  • a dictionary of per-view arrays, each with shape = (n_samples, n_features):

    {0: array([[]],

    1: array([[]], …}

  • a numpy array-like containing one array with shape = (n_samples, n_features) per view:

    [[[…]],

    [[…]], …]

  • an {array like} with shape (n_samples, n_views * n_features), with ‘views_ind’ different from ‘None’, for multi-view input samples.

views_indarray-like (default=None)

Parameter specifying how to extract the data views from X. If None, [0, n_features//2, n_features] is constructed (2 views).

  • views_ind is a 1-D array of sorted integers; the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]].
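The slicing rule above can be sketched with plain NumPy, independently of the multimodal classes. The array X and the helper split_views are hypothetical names introduced only for illustration; this is not the library's own implementation.

```python
import numpy as np


def split_views(X, views_ind=None):
    """Hypothetical helper: cut X into views at the limits in views_ind."""
    n_features = X.shape[1]
    if views_ind is None:
        # Default described above: two views split at the midpoint.
        views_ind = [0, n_features // 2, n_features]
    return [X[:, views_ind[n]:views_ind[n + 1]]
            for n in range(len(views_ind) - 1)]


X = np.arange(12).reshape(2, 6)
two_views = split_views(X)                   # limits [0, 3, 6]
three_views = split_views(X, [0, 1, 4, 6])   # widths 1, 3 and 2

print([v.shape for v in two_views])    # [(2, 3), (2, 3)]
print([v.shape for v in three_views])  # [(2, 1), (2, 3), (2, 2)]
```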

Attributes
views_ind : list of the views’ indices (may be None)

n_views : int, number of views

shapes_int : list of int, number of features for each view

keys : names of the keys, when the data come from a dictionary

Example

>>> from multimodal.datasets.base import load_dict
>>> from multimodal.tests.datasets.get_dataset_path import get_dataset_path
>>> from multimodal.datasets.data_sample import DataSample
>>> file = 'input_x_dic.pkl'
>>> data = load_dict(get_dataset_path(file))
arcsin()

Element-wise arcsin.

See numpy.arcsin for more information.

arcsinh()

Element-wise arcsinh.

See numpy.arcsinh for more information.

arctan()

Element-wise arctan.

See numpy.arctan for more information.

arctanh()

Element-wise arctanh.

See numpy.arctanh for more information.

argmax(axis=None, out=None)

Return indices of maximum elements along an axis.

Implicit zero elements are also taken into account. If there are several maximum values, the index of the first occurrence is returned.

Parameters
axis{-2, -1, 0, 1, None}, optional

Axis along which the argmax is computed. If None (default), the index of the maximum element in the flattened data is returned.

outNone, optional

This argument is in the signature solely for NumPy compatibility reasons. Do not pass in anything except for the default value, as this argument is not used.

Returns
indnumpy.matrix or int

Indices of maximum elements. If matrix, its size along axis is 1.

argmin(axis=None, out=None)

Return indices of minimum elements along an axis.

Implicit zero elements are also taken into account. If there are several minimum values, the index of the first occurrence is returned.

Parameters
axis{-2, -1, 0, 1, None}, optional

Axis along which the argmin is computed. If None (default), the index of the minimum element in the flattened data is returned.

outNone, optional

This argument is in the signature solely for NumPy compatibility reasons. Do not pass in anything except for the default value, as this argument is not used.

Returns
indnumpy.matrix or int

Indices of minimum elements. If matrix, its size along axis is 1.

asformat(format, copy=False)

Return this matrix in the passed format.

Parameters
format{str, None}

The desired matrix format (“csr”, “csc”, “lil”, “dok”, “array”, …) or None for no conversion.

copybool, optional

If True, the result is guaranteed to not share data with self.

Returns
AThis matrix in the passed format.
asfptype()

Upcast matrix to a floating point format (if necessary)

astype(dtype, casting='unsafe', copy=True)

Cast the matrix elements to a specified type.

Parameters
dtypestring or numpy dtype

Typecode or data-type to which to cast the data.

casting{‘no’, ‘equiv’, ‘safe’, ‘same_kind’, ‘unsafe’}, optional

Controls what kind of data casting may occur. Defaults to ‘unsafe’ for backwards compatibility. ‘no’ means the data types should not be cast at all. ‘equiv’ means only byte-order changes are allowed. ‘safe’ means only casts which can preserve values are allowed. ‘same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed. ‘unsafe’ means any data conversions may be done.

copybool, optional

If copy is False, the result might share some memory with this matrix. If copy is True, it is guaranteed that the result and this matrix do not share any memory.

ceil()

Element-wise ceil.

See numpy.ceil for more information.

check_format(full_check=True)

Check whether the matrix format is valid.

Parameters
full_checkbool, optional

If True, rigorous check, O(N) operations. Otherwise basic check, O(1) operations (default True).

conj(copy=True)

Element-wise complex conjugation.

If the matrix is of non-complex data type and copy is False, this method does nothing and the data is not copied.

Parameters
copybool, optional

If True, the result is guaranteed to not share data with self.

Returns
AThe element-wise complex conjugate.
conjugate(copy=True)

Element-wise complex conjugation.

If the matrix is of non-complex data type and copy is False, this method does nothing and the data is not copied.

Parameters
copybool, optional

If True, the result is guaranteed to not share data with self.

Returns
AThe element-wise complex conjugate.
copy()

Returns a copy of this matrix.

No data/indices will be shared between the returned value and current matrix.

count_nonzero()

Number of non-zero entries, equivalent to

np.count_nonzero(a.toarray())

Unlike getnnz() and the nnz property, which return the number of stored entries (the length of the data attribute), this method counts the actual number of non-zero entries in data.
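The difference between stored entries and true non-zeros can be sketched as follows. The example uses a plain scipy.sparse.csr_matrix rather than MultiModalSparseArray, on the assumption that the subclass inherits this behavior unchanged; the data values are invented for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# CSR arrays given directly so that an explicit zero is stored at (0, 1).
data = np.array([1, 0, 2])
indices = np.array([0, 1, 1])
indptr = np.array([0, 2, 3])
A = csr_matrix((data, indices, indptr), shape=(2, 2))

stored = A.nnz               # counts the explicit zero as well
nonzero = A.count_nonzero()  # counts only entries that are not zero

A.eliminate_zeros()          # in-place removal of stored zeros
stored_after = A.nnz

print(stored, nonzero, stored_after)  # 3 2 2
```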

deg2rad()

Element-wise deg2rad.

See numpy.deg2rad for more information.

diagonal(k=0)

Returns the kth diagonal of the matrix.

Parameters
kint, optional

Which diagonal to get, corresponding to elements a[i, i+k]. Default: 0 (the main diagonal).

New in version 1.0.

See also

numpy.diagonal

Equivalent numpy function.

Examples

>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
>>> A.diagonal()
array([1, 0, 5])
>>> A.diagonal(k=1)
array([2, 3])
dot(other)

Ordinary dot product

Examples

>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
>>> v = np.array([1, 0, -1])
>>> A.dot(v)
array([ 1, -3, -1], dtype=int64)
eliminate_zeros()

Remove zero entries from the matrix

This is an in place operation.

expm1()

Element-wise expm1.

See numpy.expm1 for more information.

floor()

Element-wise floor.

See numpy.floor for more information.

getH()

Return the Hermitian transpose of this matrix.

See also

numpy.matrix.getH

NumPy’s implementation of getH for matrices

get_shape()

Get shape of a matrix.

getcol(i)

Returns a copy of column i of the matrix, as a (m x 1) CSR matrix (column vector).

getformat()

Format of a matrix representation as a string.

getmaxprint()

Maximum number of elements to display when printed.

getnnz(axis=None)

Number of stored values, including explicit zeros.

Parameters
axisNone, 0, or 1

Select between the number of values across the whole matrix, in each column, or in each row.

See also

count_nonzero

Number of non-zero entries

getrow(i)

Returns a copy of row i of the matrix, as a (1 x n) CSR matrix (row vector).

property has_canonical_format

Determine whether the matrix has sorted indices and no duplicates

Returns
  • True: if the above applies

  • False: otherwise

has_canonical_format implies has_sorted_indices, so if the latter flag is False, so will the former be; if the former is found True, the latter flag is also set.

property has_sorted_indices

Determine whether the matrix has sorted indices

Returns
  • True: if the indices of the matrix are in sorted order

  • False: otherwise

log1p()

Element-wise log1p.

See numpy.log1p for more information.

max(axis=None, out=None)

Return the maximum of the matrix or maximum along an axis. This takes all elements into account, not just the non-zero ones.

Parameters
axis{-2, -1, 0, 1, None} optional

Axis along which the maximum is computed. The default is to compute the maximum over all the matrix elements, returning a scalar (i.e., axis = None).

outNone, optional

This argument is in the signature solely for NumPy compatibility reasons. Do not pass in anything except for the default value, as this argument is not used.

Returns
amaxcoo_matrix or scalar

Maximum of a. If axis is None, the result is a scalar value. If axis is given, the result is a sparse.coo_matrix of dimension a.ndim - 1.

See also

min

The minimum value of a sparse matrix along a given axis.

numpy.matrix.max

NumPy’s implementation of ‘max’ for matrices
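Since implicit zeros take part in the comparison, the maximum of a matrix whose stored values are all negative is 0 whenever at least one entry is not stored. A minimal sketch with a plain scipy.sparse.csr_matrix (values chosen for illustration, and assuming MultiModalSparseArray behaves the same way):

```python
from scipy.sparse import csr_matrix

# The entry at (0, 1) is an implicit zero; all stored values are negative.
A = csr_matrix([[-1, 0], [-3, -2]])

overall_max = A.max()               # the implicit zero wins
overall_min = A.min()
col_max = A.max(axis=0).toarray()   # per-column maxima as a dense 1 x 2 array

print(overall_max, overall_min)  # 0 -3
print(col_max.tolist())          # [[-1, 0]]
```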

maximum(other)

Element-wise maximum between this and another matrix.

mean(axis=None, dtype=None, out=None)

Compute the arithmetic mean along the specified axis.

Returns the average of the matrix elements. The average is taken over all elements in the matrix by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.

Parameters
axis{-2, -1, 0, 1, None} optional

Axis along which the mean is computed. The default is to compute the mean of all elements in the matrix (i.e., axis = None).

dtypedata-type, optional

Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.

New in version 0.18.0.

outnp.matrix, optional

Alternative output matrix in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.

New in version 0.18.0.

Returns
mnp.matrix

See also

numpy.matrix.mean

NumPy’s implementation of ‘mean’ for matrices

min(axis=None, out=None)

Return the minimum of the matrix or minimum along an axis. This takes all elements into account, not just the non-zero ones.

Parameters
axis{-2, -1, 0, 1, None} optional

Axis along which the minimum is computed. The default is to compute the minimum over all the matrix elements, returning a scalar (i.e., axis = None).

outNone, optional

This argument is in the signature solely for NumPy compatibility reasons. Do not pass in anything except for the default value, as this argument is not used.

Returns
amincoo_matrix or scalar

Minimum of a. If axis is None, the result is a scalar value. If axis is given, the result is a sparse.coo_matrix of dimension a.ndim - 1.

See also

max

The maximum value of a sparse matrix along a given axis.

numpy.matrix.min

NumPy’s implementation of ‘min’ for matrices

minimum(other)

Element-wise minimum between this and another matrix.

multiply(other)

Point-wise multiplication by another matrix, vector, or scalar.

property nnz

Number of stored values, including explicit zeros.

See also

count_nonzero

Number of non-zero entries

nonzero()

Nonzero indices.

Returns a tuple of arrays (row,col) containing the indices of the non-zero elements of the matrix.

Examples

>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1,2,0],[0,0,3],[4,0,5]])
>>> A.nonzero()
(array([0, 0, 1, 2, 2]), array([0, 1, 2, 0, 2]))
power(n, dtype=None)

This function performs element-wise power.

Parameters
nn is a scalar
dtypeIf dtype is not specified, the current dtype will be preserved.
prune()

Remove empty space after all non-zero elements.

rad2deg()

Element-wise rad2deg.

See numpy.rad2deg for more information.

reshape(self, shape, order='C', copy=False)

Gives a new shape to a sparse matrix without changing its data.

Parameters
shapelength-2 tuple of ints

The new shape should be compatible with the original shape.

order{‘C’, ‘F’}, optional

Read the elements using this index order. ‘C’ means to read and write the elements using C-like index order; e.g., read entire first row, then second row, etc. ‘F’ means to read and write the elements using Fortran-like index order; e.g., read entire first column, then second column, etc.

copybool, optional

Indicates whether or not attributes of self should be copied whenever possible. The degree to which attributes are copied varies depending on the type of sparse matrix being used.

Returns
reshaped_matrixsparse matrix

A sparse matrix with the given shape, not necessarily of the same format as the current object.

See also

numpy.matrix.reshape

NumPy’s implementation of ‘reshape’ for matrices

resize(*shape)

Resize the matrix in-place to dimensions given by shape

Any elements that lie within the new shape will remain at the same indices, while non-zero elements lying outside the new shape are removed.

Parameters
shape(int, int)

number of rows and columns in the new matrix

Notes

The semantics are not identical to numpy.ndarray.resize or numpy.resize. Here, the same data will be maintained at each index before and after the resize, if that index is within the new bounds. In numpy, resizing maintains contiguity of the array, moving elements around in the logical matrix but not within a flattened representation.

We give no guarantees about whether the underlying data attributes (arrays, etc.) will be modified in place or replaced with new objects.
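The contrast with numpy.ndarray.resize described in the notes can be sketched side by side. A plain scipy.sparse.csr_matrix stands in for MultiModalSparseArray here, which is an assumption, and refcheck=False is passed only so that the NumPy call does not depend on how the snippet is run.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Sparse resize: every element keeps its (row, column) index.
A = csr_matrix([[1, 2], [3, 4]])
A.resize(2, 3)
sparse_result = A.toarray()

# NumPy resize: the flattened data is kept, so elements change position.
b = np.array([[1, 2], [3, 4]])
b.resize((2, 3), refcheck=False)

print(sparse_result.tolist())  # [[1, 2, 0], [3, 4, 0]]
print(b.tolist())              # [[1, 2, 3], [4, 0, 0]]
```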

rint()

Element-wise rint.

See numpy.rint for more information.

set_shape(shape)

See reshape.

setdiag(values, k=0)

Set diagonal or off-diagonal elements of the array.

Parameters
valuesarray_like

New values of the diagonal elements.

Values may have any length. If the diagonal is longer than values, then the remaining diagonal entries will not be set. If values are longer than the diagonal, then the remaining values are ignored.

If a scalar value is given, all of the diagonal is set to it.

kint, optional

Which off-diagonal to set, corresponding to elements a[i,i+k]. Default: 0 (the main diagonal).

property shape

Get shape of a matrix.

sign()

Element-wise sign.

See numpy.sign for more information.

sin()

Element-wise sin.

See numpy.sin for more information.

sinh()

Element-wise sinh.

See numpy.sinh for more information.

sort_indices()

Sort the indices of this matrix in place

sorted_indices()

Return a copy of this matrix with sorted indices

sqrt()

Element-wise sqrt.

See numpy.sqrt for more information.

sum(axis=None, dtype=None, out=None)

Sum the matrix elements over a given axis.

Parameters
axis{-2, -1, 0, 1, None} optional

Axis along which the sum is computed. The default is to compute the sum of all the matrix elements, returning a scalar (i.e., axis = None).

dtypedtype, optional

The type of the returned matrix and of the accumulator in which the elements are summed. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used.

New in version 0.18.0.

outnp.matrix, optional

Alternative output matrix in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.

New in version 0.18.0.

Returns
sum_along_axisnp.matrix

A matrix with the same shape as self, with the specified axis removed.

See also

numpy.matrix.sum

NumPy’s implementation of ‘sum’ for matrices
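As an illustration of the axis argument, the following sketch sums a small scipy.sparse.csr_matrix (used here as a stand-in for the class documented above). With an axis given, SciPy's matrix classes return a two-dimensional numpy.matrix, so the example also shows how to flatten the result.

```python
import numpy as np
from scipy.sparse import csr_matrix

m = csr_matrix(np.array([[1, 2, 3],
                         [4, 5, 6]]))

total = m.sum()            # axis=None: a scalar
col_sums = m.sum(axis=0)   # numpy.matrix of shape (1, 3)
row_sums = m.sum(axis=1)   # numpy.matrix of shape (2, 1)

# Flatten to plain 1-D arrays when a matrix is not wanted.
col_sums_1d = np.asarray(col_sums).ravel()
row_sums_1d = np.asarray(row_sums).ravel()
```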

sum_duplicates()

Eliminate duplicate matrix entries by adding them together

This is an in place operation.

tan()

Element-wise tan.

See numpy.tan for more information.

tanh()

Element-wise tanh.

See numpy.tanh for more information.

toarray(order=None, out=None)

Return a dense ndarray representation of this matrix.

Parameters
order{‘C’, ‘F’}, optional

Whether to store multidimensional data in C (row-major) or Fortran (column-major) order in memory. The default is ‘None’, which provides no ordering guarantees. Cannot be specified in conjunction with the out argument.

outndarray, 2-D, optional

If specified, uses this array as the output buffer instead of allocating a new array to return. The provided array must have the same shape and dtype as the sparse matrix on which you are calling the method. For most sparse types, out is required to be memory contiguous (either C or Fortran ordered).

Returns
arrndarray, 2-D

An array with the same shape and containing the same data represented by the sparse matrix, with the requested memory order. If out was passed, the same object is returned after being modified in-place to contain the appropriate values.

tobsr(blocksize=None, copy=True)

Convert this matrix to Block Sparse Row format.

With copy=False, the data/indices may be shared between this matrix and the resultant bsr_matrix.

When blocksize=(R, C) is provided, it will be used for construction of the bsr_matrix.

tocoo(copy=True)

Convert this matrix to COOrdinate format.

With copy=False, the data/indices may be shared between this matrix and the resultant coo_matrix.

tocsc(copy=False)

Convert this matrix to Compressed Sparse Column format.

With copy=False, the data/indices may be shared between this matrix and the resultant csc_matrix.

tocsr(copy=False)

Convert this matrix to Compressed Sparse Row format.

With copy=False, the data/indices may be shared between this matrix and the resultant csr_matrix.

todense(order=None, out=None)

Return a dense matrix representation of this matrix.

Parameters
order{‘C’, ‘F’}, optional

Whether to store multi-dimensional data in C (row-major) or Fortran (column-major) order in memory. The default is ‘None’, which provides no ordering guarantees. Cannot be specified in conjunction with the out argument.

outndarray, 2-D, optional

If specified, uses this array (or numpy.matrix) as the output buffer instead of allocating a new array to return. The provided array must have the same shape and dtype as the sparse matrix on which you are calling the method.

Returns
arrnumpy.matrix, 2-D

A NumPy matrix object with the same shape and containing the same data represented by the sparse matrix, with the requested memory order. If out was passed and was an array (rather than a numpy.matrix), it will be filled with the appropriate values and returned wrapped in a numpy.matrix object that shares the same memory.
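The practical difference between toarray and todense is the return type. The sketch below, again using scipy.sparse.csr_matrix as a stand-in, shows why it matters: the * operator is element-wise on an ndarray but a matrix product on a numpy.matrix.

```python
import numpy as np
from scipy.sparse import csr_matrix

m = csr_matrix(np.array([[0, 1],
                         [2, 0]]))

as_array = m.toarray()    # numpy.ndarray
as_matrix = m.todense()   # numpy.matrix

elementwise = as_array * as_array        # element-wise product
matrix_product = as_matrix * as_matrix   # matrix product
```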

todia(copy=False)

Convert this matrix to sparse DIAgonal format.

With copy=False, the data/indices may be shared between this matrix and the resultant dia_matrix.

todok(copy=False)

Convert this matrix to Dictionary Of Keys format.

With copy=False, the data/indices may be shared between this matrix and the resultant dok_matrix.

tolil(copy=False)

Convert this matrix to List of Lists format.

With copy=False, the data/indices may be shared between this matrix and the resultant lil_matrix.
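The conversion methods above change only the storage format, not the values. A minimal check, assuming a scipy.sparse.coo_matrix as the starting point:

```python
import numpy as np
from scipy.sparse import coo_matrix

rows = np.array([0, 1, 2])
cols = np.array([2, 0, 1])
vals = np.array([5.0, 6.0, 7.0])
coo = coo_matrix((vals, (rows, cols)), shape=(3, 3))

# Map each resulting format name to whether the dense content is unchanged.
formats = {}
for convert in (coo.tocsr, coo.tocsc, coo.tolil, coo.todok, coo.todia, coo.tobsr):
    converted = convert()
    formats[converted.format] = np.array_equal(converted.toarray(), coo.toarray())
```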

trace(offset=0)

Returns the sum along diagonals of the sparse matrix.

Parameters
offsetint, optional

Which diagonal to get, corresponding to elements a[i, i+offset]. Default: 0 (the main diagonal).

transpose(axes=None, copy=False)

Reverses the dimensions of the sparse matrix.

Parameters
axesNone, optional

This argument is in the signature solely for NumPy compatibility reasons. Do not pass in anything except for the default value.

copybool, optional

Indicates whether or not attributes of self should be copied whenever possible. The degree to which attributes are copied varies depending on the type of sparse matrix being used.

Returns
pself with the dimensions reversed.

See also

numpy.matrix.transpose

NumPy’s implementation of ‘transpose’ for matrices

trunc()

Element-wise trunc.

See numpy.trunc for more information.

Boosting

multimodal.boosting.mumbo

Multimodal Boosting

This module contains a MultiModal Boosting (MuMBo) estimator for classification implemented in the MumboClassifier class.

class multimodal.boosting.mumbo.MumboClassifier(base_estimator=None, n_estimators=50, random_state=None, best_view_mode='edge')

A MuMBo classifier.

A MuMBo classifier is a meta-estimator that implements a multimodal (or multi-view) boosting algorithm:

It fits a set of classifiers on the original dataset split into several views and retains the classifier obtained for the best view. It then iterates the process on the same dataset, but with the weights of incorrectly classified instances adjusted so that subsequent classifiers focus more on difficult cases.

This class implements the MuMBo algorithm [1].

Parameters
base_estimatorobject, optional (default=DecisionTreeClassifier)

Base estimator from which the boosted ensemble is built. Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes. The default is a DecisionTreeClassifier with parameter max_depth=1.

n_estimatorsinteger, optional (default=50)

Maximum number of estimators at which boosting is terminated.

random_stateint, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

best_view_mode{“edge”, “error”}, optional (default=”edge”)

Mode used to select the best view at each iteration:

  • if best_view_mode == "edge", the best view is the view maximizing the edge value (variable δ (delta) in [1]),

  • if best_view_mode == "error", the best view is the view minimizing the classification error.
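The two selection rules can be sketched as follows. This is an illustrative helper rather than the library's code: the function name select_best_view and the per-view edge and error values are hypothetical.

```python
import numpy as np

def select_best_view(edges, errors, best_view_mode="edge"):
    """Return the index of the best view for one boosting iteration.

    Illustrative helper (not part of the library): `edges` and `errors`
    hold one value per view.
    """
    if best_view_mode == "edge":
        return int(np.argmax(edges))    # view maximizing the edge
    if best_view_mode == "error":
        return int(np.argmin(errors))   # view minimizing the error
    raise ValueError('best_view_mode must be "edge" or "error"')

edges = np.array([0.10, 0.35, 0.20])    # hypothetical edge of each view
errors = np.array([0.30, 0.45, 0.40])   # hypothetical error of each view
best_by_edge = select_best_view(edges, errors, "edge")
best_by_error = select_best_view(edges, errors, "error")
```

With these made-up values the two modes disagree: view 1 has the largest edge while view 0 has the smallest error.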

See also

sklearn.ensemble.AdaBoostClassifier
sklearn.ensemble.GradientBoostingClassifier
sklearn.tree.DecisionTreeClassifier

References

1

Sokol Koço, “Tackling the uneven views problem with cooperation based ensemble learning methods”, PhD Thesis, Aix-Marseille Université, 2013, http://www.theses.fr/en/2013AIXM4101.

Examples

>>> from multimodal.boosting.mumbo import MumboClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> views_ind = [0, 2, 4]  # view 0: sepal data, view 1: petal data
>>> clf = MumboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)  
MumboClassifier(random_state=0)
>>> print(clf.predict([[ 5.,  3.,  1.,  1.]]))
[1]
>>> views_ind = [[0, 2], [1, 3]]  # view 0: length data, view 1: width data
>>> clf = MumboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)  
MumboClassifier(random_state=0)
>>> print(clf.predict([[ 5.,  3.,  1.,  1.]]))
[1]
>>> from sklearn.tree import DecisionTreeClassifier
>>> base_estimator = DecisionTreeClassifier(max_depth=2)
>>> clf = MumboClassifier(base_estimator=base_estimator, random_state=0)
>>> clf.fit(X, y, views_ind)  
MumboClassifier(base_estimator=DecisionTreeClassifier(max_depth=2),
                random_state=0)
>>> print(clf.predict([[ 5.,  3.,  1.,  1.]]))
[1]
Attributes
estimators_list of classifiers

Collection of fitted sub-estimators.

classes_numpy.ndarray, shape = (n_classes,)

Class labels.

n_classes_int

Number of classes.

estimator_weights_numpy.ndarray of floats, shape = (len(estimators_),)

Weights for each estimator in the boosted ensemble.

estimator_errors_array of floats

Empirical loss for each iteration.

best_views_numpy.ndarray of integers, shape = (len(estimators_),)

Indices of the best view for each estimator in the boosted ensemble.

property base_estimator_

Estimator used to grow the ensemble.

decision_function(X)

Compute the decision function of X.

Parameters
X{array-like, sparse matrix}, shape = (n_samples, n_views * n_features)

Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR. X may also be a MultiModalData object.

Returns
dec_funnumpy.ndarray, shape = (n_samples, k)

Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <=0 mean classification in the first class in classes_ and values >0 mean classification in the second class in classes_.
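For the binary case, the sign rule can be applied by hand as in the sketch below. The class labels and decision values are hypothetical, not output of a fitted MumboClassifier.

```python
import numpy as np

classes_ = np.array(["setosa", "versicolor"])   # hypothetical fitted labels
dec_fun = np.array([[-1.2], [0.0], [0.7]])       # hypothetical output with k == 1

# Values <= 0 select classes_[0]; values > 0 select classes_[1].
predicted = classes_[(dec_fun.ravel() > 0).astype(int)]
```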

property estimator_

Estimator used to grow the ensemble.

fit(X, y, views_ind=None)

Build a multimodal boosted classifier from the training set (X, y).

Parameters
Xdict, MultiModalData, MultiModalArray, MultiModalSparseArray or {array-like, sparse matrix}, shape = (n_samples, n_features)

Training multi-view input samples, given either as a dictionary containing all the views or as one of the other listed types. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

yarray-like, shape = (n_samples,)

Target values (class labels).

views_indarray-like (default=[0, n_features//2, n_features])

Parameter specifying how to extract the data views from X:

  • If views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]].

    With this convention each view is therefore a view (in the NumPy sense) of X and no copy of the data is done.

  • If views_ind is an array of arrays of integers, then each array of integers views_ind[n] specifies the indices of the view n, which is then given by X[:, views_ind[n]].

    With this convention each view therefore creates a partial copy of the data in X. This convention is thus more flexible but less efficient than the previous one.

Returns
selfobject

Returns self.
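The two views_ind conventions accepted by fit can be reproduced with plain NumPy indexing. The sketch below only illustrates the slicing described above and does not call the library.

```python
import numpy as np

X = np.arange(12.0).reshape(3, 4)   # 3 samples, 4 features

# Convention 1: sorted slice limits. View n is X[:, limits[n]:limits[n + 1]].
limits = [0, 2, 4]
slice_views = [X[:, limits[n]:limits[n + 1]] for n in range(len(limits) - 1)]

# Convention 2: one array of column indices per view.
indices = [[0, 2], [1, 3]]
index_views = [X[:, idx] for idx in indices]

# Slices are NumPy views of X; index arrays produce copies.
slices_share_memory = all(np.shares_memory(v, X) for v in slice_views)
indices_share_memory = any(np.shares_memory(v, X) for v in index_views)
```

The memory check reflects the trade-off stated above: slice limits avoid copying, while index arrays allow non-contiguous columns at the cost of a copy.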

get_params(deep=True)

Get parameters for this estimator.

Parameters
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
paramsdict

Parameter names mapped to their values.

predict(X)

Predict classes for X.

The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.

Parameters
X{array-like, sparse matrix}, shape = (n_samples, n_features)

Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns
ynumpy.ndarray, shape = (n_samples,)

Predicted classes.

score(X, y)

Return the mean accuracy on the given test data and labels.

Parameters
X{array-like, sparse matrix} of shape = (n_samples, n_features)

Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

yarray-like, shape = (n_samples,)

True labels for X.

Returns
scorefloat

Mean accuracy of self.predict(X) wrt. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**paramsdict

Estimator parameters.

Returns
selfestimator instance

Estimator instance.
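The <component>__<parameter> form is the standard scikit-learn convention. A minimal sketch with a scikit-learn Pipeline (the step names "scale" and "tree" are arbitrary choices for this example):

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

pipe = Pipeline([("scale", StandardScaler()),
                 ("tree", DecisionTreeClassifier(max_depth=1))])

# <component>__<parameter> reaches into the nested estimator.
pipe.set_params(tree__max_depth=3)
depth = pipe.get_params()["tree__max_depth"]
```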

staged_decision_function(X)

Compute decision function of X for each boosting iteration.

This method allows monitoring (e.g. determining the error on a test set) after each boosting iteration.

Parameters
X{array-like, sparse matrix}, shape = (n_samples, n_features)

Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR. X may also be a MultiModalData object.

Returns
dec_fungenerator of numpy.ndarrays, shape = (n_samples, k)

Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <=0 mean classification in the first class in classes_ and values >0 mean classification in the second class in classes_.

staged_predict(X)

Return staged predictions for X.

The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.

This generator method yields the ensemble prediction after each iteration of boosting and therefore allows monitoring, such as to determine the prediction on a test set after each boost.

Parameters
X{array-like, sparse matrix} of shape = (n_samples, n_features)

Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns
ygenerator of numpy.ndarrays, shape = (n_samples,)

Predicted classes.

staged_score(X, y)

Return staged mean accuracy on the given test data and labels.

This generator method yields the ensemble score after each iteration of boosting and therefore allows monitoring, such as to determine the score on a test set after each boost.

Parameters
X{array-like, sparse matrix} of shape = (n_samples, n_features)

Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

yarray-like, shape = (n_samples,)

True labels for X.

Returns
scoregenerator of floats

Mean accuracy of self.staged_predict(X) wrt. y.
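The staged methods are generators, so monitoring amounts to iterating over them. The sketch below uses scikit-learn's AdaBoostClassifier as a stand-in, since it exposes a staged_score generator of the same kind; it is not a run of MumboClassifier itself.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier

X, y = load_iris(return_X_y=True)
clf = AdaBoostClassifier(n_estimators=10, random_state=0).fit(X, y)

# One score per boosting iteration actually performed.
scores = list(clf.staged_score(X, y))

# Index of the iteration with the highest score (first one in case of ties).
best_iteration = max(range(len(scores)), key=scores.__getitem__)
```

In practice the scores would be computed on held-out data rather than on the training set, for example to choose the number of estimators.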

multimodal.boosting.combo

This module contains a MultiConfusion Matrix Boosting (MuCoMBo) estimator for classification implemented in the MuComboClassifier class.

class multimodal.boosting.combo.MuComboClassifier(base_estimator=None, n_estimators=50, random_state=None)

A MuCoMBo classifier.

A MuCoMBo classifier is a meta-estimator that implements a multimodal (or multi-view) boosting algorithm:

It fits a set of classifiers on the original dataset split into several views and retains the classifier obtained for the best view. It then iterates the process on the same dataset, but with the weights of incorrectly classified instances adjusted so that subsequent classifiers focus more on difficult cases.

This class implements the MuMBo algorithm [1].

Parameters
base_estimatorobject, optional (default=DecisionTreeClassifier)

Base estimator from which the boosted ensemble is built. Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes. The default is a DecisionTreeClassifier with parameter max_depth=1.

n_estimatorsinteger, optional (default=50)

Maximum number of estimators at which boosting is terminated.

random_stateint, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

See also

sklearn.ensemble.AdaBoostClassifier
sklearn.ensemble.GradientBoostingClassifier
sklearn.tree.DecisionTreeClassifier

References

1

Koço, Sokol and Capponi, Cécile, “A Boosting Approach to Multiview Classification with Cooperation”, Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II, 209–228, Springer-Verlag, 2011, https://link.springer.com/chapter/10.1007/978-3-642-23783-6_1

2

Sokol Koço, “Tackling the uneven views problem with cooperation based ensemble learning methods”, PhD Thesis, Aix-Marseille Université, 2013, http://www.theses.fr/en/2013AIXM4101.

Examples

>>> from multimodal.boosting.combo import MuComboClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> views_ind = [0, 2, 4]  # view 0: sepal data, view 1: petal data
>>> clf = MuComboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)  
MuComboClassifier(random_state=0)
>>> print(clf.predict([[ 5.,  3.,  1.,  1.]]))
[0]
>>> views_ind = [[0, 2], [1, 3]]  # view 0: length data, view 1: width data
>>> clf = MuComboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)  
MuComboClassifier(random_state=0)
>>> print(clf.predict([[ 5.,  3.,  1.,  1.]]))
[0]
>>> from sklearn.tree import DecisionTreeClassifier
>>> base_estimator = DecisionTreeClassifier(max_depth=2)
>>> clf = MuComboClassifier(base_estimator=base_estimator, random_state=1)
>>> clf.fit(X, y, views_ind)  
MuComboClassifier(base_estimator=DecisionTreeClassifier(max_depth=2),
                  random_state=1)
>>> print(clf.predict([[ 5.,  3.,  1.,  1.]]))
[0]
Attributes
estimators_list of classifiers

Collection of fitted sub-estimators.

classes_numpy.ndarray, shape = (n_classes,)

Class labels.

n_classes_int

Number of classes.

n_views_int

Number of views

estimator_weights_numpy.ndarray of floats, shape = (len(estimators_),)

Weights for each estimator in the boosted ensemble.

estimator_errors_array of floats

Empirical loss for each iteration.

best_views_numpy.ndarray of integers, shape = (len(estimators_),)

Indices of the best view for each estimator in the boosted ensemble.

n_yi_numpy.ndarray of int, shape = (n_classes,)

Number of training samples for each class.

property base_estimator_

Estimator used to grow the ensemble.

decision_function(X)

Compute the decision function of X.

Parameters
X{array-like, sparse matrix}, shape = (n_samples, n_features)

Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns
dec_funnumpy.ndarray, shape = (n_view, n_samples, k)

Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <=0 mean classification in the first class in classes_ and values >0 mean classification in the second class in classes_.

property estimator_

Estimator used to grow the ensemble.

fit(X, y, views_ind=None)

Build a multimodal boosted classifier from the training set (X, y).

Parameters
Xdict, MultiModalData, MultiModalArray, MultiModalSparseArray or {array-like, sparse matrix}, shape = (n_samples, n_features)

Training multi-view input samples, given either as a dictionary containing all the views or as one of the other listed types. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

yarray-like, shape = (n_samples,)

Target values (class labels).

views_indarray-like (default=[0, n_features//2, n_features])

Parameter specifying how to extract the data views from X:

  • If views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]].

    With this convention each view is therefore a view (in the NumPy sense) of X and no copy of the data is done.

  • If views_ind is an array of arrays of integers, then each array of integers views_ind[n] specifies the indices of the view n, which is then given by X[:, views_ind[n]].

    With this convention each view therefore creates a partial copy of the data in X. This convention is thus more flexible but less efficient than the previous one.

Returns
selfobject

Returns self.

Raises
ValueError if the estimator does not support sample_weight
ValueError if X and views_ind are not compatible

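The kind of compatibility check behind the second ValueError can be sketched as below for the slice-limit convention. The helper check_views_ind and its exact rules are hypothetical; the library's own validation may differ. Only the default value [0, n_features//2, n_features] is taken from the documentation above.

```python
import numpy as np

def check_views_ind(X, views_ind=None):
    """Hypothetical validator for slice-limit views_ind (not the library's code)."""
    n_features = X.shape[1]
    if views_ind is None:
        views_ind = [0, n_features // 2, n_features]   # documented default
    views_ind = np.asarray(views_ind)
    if views_ind.ndim != 1 or len(views_ind) < 2:
        raise ValueError("views_ind must hold at least two slice limits")
    if np.any(np.diff(views_ind) <= 0):
        raise ValueError("views_ind must be sorted in increasing order")
    if views_ind[0] < 0 or views_ind[-1] > n_features:
        raise ValueError("views_ind is not compatible with the shape of X")
    return views_ind

X = np.zeros((5, 4))
default_limits = check_views_ind(X)   # two views of two features each
```
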
get_params(deep=True)

Get parameters for this estimator.

Parameters
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
paramsdict

Parameter names mapped to their values.

predict(X)

Predict classes for X.

The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.

Parameters
X{array-like, sparse matrix}, shape = (n_samples, n_features)

Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns
ynumpy.ndarray, shape = (n_samples,)

Predicted classes.

Raises
ValueError if the input matrix X does not have the same total number of features as the X data used in fit

score(X, y)

Return the mean accuracy on the given test data and labels.

Parameters
X{array-like, sparse matrix} of shape = (n_samples, n_features)

Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

yarray-like, shape = (n_samples,)

True labels for X.

Returns
scorefloat

Mean accuracy of self.predict(X) wrt. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**paramsdict

Estimator parameters.

Returns
selfestimator instance

Estimator instance.

staged_decision_function(X)

Compute decision function of X for each boosting iteration.

This method allows monitoring (e.g. determining the error on a test set) after each boosting iteration.

Parameters
X{array-like, sparse matrix}, shape = (n_samples, n_features)

Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns
dec_fungenerator of numpy.ndarrays, shape = (n_samples, k)

Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <=0 mean classification in the first class in classes_ and values >0 mean classification in the second class in classes_.

staged_predict(X)

Return staged predictions for X.

The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.

This generator method yields the ensemble prediction after each iteration of boosting and therefore allows monitoring, such as to determine the prediction on a test set after each boost.

Parameters
X{array-like, sparse matrix} of shape = (n_samples, n_features)

Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns
ygenerator of numpy.ndarrays, shape = (n_samples,)

Predicted classes.

staged_score(X, y)

Return staged mean accuracy on the given test data and labels.

This generator method yields the ensemble score after each iteration of boosting and therefore allows monitoring, such as to determine the score on a test set after each boost.

Parameters
X{array-like, sparse matrix} of shape = (n_samples, n_features)

Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

yarray-like, shape = (n_samples,)

True labels for X.

Returns
scoregenerator of floats

Mean accuracy of self.staged_predict(X) wrt. y.

multimodal.boosting.boost

class multimodal.boosting.boost.UBoosting

Abstract class. MuComboClassifier and MumboClassifier should inherit from UBoosting to share its methods.

Kernels

multimodal.kernels.mvml

class multimodal.kernels.mvml.MVML(lmbda=0.1, eta=1, nystrom_param=1.0, kernel='linear', kernel_params=None, learn_A=1, learn_w=0, precision=0.0001, n_loops=6)

The MVML Classifier

Parameters
lmbdafloat (default=0.1)

Regression parameter used for basic regularization.

etafloat (default=1)

Regression parameter used for the regularization of A (not necessary if A is not learned).

kernelstr or list of str (default=“linear”)

Metric used for each kernel, given as a pairwise kernel function name or a list of such names among the functions defined in PAIRWISE_KERNEL_FUNCTIONS, for example [‘rbf’, ‘additive_chi2’, ‘linear’]. Set kernel to “precomputed” if the kernels themselves are passed as input to the fit function.

kernel_paramslist of dict (default=None)

List of dictionaries holding the parameters of the corresponding kernels (KERNEL_PARAMS), for example {‘gamma’:50} for one kernel.

nystrom_paramfloat (default=1.0)

Value between 0 and 1 indicating the level of Nyström approximation; 1 = no approximation.

learn_Ainteger (default=1)

Chooses whether A is learned: 1 - yes (default); 2 - yes, sparse; 3 - no (MVML_Cov); 4 - no (MVML_I).

learn_winteger (default=0)

Whether learning w is needed.

precisionfloat (default=1E-4)

Precision at which the algorithm stops.

n_loopsint (default=6)

Number of iterations.

Examples

>>> from multimodal.kernels.mvml import MVML
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> y[y>0] = 1
>>> views_ind = [0, 2, 4]  # view 0: sepal data, view 1: petal data
>>> clf = MVML()
>>> clf.get_params()
{'eta': 1, 'kernel': 'linear', 'kernel_params': None, 'learn_A': 1, 'learn_w': 0, 'lmbda': 0.1, 'n_loops': 6, 'nystrom_param': 1.0, 'precision': 0.0001}
>>> clf.fit(X, y, views_ind)  
MVML()
>>> print(clf.predict([[ 5.,  3.,  1.,  1.]]))
0
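The kernel and kernel_params parameters name functions from scikit-learn's PAIRWISE_KERNEL_FUNCTIONS. As a sketch of what one kernel per view means, the example below computes the Gram matrices by hand with sklearn.metrics.pairwise.pairwise_kernels. The data, the view limits and the parameter values are made up, and how MVML builds or consumes its kernels internally is not shown here.

```python
import numpy as np
from sklearn.metrics.pairwise import PAIRWISE_KERNEL_FUNCTIONS, pairwise_kernels

rng = np.random.RandomState(0)
X = rng.rand(6, 4)                     # 6 samples, 2 views of 2 features each
views_ind = [0, 2, 4]
kernel = ["rbf", "linear"]             # one kernel name per view
kernel_params = [{"gamma": 50}, {}]    # one parameter dict per view

kernels = []
for n, (name, params) in enumerate(zip(kernel, kernel_params)):
    assert name in PAIRWISE_KERNEL_FUNCTIONS
    view = X[:, views_ind[n]:views_ind[n + 1]]
    # Gram matrix of shape (n_samples, n_samples) for this view.
    kernels.append(pairwise_kernels(view, metric=name, **params))
```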
Attributes
lmbdafloat

Regression parameter lmbda (default = 0.1).

etafloat

Regression parameter eta (default = 1).

regression_paramsarray/list

Regression parameters.

kernelstr or list of str

Metric used for each kernel, given as a pairwise kernel function name or a list of such names among the functions defined in PAIRWISE_KERNEL_FUNCTIONS, for example [‘rbf’, ‘additive_chi2’, ‘linear’], or kernel=[‘rbf’, ‘rbf’] for the first two views.

kernel_paramslist of dict

Parameters of the corresponding kernels (KERNEL_PARAMS).

learn_Ainteger

Set to 1 when learning the matrix A is needed.

learn_winteger

Whether learning w is needed.

precisionfloat (default=1E-4)

Precision at which the algorithm stops.

n_loopsint

Number of iterations.

n_approxint

Number of samples in the approximation; equals n if no approximation is used.

classes_array-like

Unique class labels.

warning_messagedict

Warning messages.

X_metriclearning.datasets.data_sample.Metriclearn_array

Array of input samples.

K_metriclearning.datasets.data_sample.Metriclearn_array

Array of processed kernels.

y_array-like, shape = (n_samples,)

Target values (class labels).

regression_bool (default=False)

Whether the classifier is used for regression.

decision_function(X)

Compute the decision function of X.

Parameters
X{array-like, sparse matrix}, shape = (n_samples, n_views * n_features)

Multi-view input samples. X may also be a MultiModalData object.

Returns
dec_funnumpy.ndarray, shape = (n_samples, )

Decision function of the input samples. For binary classification, values <=0 mean classification in the first class in classes_ and values >0 mean classification in the second class in classes_.

fit(X, y=None, views_ind=None)

Fit the MVML classifier

Parameters
Xdifferent formats are supported
  • Metriclearn_array {array-like, sparse matrix}, shape = (n_samples, n_features): training multi-view input samples. Can also be a kernel when the attribute ‘kernel’ is set to “precomputed”.

  • Dictionary of {array like} with shape = (n_samples, n_features), one entry for each view.

  • Array of {array like} with shape = (n_samples, n_features), one entry for each view.

  • {array like} with shape (n_samples, nviews * n_features), with ‘views_ind’ different from ‘None’.

yarray-like, shape = (n_samples,)

Target values (class labels): array of length n_samples containing the classification/regression labels for the training data.

views_indarray-like (default=[0, n_features//2, n_features])

Parameter specifying how to extract the data views from X:

  • If views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]].

    With this convention each view is therefore a view (in the NumPy sense) of X and no copy of the data is done.

Returns
selfobject

Returns self.
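The dictionary format and the single-array format with views_ind carry the same information. The sketch below converts one into the other with NumPy only. The dictionary contents are hypothetical, and this is not the conversion code used by the library.

```python
import numpy as np

# Hypothetical per-view arrays: 3 samples, views of 2 and 5 features.
views = {0: np.ones((3, 2)), 1: np.zeros((3, 5))}

# Stack the views column-wise and record the slice limits between them.
ordered = [views[key] for key in sorted(views)]
X = np.hstack(ordered)
views_ind = np.cumsum([0] + [v.shape[1] for v in ordered])

# Each view can be recovered as X[:, views_ind[n]:views_ind[n + 1]].
recovered = [X[:, views_ind[n]:views_ind[n + 1]] for n in range(len(ordered))]
```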

get_params(deep=True)

Get parameters for this estimator.

Parameters
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
paramsdict

Parameter names mapped to their values.

predict(X)

Parameters
Xdifferent formats are supported
  • Metriclearn_array {array-like, sparse matrix}, shape = (n_samples, n_features): multi-view input samples. Can also be a kernel when the attribute ‘kernel’ is set to “precomputed”.

  • Dictionary of {array like} with shape = (n_samples, n_features), one entry for each view.

  • Array of {array like} with shape = (n_samples, n_features), one entry for each view.

  • {array like} with shape (n_samples, nviews * n_features), with ‘views_ind’ different from ‘None’.

Returns
ynumpy.ndarray, shape = (n_samples,)

Predicted classes.

score(X, y)

Return the mean accuracy on the given test data and labels.

Parameters
X{array-like} of shape = (n_samples, n_features)
yarray-like, shape = (n_samples,)

True labels for X.

Returns
scorefloat

Mean accuracy of self.predict(X) wrt. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**paramsdict

Estimator parameters.

Returns
selfestimator instance

Estimator instance.

multimodal.kernels.lpMKL

class multimodal.kernels.lpMKL.MKL(lmbda, nystrom_param=1.0, kernel='linear', kernel_params=None, use_approx=True, precision=0.0001, n_loops=50)

MKL Classifier for multiview learning

Parameters
lmbdafloat

Coefficient for combined kernels.

nystrom_paramfloat (default=1.0)

Value between 0 and 1 indicating the level of Nyström approximation; 1 = no approximation.

kernelstr or list of str (default=“linear”)

Metric used for each kernel, given as a pairwise kernel function name or a list of such names among the functions defined in PAIRWISE_KERNEL_FUNCTIONS, for example [‘rbf’, ‘additive_chi2’, ‘linear’]. Set kernel to “precomputed” if the kernels themselves are passed as input to the fit function.

kernel_paramslist of dict (default=None)

List of dictionaries holding the parameters of the corresponding kernels (KERNEL_PARAMS), for example {‘gamma’:50} for one kernel.

use_approxbool (default=True)

Whether to use the approximation when m_param < 1.

n_loopsint (default=50)

Number of iterations.

Attributes
lmbdafloat

Coefficient for combined kernels.

m_paramfloat (default=1.0)

Value between 0 and 1 indicating the level of Nyström approximation; 1 = no approximation.

kernelstr or list of str

Metric used for each kernel, given as a pairwise kernel function name or a list of such names among the functions defined in PAIRWISE_KERNEL_FUNCTIONS, for example [‘rbf’, ‘additive_chi2’, ‘linear’], or kernel=[‘rbf’, ‘rbf’] for the first two views.

kernel_paramslist of dict

Parameters of the corresponding kernels (KERNEL_PARAMS).

precisionfloat (default=1E-4)

Precision at which the algorithm stops.

n_loopsint

Number of iterations.

classes_array-like

Unique class labels.

X_metriclearning.datasets.data_sample.Metriclearn_array

Array of input samples.

K_metriclearning.datasets.data_sample.Metriclearn_array

Array of processed kernels.

y_array-like, shape = (n_samples,)

Target values (class labels).

C

Solution that is learned in MKL.

weights

Learned weights for combining the solutions of the views.

decision_function(X)

Compute the decision function of X.

Parameters
X : dict, MultiModalData, MultiModalArray or array-like

Multi-view input samples, given either as a dictionary with all views, each an {array-like} with shape = (n_samples, n_features), or as a MultiModalData, a MultiModalArray or an {array-like} with shape = (n_samples, n_features). Can also be a kernel when the attribute ‘kernel’ is set to “precomputed”.

Returns
dec_fun : numpy.ndarray, shape = (n_samples,)

Decision function of the input samples. For binary classification, values <=0 mean classification in the first class in classes_ and values >0 mean classification in the second class in classes_.
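The sign convention above can be sketched with NumPy. The class labels and decision values below are invented for the example; only the thresholding rule comes from the description.

```python
import numpy as np

classes_ = np.array(["neg", "pos"])         # hypothetical class labels
dec_fun = np.array([-1.3, 0.0, 0.2, 2.5])   # hypothetical decision values

# Values <= 0 map to classes_[0], values > 0 map to classes_[1].
y_pred = classes_[(dec_fun > 0).astype(int)]
```

Note that a decision value of exactly 0 falls in the first class under this rule.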

fit(X, y=None, views_ind=None)
Parameters
X : different formats are supported
  • Metriclearn_array {array-like, sparse matrix}, shape = (n_samples, n_features): training multi-view input samples. Can also be a kernel when the attribute ‘kernel’ is set to “precomputed”.

  • Dictionary of {array-like}, with shape = (n_samples, n_features) for each view.

  • Array of {array-like}, with shape = (n_samples, n_features) for each view.

  • {array-like} with shape = (n_samples, nviews * n_features), with ‘views_ind’ different from ‘None’.

y : array-like, shape = (n_samples,)

Target values (class labels): array of length n_samples containing the classification/regression labels for the training data.

views_ind : array-like (default=[0, n_features//2, n_features])

Parameter specifying how to extract the data views from X:

  • views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]].

    With this convention each view is therefore a view (in the NumPy sense) of X and no copy of the data is done.

Returns
self : object

Returns self.
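The views_ind slicing convention described for fit can be checked with plain NumPy. This sketch uses a small made-up array and the documented default limits; it does not call the estimator itself.

```python
import numpy as np

n_samples, n_features = 4, 6
X = np.arange(n_samples * n_features, dtype=float).reshape(n_samples, n_features)

# Documented default: two views split at the middle column.
views_ind = np.array([0, n_features // 2, n_features])

# View n is X[:, views_ind[n]:views_ind[n+1]].
views = {
    n: X[:, views_ind[n]:views_ind[n + 1]]
    for n in range(len(views_ind) - 1)
}

# Basic slicing returns NumPy views, so no data is copied.
shares = [np.shares_memory(v, X) for v in views.values()]
```

Because each slice is a view of X, writing into views[0] also changes the matching columns of X.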

get_params(deep=True)

Get parameters for this estimator.

Parameters
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params : dict

Parameter names mapped to their values.
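The effect of deep is easiest to see on a nested estimator. The sketch below uses a scikit-learn Pipeline rather than MKL, on the assumption that MKL follows the standard scikit-learn get_params contract quoted above; the step names "scale" and "clf" are arbitrary.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe = Pipeline([("scale", StandardScaler()), ("clf", SVC(C=1.0))])

# deep=False: only the pipeline's own constructor parameters.
shallow = pipe.get_params(deep=False)

# deep=True: also the parameters of the contained estimators,
# keyed as <component>__<parameter>.
deep = pipe.get_params(deep=True)
```

For an estimator with no sub-estimators among its parameters, both calls return the same mapping.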

learn_lpMKL()

Function performing the lpMKL learning.

Returns
tuple (C, weights)
lpMKL_predict(X, C, weights)
Parameters
X : array-like, precomputed test kernels
C : corresponds to the learned confusion matrix
weights : learned weights
Returns
y : numpy.ndarray, shape = (n_samples,)

Predicted classes.

predict(X)
Parameters
X : dict, MultiModalData, MultiModalArray or array-like

Multi-view input samples, given either as a dictionary with all views, each an {array-like} with shape = (n_samples, n_features), or as a MultiModalData, a MultiModalArray or an {array-like} with shape = (n_samples, n_features). Can also be a kernel when the attribute ‘kernel’ is set to “precomputed”.

views_ind : array-like (default=[0, n_features//2, n_features])

Parameter specifying how to extract the data views from X:

  • views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]].

    With this convention each view is therefore a view (in the NumPy sense) of X and no copy of the data is done.

Returns
y : numpy.ndarray, shape = (n_samples,)

Predicted classes.

score(X, y)

Return the mean accuracy on the given test data and labels.

Parameters
X : dict, MultiModalData, MultiModalArray or array-like

Multi-view input samples, given either as a dictionary with all views, each an {array-like} with shape = (n_samples, n_features), or as a MultiModalData, a MultiModalArray or an {array-like} with shape = (n_samples, n_features). Can also be a kernel when the attribute ‘kernel’ is set to “precomputed”.

y : array-like, shape = (n_samples,)

True labels for X.

Returns
score : float

Mean accuracy of self.predict(X) with respect to y.
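Mean accuracy is the fraction of samples whose predicted label equals the true label. A minimal NumPy sketch with invented labels standing in for y and self.predict(X):

```python
import numpy as np

y_true = np.array([0, 1, 1, 0, 1])   # hypothetical true labels
y_pred = np.array([0, 1, 0, 0, 1])   # hypothetical predictions

# Fraction of positions where prediction and truth agree.
score = float(np.mean(y_pred == y_true))
```

Here four of the five predictions match, so the score is 0.8.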

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**params : dict

Estimator parameters.

Returns
self : estimator instance

Estimator instance.
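The <component>__<parameter> form can be illustrated with a scikit-learn Pipeline. The step names "scale" and "clf" are arbitrary, and the example assumes only the standard scikit-learn set_params behaviour described above, not anything specific to MKL.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe = Pipeline([("scale", StandardScaler()), ("clf", SVC())])

# Update parameters of two nested components in one call.
returned = pipe.set_params(clf__C=10.0, scale__with_mean=False)

new_C = pipe.named_steps["clf"].C
new_with_mean = pipe.named_steps["scale"].with_mean
```

set_params returns the estimator itself, so calls can be chained, for instance pipe.set_params(clf__C=10.0).fit(X, y).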

multimodal.kernels.mkernel

class multimodal.kernels.mkernel.MKernel

Abstract class that MKL and MVML should inherit from; it provides the methods for transforming kernels to and from data.

Attributes
W_sqrootinv_dict : dict of Nyström approximation kernels

In the case of Nyström approximation, a dictionary of reduced kernels is calculated.

kernel_params : list of dict

Parameters of the corresponding kernels (KERNEL_PARAMS).
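For readers unfamiliar with the Nyström approximation, the following is a generic NumPy sketch of the textbook construction, in which a kernel K is approximated by K[:, idx] W^{-1} K[idx, :] with W = K[idx, idx]. The function name nystrom_factor, the choice of landmark indices and the tolerance are hypothetical. The variable name W_sqrootinv is borrowed from the attribute above, but whether the library computes its reduced kernels this way is an assumption.

```python
import numpy as np

def nystrom_factor(K, landmark_idx, rel_tol=1e-10):
    """Return U such that U @ U.T approximates K, plus W^(-1/2)."""
    landmark_idx = np.asarray(landmark_idx)
    K_nm = K[:, landmark_idx]                      # samples vs. landmarks
    W = K[np.ix_(landmark_idx, landmark_idx)]      # landmarks vs. landmarks
    eigval, eigvec = np.linalg.eigh(W)
    keep = eigval > rel_tol * eigval.max()         # drop numerically null directions
    # Pseudo-inverse square root: V diag(1/sqrt(lambda)) V^T.
    W_sqrootinv = (eigvec[:, keep] / np.sqrt(eigval[keep])) @ eigvec[:, keep].T
    return K_nm @ W_sqrootinv, W_sqrootinv

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
K = X @ X.T                                        # linear kernel, rank 5

# Using every sample as a landmark reproduces K ("1 = no approximation").
U_full, _ = nystrom_factor(K, np.arange(20))

# Using half of the samples gives a reduced factor with 10 columns.
U_half, W_half = nystrom_factor(K, np.arange(10))
```

A fraction such as nystrom_param could plausibly select how many landmarks are kept, but that mapping is not stated in this section.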