data

class zfit.core.data.Data(dataset: Union[tensorflow.python.data.ops.dataset_ops.DatasetV2, LightDataset], obs: Union[str, Iterable[str], zfit.Space] = None, name: str = None, weights=None, iterator_feed_dict: Dict = None, dtype: tensorflow.python.framework.dtypes.DType = None)[source]

Bases: zfit.util.cache.Cachable, zfit.core.interfaces.ZfitData, zfit.core.dimension.BaseDimensional, zfit.core.baseobject.BaseObject

Create a data holder from a dataset used to feed into models.

Parameters
  • dataset – A dataset storing the actual values

  • obs – Observables where the data is defined in

  • name – Name of the Data

  • weights –

  • dtype –

BATCH_SIZE = 1000000
add_cache_dependents(cache_dependents: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]], allow_non_cachable: bool = True)

Add dependents that render the cache invalid if they change.

Parameters
  • cache_dependents (ZfitCachable) –

  • allow_non_cachable (bool) – If True, allow cache_dependents to be non-cachable. If False, any cache_dependent that is not a ZfitCachable will raise an error.

Raises

TypeError – if one of the cache_dependents is not a ZfitCachable _and_ allow_non_cachable is False.

property axes

Return the axes, the integer-based identifiers (indices) of the coordinate system.

convert_sort_space(obs: Union[str, Iterable[str], zfit.Space] = None, axes: Union[int, Iterable[int]] = None, limits: Union[zfit.core.interfaces.ZfitLimit, tensorflow.python.framework.ops.Tensor, numpy.ndarray, Iterable[float], float, Tuple[float], List[float], bool, None] = None) → Optional[zfit.core.space.Space][source]

Convert the inputs (using obs and/or axes, if given) to a Space and sort it according to the own obs.

Parameters
  • obs –

  • axes –

  • limits –

Returns:

copy(deep: bool = False, name: str = None, **overwrite_params) → zfit.core.interfaces.ZfitObject
property data_range
property dtype
classmethod from_numpy(obs: Union[str, Iterable[str], zfit.Space], array: numpy.ndarray, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)[source]

Create Data from a numpy array.

Parameters
  • obs (Union[str, Iterable[str], zfit.Space]) –

  • array (numpy.ndarray) –

  • name (str) –

Returns

Return type

zfit.Data
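
A minimal usage sketch, assuming an illustrative one-dimensional observable "x" with limits (-5, 5):

    import numpy as np
    import zfit

    obs = zfit.Space("x", limits=(-5, 5))      # illustrative observable
    array = np.random.normal(size=(1000, 1))   # shape (n_events, n_obs)

    data = zfit.Data.from_numpy(obs=obs, array=array)
    print(data.n_events, data.n_obs)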

classmethod from_pandas(df: pandas.core.frame.DataFrame, obs: Union[str, Iterable[str], zfit.Space] = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)[source]

Create a Data from a pandas DataFrame. If obs is None, columns are used as obs.

Parameters
  • df (pandas.DataFrame) –

  • weights (tf.Tensor, np.ndarray, None) – Weights of the data. Has to be 1-D and match the shape of the data (nevents).

  • obs (zfit.Space) –

  • name (str) –
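
A minimal sketch (the DataFrame content and observable name are illustrative); if obs is left as None, the DataFrame columns are used as observables:

    import pandas as pd
    import zfit

    df = pd.DataFrame({"mass": [5.1, 5.2, 5.3]})   # toy values
    obs = zfit.Space("mass", limits=(5.0, 5.5))

    data = zfit.Data.from_pandas(df=df, obs=obs)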

classmethod from_root(path: str, treepath: str, branches: List[str] = None, branches_alias: Dict = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray, str] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None, root_dir_options=None) → zfit.core.data.Data[source]

Create a Data from a ROOT file. Arguments are passed to uproot.

Parameters
  • path (str) –

  • treepath (str) –

  • branches (List[str]) –

  • branches_alias (dict) – A mapping from the branches (as keys) to the actual observables (as values). This allows using observable names that are independent of the branch names in the file.

  • weights (tf.Tensor, np.ndarray, str, None) – Weights of the data. Has to be 1-D and match the shape of the data (nevents). Can also be a column of the ROOT file; pass the corresponding branch name as a string.

  • name (str) –

  • root_dir_options –

Returns

Return type

zfit.Data
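
A hedged sketch; the file path, tree path and branch names below are placeholders for a real ROOT file:

    import zfit

    data = zfit.Data.from_root(
        path="events.root",              # placeholder file
        treepath="DecayTree",            # placeholder tree
        branches=["B_M"],
        branches_alias={"B_M": "mass"},  # branch "B_M" becomes observable "mass"
        weights="sweight",               # read weights from a branch of the file
    )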

classmethod from_root_iter(path, treepath, branches=None, entrysteps=None, name=None, **kwargs)[source]
classmethod from_tensor(obs: Union[str, Iterable[str], zfit.Space], tensor: tensorflow.python.framework.ops.Tensor, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None) → zfit.core.data.Data[source]

Create a Data from a tf.Tensor. value() simply returns the tensor (in the right order).

Parameters
  • obs (Union[str, List[str]]) –

  • tensor (tf.Tensor) –

  • name (str) –

Returns

Return type

zfit.core.Data
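
A minimal sketch, assuming an illustrative observable "x":

    import tensorflow as tf
    import zfit

    obs = zfit.Space("x", limits=(-1, 1))
    tensor = tf.random.uniform(shape=(500, 1), minval=-1, maxval=1,
                               dtype=tf.float64)

    data = zfit.Data.from_tensor(obs=obs, tensor=tensor)
    values = data.value()   # returns the tensor, ordered according to obs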

graph_caching_methods = []
property has_weights
instances = <_weakrefset.WeakSet object>
property n_events
property n_obs

Return the number of observables, the dimensionality. Corresponds to the last dimension.

property name

The name of the object.

property nevents
numpy()[source]
property obs

Return the observables, the string identifiers of the coordinate system.

register_cacher(cacher: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]])

Register a cacher that caches values produced by this instance; a dependent.

Parameters

cacher –

reset_cache(reseter: zfit.util.cache.ZfitCachable)
reset_cache_self()

Clear the cache of self and all dependent cachers.

set_data_range(data_range)[source]
set_weights(weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray])[source]

Set (temporarily) the weights of the dataset.

Parameters

weights (tf.Tensor, np.ndarray, None) –
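
A small sketch, assuming `data` is an existing Data object with 1000 events (the weights must be 1-D and match the number of events):

    import numpy as np

    weights = np.ones(1000)   # one weight per event
    data.set_weights(weights)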

sort_by_axes(axes: Union[int, Iterable[int]], allow_superset: bool = True)[source]
sort_by_obs(obs: Union[str, Iterable[str], zfit.Space], allow_superset: bool = False)[source]
property space
to_pandas(obs: Union[str, Iterable[str], zfit.Space] = None)[source]

Create a pd.DataFrame from obs as columns and return it.

Parameters

obs – The observables to use as columns. If None, all observables are used.

Returns:
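
A small sketch, assuming `data` is an existing Data object with an observable "x" among its obs:

    df_all = data.to_pandas()        # all observables as columns
    df_x = data.to_pandas(obs="x")   # only the "x" column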

unstack_x(obs: Union[str, Iterable[str], zfit.Space] = None, always_list: bool = False)[source]

Return the unstacked data: a list of tensors or a single Tensor.

Parameters
  • obs – Which observables to return.

  • always_list (bool) – If True, always return a list (also if length 1)

Returns

List[tf.Tensor] or tf.Tensor
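
A small sketch, assuming `data` holds the two observables "x" and "y":

    x, y = data.unstack_x()                               # one tensor per observable
    x_only = data.unstack_x(obs="x")                      # a single tensor
    x_list = data.unstack_x(obs="x", always_list=True)    # a list of length 1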

value(obs: Union[str, Iterable[str], zfit.Space] = None)[source]
property weights
class zfit.core.data.LightDataset(tensor)[source]

Bases: object

batch(batch_size)[source]
classmethod from_tensor(tensor)[source]
value()[source]
class zfit.core.data.SampleData(dataset: Union[tensorflow.python.data.ops.dataset_ops.DatasetV2, LightDataset], sample_holder: tensorflow.python.framework.ops.Tensor, obs: Union[str, Iterable[str], zfit.Space] = None, weights=None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = tf.float64)[source]

Bases: zfit.core.data.Data

Create a data holder from a dataset used to feed into models.

Parameters
  • dataset – A dataset storing the actual values

  • obs – Observables where the data is defined in

  • name – Name of the Data

  • weights –

  • dtype –

BATCH_SIZE = 1000000
add_cache_dependents(cache_dependents: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]], allow_non_cachable: bool = True)

Add dependents that render the cache invalid if they change.

Parameters
  • cache_dependents (ZfitCachable) –

  • allow_non_cachable (bool) – If True, allow cache_dependents to be non-cachable. If False, any cache_dependent that is not a ZfitCachable will raise an error.

Raises

TypeError – if one of the cache_dependents is not a ZfitCachable _and_ allow_non_cachable is False.

property axes

Return the axes, the integer-based identifiers (indices) of the coordinate system.

convert_sort_space(obs: Union[str, Iterable[str], zfit.Space] = None, axes: Union[int, Iterable[int]] = None, limits: Union[zfit.core.interfaces.ZfitLimit, tensorflow.python.framework.ops.Tensor, numpy.ndarray, Iterable[float], float, Tuple[float], List[float], bool, None] = None) → Optional[zfit.core.space.Space]

Convert the inputs (using obs and/or axes, if given) to a Space and sort it according to the own obs.

Parameters
  • obs –

  • axes –

  • limits –

Returns:

copy(deep: bool = False, name: str = None, **overwrite_params) → zfit.core.interfaces.ZfitObject
property data_range
property dtype
classmethod from_numpy(obs: Union[str, Iterable[str], zfit.Space], array: numpy.ndarray, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)

Create Data from a numpy array.

Parameters
  • obs (Union[str, Iterable[str], zfit.Space]) –

  • array (numpy.ndarray) –

  • name (str) –

Returns

Return type

zfit.Data

classmethod from_pandas(df: pandas.core.frame.DataFrame, obs: Union[str, Iterable[str], zfit.Space] = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)

Create a Data from a pandas DataFrame. If obs is None, columns are used as obs.

Parameters
  • df (pandas.DataFrame) –

  • weights (tf.Tensor, np.ndarray, None) – Weights of the data. Has to be 1-D and match the shape of the data (nevents).

  • obs (zfit.Space) –

  • name (str) –

classmethod from_root(path: str, treepath: str, branches: List[str] = None, branches_alias: Dict = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray, str] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None, root_dir_options=None) → zfit.core.data.Data

Create a Data from a ROOT file. Arguments are passed to uproot.

Parameters
  • path (str) –

  • treepath (str) –

  • branches (List[str]) –

  • branches_alias (dict) – A mapping from the branches (as keys) to the actual observables (as values). This allows using observable names that are independent of the branch names in the file.

  • weights (tf.Tensor, np.ndarray, str, None) – Weights of the data. Has to be 1-D and match the shape of the data (nevents). Can also be a column of the ROOT file; pass the corresponding branch name as a string.

  • name (str) –

  • root_dir_options –

Returns

Return type

zfit.Data

classmethod from_root_iter(path, treepath, branches=None, entrysteps=None, name=None, **kwargs)
classmethod from_sample(sample: tensorflow.python.framework.ops.Tensor, obs: Union[str, Iterable[str], zfit.Space], name: str = None, weights=None)[source]
classmethod from_tensor(obs: Union[str, Iterable[str], zfit.Space], tensor: tensorflow.python.framework.ops.Tensor, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None) → zfit.core.data.Data

Create a Data from a tf.Tensor. value() simply returns the tensor (in the right order).

Parameters
  • obs (Union[str, List[str]]) –

  • tensor (tf.Tensor) –

  • name (str) –

Returns

Return type

zfit.core.Data

classmethod get_cache_counting()[source]
graph_caching_methods = []
property has_weights
instances = <_weakrefset.WeakSet object>
property n_events
property n_obs

Return the number of observables, the dimensionality. Corresponds to the last dimension.

property name

The name of the object.

property nevents
numpy()
property obs

Return the observables, the string identifiers of the coordinate system.

register_cacher(cacher: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]])

Register a cacher that caches values produced by this instance; a dependent.

Parameters

cacher –

reset_cache(reseter: zfit.util.cache.ZfitCachable)
reset_cache_self()

Clear the cache of self and all dependent cachers.

set_data_range(data_range)
set_weights(weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray])

Set (temporarily) the weights of the dataset.

Parameters

weights (tf.Tensor, np.ndarray, None) –

sort_by_axes(axes: Union[int, Iterable[int]], allow_superset: bool = True)
sort_by_obs(obs: Union[str, Iterable[str], zfit.Space], allow_superset: bool = False)
property space
to_pandas(obs: Union[str, Iterable[str], zfit.Space] = None)

Create a pd.DataFrame from obs as columns and return it.

Parameters

obs – The observables to use as columns. If None, all observables are used.

Returns:

unstack_x(obs: Union[str, Iterable[str], zfit.Space] = None, always_list: bool = False)

Return the unstacked data: a list of tensors or a single Tensor.

Parameters
  • obs – Which observables to return.

  • always_list (bool) – If True, always return a list (also if length 1)

Returns

List[tf.Tensor] or tf.Tensor

value(obs: Union[str, Iterable[str], zfit.Space] = None)
property weights
class zfit.core.data.Sampler(dataset: zfit.core.data.LightDataset, sample_func: Callable, sample_holder: tensorflow.python.ops.variables.Variable, n: Union[int, float, complex, tensorflow.python.framework.ops.Tensor, Callable], weights=None, fixed_params: Dict[zfit.Parameter, Union[int, float, complex, tensorflow.python.framework.ops.Tensor]] = None, obs: Union[str, Iterable[str], zfit.Space] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = tf.float64)[source]

Bases: zfit.core.data.Data

Create a data holder from a dataset used to feed into models.

Parameters
  • dataset – A dataset storing the actual values

  • obs – Observables where the data is defined in

  • name – Name of the Data

  • weights –

  • dtype –

BATCH_SIZE = 1000000
add_cache_dependents(cache_dependents: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]], allow_non_cachable: bool = True)

Add dependents that render the cache invalid if they change.

Parameters
  • cache_dependents (ZfitCachable) –

  • allow_non_cachable (bool) – If True, allow cache_dependents to be non-cachable. If False, any cache_dependent that is not a ZfitCachable will raise an error.

Raises

TypeError – if one of the cache_dependents is not a ZfitCachable _and_ allow_non_cachable is False.

property axes

Return the axes, the integer-based identifiers (indices) of the coordinate system.

convert_sort_space(obs: Union[str, Iterable[str], zfit.Space] = None, axes: Union[int, Iterable[int]] = None, limits: Union[zfit.core.interfaces.ZfitLimit, tensorflow.python.framework.ops.Tensor, numpy.ndarray, Iterable[float], float, Tuple[float], List[float], bool, None] = None) → Optional[zfit.core.space.Space]

Convert the inputs (using obs and/or axes, if given) to a Space and sort it according to the own obs.

Parameters
  • obs –

  • axes –

  • limits –

Returns:

copy(deep: bool = False, name: str = None, **overwrite_params) → zfit.core.interfaces.ZfitObject
property data_range
property dtype
classmethod from_numpy(obs: Union[str, Iterable[str], zfit.Space], array: numpy.ndarray, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)

Create Data from a numpy array.

Parameters
  • obs (Union[str, Iterable[str], zfit.Space]) –

  • array (numpy.ndarray) –

  • name (str) –

Returns

Return type

zfit.Data

classmethod from_pandas(df: pandas.core.frame.DataFrame, obs: Union[str, Iterable[str], zfit.Space] = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)

Create a Data from a pandas DataFrame. If obs is None, columns are used as obs.

Parameters
  • df (pandas.DataFrame) –

  • weights (tf.Tensor, np.ndarray, None) – Weights of the data. Has to be 1-D and match the shape of the data (nevents).

  • obs (zfit.Space) –

  • name (str) –

classmethod from_root(path: str, treepath: str, branches: List[str] = None, branches_alias: Dict = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray, str] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None, root_dir_options=None) → zfit.core.data.Data

Create a Data from a ROOT file. Arguments are passed to uproot.

Parameters
  • path (str) –

  • treepath (str) –

  • branches (List[str]) –

  • branches_alias (dict) – A mapping from the branches (as keys) to the actual observables (as values). This allows using observable names that are independent of the branch names in the file.

  • weights (tf.Tensor, np.ndarray, str, None) – Weights of the data. Has to be 1-D and match the shape of the data (nevents). Can also be a column of the ROOT file; pass the corresponding branch name as a string.

  • name (str) –

  • root_dir_options –

Returns

Return type

zfit.Data

classmethod from_root_iter(path, treepath, branches=None, entrysteps=None, name=None, **kwargs)
classmethod from_sample(sample_func: Callable, n: Union[int, float, complex, tensorflow.python.framework.ops.Tensor], obs: Union[str, Iterable[str], zfit.Space], fixed_params=None, name: str = None, weights=None, dtype=None)[source]
classmethod from_tensor(obs: Union[str, Iterable[str], zfit.Space], tensor: tensorflow.python.framework.ops.Tensor, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None) → zfit.core.data.Data

Create a Data from a tf.Tensor. value() simply returns the tensor (in the right order).

Parameters
  • obs (Union[str, List[str]]) –

  • tensor (tf.Tensor) –

  • name (str) –

Returns

Return type

zfit.core.Data

classmethod get_cache_counting()[source]
graph_caching_methods = []
property has_weights
instances = <_weakrefset.WeakSet object>
property n_events
property n_obs

Return the number of observables, the dimensionality. Corresponds to the last dimension.

property n_samples
property name

The name of the object.

property nevents
numpy()
property obs

Return the observables, the string identifiers of the coordinate system.

register_cacher(cacher: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]])

Register a cacher that caches values produced by this instance; a dependent.

Parameters

cacher –

resample(param_values: Mapping = None, n: Union[int, tensorflow.python.framework.ops.Tensor] = None)[source]

Update the sample by drawing a new sample. This affects any object that already uses this data.

All parameters that are not in the attribute fixed_params will use their current value for the creation of the new sample. A value can also be overridden for a single sampling by providing param_values, a mapping from Parameter to the temporary value.

Parameters
  • param_values (Dict) – A mapping from Parameter to a value. For the current sampling, the Parameter will use this value.

  • n (int, tf.Tensor) – The number of samples to produce. If the Sampler was created with anything other than a number or a tf.Tensor for n, this cannot be used.
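
A Sampler is typically obtained from a model rather than constructed directly; the sketch below assumes a hypothetical model `model` with a parameter `mu` and uses create_sampler:

    sampler = model.create_sampler(n=1000, fixed_params=True)

    sampler.resample()                          # new sample, current parameter values
    sampler.resample(param_values={mu: 0.5})    # override `mu` for this sampling only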

reset_cache(reseter: zfit.util.cache.ZfitCachable)
reset_cache_self()

Clear the cache of self and all dependent cachers.

set_data_range(data_range)
set_weights(weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray])

Set (temporarily) the weights of the dataset.

Parameters

weights (tf.Tensor, np.ndarray, None) –

sort_by_axes(axes: Union[int, Iterable[int]], allow_superset: bool = True)
sort_by_obs(obs: Union[str, Iterable[str], zfit.Space], allow_superset: bool = False)
property space
to_pandas(obs: Union[str, Iterable[str], zfit.Space] = None)

Create a pd.DataFrame from obs as columns and return it.

Parameters

obs – The observables to use as columns. If None, all observables are used.

Returns:

unstack_x(obs: Union[str, Iterable[str], zfit.Space] = None, always_list: bool = False)

Return the unstacked data: a list of tensors or a single Tensor.

Parameters
  • obs – Which observables to return.

  • always_list (bool) – If True, always return a list (also if length 1)

Returns

List[tf.Tensor] or tf.Tensor

value(obs: Union[str, Iterable[str], zfit.Space] = None)
property weights
zfit.core.data.feed_function(data, feed_val)
zfit.core.data.feed_function_for_partial_run(data)
zfit.core.data.fetch_function(data)