nmraspecds.dataset module
dataset module of the nmraspecds package.
- class nmraspecds.dataset.DatasetFactory
Bases:
DatasetFactory
Factory for creating dataset objects based on the source provided.
Particularly in case of recipe-driven data analysis (cf. aspecd.tasks), there is a need to automatically retrieve datasets using nothing more than a source string that can be, e.g., a path or LOI.
The DatasetFactory operates in conjunction with a cwepr.io.factory.DatasetImporterFactory to import the actual dataset. See the respective class documentation for more details.
- importer_factory
ImporterFactory instance used for importing datasets
- Type:
cwepr.io.factory.DatasetImporterFactory
- get_dataset(source='', importer='', parameters=None)
Return dataset object for dataset specified by its source.
The import of data into the dataset is handled using an instance of aspecd.io.DatasetImporterFactory.
The actual code for deciding which type of dataset to return in what case should be implemented in the non-public method _create_dataset() in any package based on the ASpecD framework.
- Parameters:
source (str) – String describing the source of the dataset
May be a filename or path, a URL/URI, a LOI, or similar
importer (str) – Name of the importer to use for importing the dataset
Default: ''
Added in version 0.2.
parameters (dict) – Additional parameters for controlling the import
Default: None
Added in version 0.2.
- Returns:
dataset – Dataset object of appropriate class
- Return type:
- Raises:
aspecd.exceptions.MissingSourceError – Raised if no source is provided
aspecd.exceptions.MissingImporterFactoryError – Raised if no ImporterFactory is available
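The control flow described for get_dataset() can be illustrated with a minimal, self-contained sketch. It does not use the ASpecD or nmraspecds code: every class below (SketchDataset, SketchImporter, SketchImporterFactory, SketchDatasetFactory) and the get_importer() call are hypothetical stand-ins, and the two exceptions are local substitutes for the aspecd.exceptions classes listed above.

```python
class MissingSourceError(Exception):
    """Local stand-in for aspecd.exceptions.MissingSourceError."""


class MissingImporterFactoryError(Exception):
    """Local stand-in for aspecd.exceptions.MissingImporterFactoryError."""


class SketchDataset:
    """Hypothetical dataset holding only an ID and the imported data."""

    def __init__(self):
        self.id = ""
        self.data = None


class SketchImporter:
    """Hypothetical importer that pretends to read a source."""

    def __init__(self, source=""):
        self.source = source

    def import_into(self, dataset):
        dataset.data = f"data read from {self.source}"


class SketchImporterFactory:
    """Hypothetical importer factory: one importer type for every source."""

    def get_importer(self, source="", importer="", parameters=None):
        return SketchImporter(source=source)


class SketchDatasetFactory:
    """Hypothetical dataset factory mirroring the documented behaviour."""

    def __init__(self):
        self.importer_factory = None

    def get_dataset(self, source="", importer="", parameters=None):
        if not source:
            raise MissingSourceError("A source is required")
        if not self.importer_factory:
            raise MissingImporterFactoryError("An importer factory is required")
        dataset = self._create_dataset(source=source)
        dataset.id = source
        importer_ = self.importer_factory.get_importer(
            source=source, importer=importer, parameters=parameters
        )
        importer_.import_into(dataset)
        return dataset

    @staticmethod
    def _create_dataset(source=""):
        # Derived packages would decide here which dataset class to return.
        return SketchDataset()


factory = SketchDatasetFactory()
factory.importer_factory = SketchImporterFactory()
dataset = factory.get_dataset(source="path/to/measurement")
print(dataset.id, "->", dataset.data)
```

The point of the design is that calling code only ever supplies a source string; the choice of dataset class and importer stays inside the two factories.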
- class nmraspecds.dataset.ExperimentalDataset
Bases:
ExperimentalDataset
Set of data uniting all relevant information.
Core element of the package, as all I/O, processing, analysis, and plotting steps are wrapped around a dataset which contains numerical data and metadata.
- add_reference(dataset=None)
Add a reference to another dataset to the list of references.
A reference is always an object of type aspecd.dataset.DatasetReference that will be automatically created from the dataset provided.
- Parameters:
dataset (aspecd.dataset.Dataset) – dataset for which a reference should be added to the list of references
- Raises:
aspecd.exceptions.MissingDatasetError – Raised if no dataset was provided
- analyse(analysis_step=None)
Apply analysis to dataset.
Every analysis step is an object of type aspecd.analysis.SingleAnalysisStep and is passed as an argument to analyse().
The information necessary to reproduce an analysis is stored in the analyses attribute as an object of class aspecd.dataset.AnalysisHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history.
- Parameters:
analysis_step (aspecd.analysis.SingleAnalysisStep) – analysis step to apply to the dataset
- Returns:
analysis_step – analysis step applied to the dataset
- Return type:
- analyze(analysis_step=None)
Apply analysis to dataset.
Same method as analyse(), but for those preferring American English (AE) over British English (BE).
- annotate(annotation_=None)
Add annotation to dataset.
- Parameters:
annotation_ (aspecd.annotation.DatasetAnnotation) – annotation to add to the dataset
- append_history_record(history_record)
Append history record to dataset history.
This method should never be called manually, but only from within classes of the ASpecD framework, at least as long as you are not interested in Orwellian History.
- Parameters:
history_record (aspecd.history.HistoryRecord) – History record (of a processing step) to be appended.
Changed in version 0.2: Converted into a public method, due to needs of aspecd.processing.MultiProcessingStep
- delete_analysis(index=None)
Remove analysis step record from dataset.
- Parameters:
index (int) – Index of the analysis in analyses to delete
- delete_annotation(index=None)
Remove annotation record from dataset.
- Parameters:
index (int) – Index of the annotation in annotations to delete
- delete_representation(index=None)
Remove representation record from dataset.
- Parameters:
index (int) – Index of the representation in representations to delete
- export_to(exporter=None)
Export data and metadata.
This requires initialising an aspecd.io.DatasetExporter object first that is provided as an argument for this method.
Note
The same operation can be performed by calling the export_from() method of an aspecd.io.Exporter object taking an aspecd.dataset.Dataset object as argument.
However, as usually the dataset is already at hand, first creating an instance of a respective exporter and then calling export_to() of the dataset is the preferred way.
- Parameters:
exporter (aspecd.io.DatasetExporter) – Exporter writing data and metadata to specific output format
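The note above describes two equivalent entry points. A minimal sketch of that delegation, using hypothetical stand-in classes rather than the ASpecD ones (SketchExporter and SketchDataset are invented for illustration, and the ValueError is a placeholder for whatever the framework raises when no exporter is given):

```python
class SketchExporter:
    """Hypothetical exporter that records what it would write."""

    def __init__(self):
        self.written = []

    def export_from(self, dataset):
        # A real exporter would write data and metadata to a file here.
        self.written.append((dataset.label, list(dataset.data)))


class SketchDataset:
    """Hypothetical dataset delegating the export to the exporter."""

    def __init__(self, label="", data=None):
        self.label = label
        self.data = data or []

    def export_to(self, exporter=None):
        if exporter is None:
            raise ValueError("No exporter provided")  # placeholder error
        exporter.export_from(self)


dataset = SketchDataset(label="1H spectrum", data=[0.1, 0.4, 0.2])
exporter = SketchExporter()

dataset.export_to(exporter)    # preferred when the dataset is at hand
exporter.export_from(dataset)  # same operation, started from the exporter

print(exporter.written)
```

Both calls end up in the exporter's export_from(), so the two entries recorded in this sketch are identical.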
- from_dict(dict_=None)
Set properties from dictionary.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Note
In conjunction with the aspecd.dataset.to_dict() method, this method allows dataset objects to be serialised and deserialised, which covers all kinds of storage to the persistence layer.
- Parameters:
dict_ (dict) – Dictionary containing properties to set
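The rule that only valid properties are set can be sketched with a plain Python class. This is an illustration under assumptions, not the ASpecD implementation: SketchDataset is hypothetical, and treating "valid property" as "existing public attribute" is an assumption of this sketch.

```python
class SketchDataset:
    """Hypothetical dataset with two public and one private attribute."""

    def __init__(self):
        self.id = ""
        self.label = ""
        self._private = "untouched"

    def to_dict(self):
        # Only public attributes are serialised.
        return {
            key: value
            for key, value in vars(self).items()
            if not key.startswith("_")
        }

    def from_dict(self, dict_=None):
        if not dict_:
            return
        for key, value in dict_.items():
            # Keys that are not existing public attributes are ignored.
            if key in vars(self) and not key.startswith("_"):
                setattr(self, key, value)


original = SketchDataset()
original.id = "sample-42"
original.label = "13C CP-MAS"

serialised = original.to_dict()
serialised["unknown_key"] = 1  # not a property of the class

restored = SketchDataset()
restored.from_dict(serialised)
print(restored.to_dict())
```

Unknown keys are silently skipped here, so a dictionary written by a newer or different class does not add stray attributes to the restored object.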
- import_from(importer=None)
Import data and metadata contained in importer object.
This requires initialising an aspecd.io.Importer object first that is provided as an argument for this method.
Note
The same operation can be performed by calling the import_into() method of an aspecd.io.Importer object taking an aspecd.dataset.Dataset object as argument.
However, as usually one wants to continue working with a dataset, first creating an instance of a dataset and a respective importer and then calling import_from() of the dataset is the preferred way.
- Parameters:
importer (aspecd.io.DatasetImporter) – Importer containing data and metadata read from some source
- load(filename=None)
Load dataset object from persistence layer.
The dataset will be loaded from a file conforming to the ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.
- property package_name
Return package name.
The name of the package the dataset is implemented in is a crucial detail for writing the history. The value is set automatically and is read-only.
- plot(plotter=None)
Perform plot with data of current dataset.
Every plotter is an object of type aspecd.plotting.Plotter and is passed as an argument to plot().
The information necessary to reproduce a plot is stored in the representations attribute as an object of class aspecd.dataset.PlotHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history. Besides being a necessary prerequisite to reproduce a plot, this allows plots requiring different incompatible preprocessing steps to be recreated automatically in arbitrary order.
- Parameters:
plotter (aspecd.plotting.Plotter) – plot to perform with data of current dataset
- Returns:
plotter – plot performed on the current dataset
- Return type:
- Raises:
aspecd.exceptions.MissingPlotterError – Raised when trying to plot without plotter
- process(processing_step=None)
Apply processing step to dataset.
Every processing step is an object of type aspecd.processing.SingleProcessingStep and is passed as an argument to process().
Calling this function ensures that the history record is added to the dataset and that a few basic checks are performed, such as for leading history, meaning that the _history_pointer is not set to the current tip of the history of the dataset. In this case, an error is raised.
Note
If processing_step is undoable, all previous plots stored in the list of representations will be removed, as these plots cannot be reproduced due to a change in _origdata.
- Parameters:
processing_step (aspecd.processing.SingleProcessingStep) – processing step to apply to the dataset
- Returns:
processing_step – processing step applied to the dataset
- Return type:
- Raises:
aspecd.exceptions.ProcessingWithLeadingHistoryError – Raised when trying to process with leading history
- redo()
Reapply previously undone processing step.
- Raises:
aspecd.exceptions.RedoAlreadyAtLatestChangeError – Raised when trying to redo with empty history
- remove_reference(dataset_id=None)
Remove a reference to another dataset from the list of references.
A reference is always an object of type aspecd.dataset.DatasetReference that was automatically created from the respective dataset when adding the reference.
- Parameters:
dataset_id (string) – ID of the dataset the reference should be removed for
- Raises:
aspecd.exceptions.MissingDatasetError – Raised if no dataset ID was provided
- save(filename=None)
Save dataset to persistence layer.
The dataset will be saved in ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.
- strip_history()
Remove leading history, if any.
If a dataset has a leading history, i.e., its history pointer does not point to the last entry of the history, and you want to perform a processing step on this very dataset, you first need to strip its history, as otherwise, a ProcessingWithLeadingHistoryError will be raised.
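What a leading history is, and why it needs stripping before further processing, can be shown with a small stand-alone sketch. The SketchDataset class is hypothetical, the exception is a local stand-in for the ASpecD one, and starting the history pointer at -1 is an assumption of this sketch.

```python
class ProcessingWithLeadingHistoryError(Exception):
    """Local stand-in for the ASpecD exception of the same name."""


class SketchDataset:
    """Hypothetical dataset tracking only history and history pointer."""

    def __init__(self):
        self.history = []
        self._history_pointer = -1  # assumption: -1 means empty history

    def _has_leading_history(self):
        # The pointer does not point to the last entry of the history.
        return self._history_pointer < len(self.history) - 1

    def process(self, step):
        if self._has_leading_history():
            raise ProcessingWithLeadingHistoryError
        self.history.append(step)
        self._history_pointer += 1

    def undo(self):
        # Only the pointer moves in this sketch; no data are involved.
        if self._history_pointer >= 0:
            self._history_pointer -= 1

    def strip_history(self):
        # Drop every entry after the one the pointer refers to.
        del self.history[self._history_pointer + 1:]


dataset = SketchDataset()
dataset.process("baseline correction")
dataset.process("normalisation")
dataset.undo()  # "normalisation" is now leading history

try:
    dataset.process("phase correction")
except ProcessingWithLeadingHistoryError:
    dataset.strip_history()
    dataset.process("phase correction")

print(dataset.history)
```

After stripping, the undone step is gone for good, which is why the framework asks for an explicit call rather than discarding it silently.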
- tabulate(table=None)
Create table from data of current dataset.
Every table is an object of type aspecd.table.Table and is passed as an argument to tabulate().
The information necessary to reproduce a table is stored in the representations attribute as an object of class aspecd.dataset.TableHistoryRecord.
- Parameters:
table (aspecd.table.Table) – table created from the data of the current dataset
- Returns:
table – table created from the data of the current dataset
- Return type:
- Raises:
TypeError – Raised when trying to tabulate without table
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
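The effect of remove_empty can be sketched with a stand-alone function. This is not the ASpecD implementation: the function to_dict(), the helper _is_empty(), and the classes Sample and Metadata are hypothetical, and the definition of "empty" used here (None and zero-length strings or containers, but not 0 or False) is an assumption of this sketch.

```python
from collections import OrderedDict


def _is_empty(value):
    """Assumed notion of empty: None or a zero-length string/container."""
    if value is None:
        return True
    return isinstance(value, (str, list, tuple, dict, set)) and len(value) == 0


def to_dict(obj, remove_empty=False):
    """Return the public attributes of obj in definition order."""
    result = OrderedDict()
    for key, value in vars(obj).items():
        if key.startswith("_"):
            continue  # non-public attributes are skipped
        if hasattr(value, "__dict__"):
            # Nested objects are traversed recursively.
            value = to_dict(value, remove_empty=remove_empty)
        if remove_empty and _is_empty(value):
            continue
        result[key] = value
    return result


class Sample:
    def __init__(self):
        self.name = "adamantane"
        self.comment = ""


class Metadata:
    def __init__(self):
        self.sample = Sample()
        self.operator = None
        self.scans = 0
        self._cache = "hidden"


full = to_dict(Metadata())
compact = to_dict(Metadata(), remove_empty=True)
print(list(full.keys()))
print(compact)
```

Keeping 0 and False while dropping None and empty strings is a deliberate choice in this sketch, since a numeric zero usually carries information.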
- undo()
Revert last processing step.
Actually, the history pointer is decremented and, starting from the _origdata, all processing steps are reapplied to the data up to this point in history.
- Raises:
aspecd.exceptions.UndoWithEmptyHistoryError – Raised when trying to undo with empty history
aspecd.exceptions.UndoAtBeginningOfHistoryError – Raised when trying to undo with history pointer at zero
aspecd.exceptions.UndoStepUndoableError – Raised when trying to undo an undoable step of history
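The replay strategy described for undo() (decrement the pointer, then reapply all steps starting from the original data) can be sketched as follows. SketchDataset and the two step functions are hypothetical, plain callables stand in for processing-step objects, IndexError replaces the specific ASpecD exceptions, and processing with a leading history is not handled here.

```python
import copy


class SketchDataset:
    """Hypothetical dataset that undoes steps by replaying its history."""

    def __init__(self, data):
        self._origdata = copy.deepcopy(data)
        self.data = copy.deepcopy(data)
        self.history = []
        self._history_pointer = -1  # assumption: -1 means original data

    def process(self, step):
        self.data = step(self.data)
        self.history.append(step)
        self._history_pointer += 1

    def undo(self):
        if not self.history or self._history_pointer < 0:
            raise IndexError("nothing to undo")  # placeholder error
        self._history_pointer -= 1
        self._replay()

    def redo(self):
        if self._history_pointer >= len(self.history) - 1:
            raise IndexError("nothing to redo")  # placeholder error
        self._history_pointer += 1
        self._replay()

    def _replay(self):
        # Start from the original data and reapply all steps up to the pointer.
        self.data = copy.deepcopy(self._origdata)
        for step in self.history[: self._history_pointer + 1]:
            self.data = step(self.data)


def subtract_offset(values):
    return [value - 1 for value in values]


def scale(values):
    return [value * 2 for value in values]


dataset = SketchDataset([1, 2, 3])
dataset.process(subtract_offset)
dataset.process(scale)
after_processing = list(dataset.data)

dataset.undo()
after_undo = list(dataset.data)

dataset.redo()
after_redo = list(dataset.data)

print(after_processing, after_undo, after_redo)
```

Replaying from the original data means no step has to provide an inverse operation; the cost is that every undo recomputes the whole history up to the pointer.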
- class nmraspecds.dataset.CalculatedDataset
Bases:
CalculatedDataset
Base class for datasets containing calculated data.
- add_reference(dataset=None)
Add a reference to another dataset to the list of references.
A reference is always an object of type aspecd.dataset.DatasetReference that will be automatically created from the dataset provided.
- Parameters:
dataset (aspecd.dataset.Dataset) – dataset for which a reference should be added to the list of references
- Raises:
aspecd.exceptions.MissingDatasetError – Raised if no dataset was provided
- analyse(analysis_step=None)
Apply analysis to dataset.
Every analysis step is an object of type aspecd.analysis.SingleAnalysisStep and is passed as an argument to analyse().
The information necessary to reproduce an analysis is stored in the analyses attribute as an object of class aspecd.dataset.AnalysisHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history.
- Parameters:
analysis_step (aspecd.analysis.SingleAnalysisStep) – analysis step to apply to the dataset
- Returns:
analysis_step – analysis step applied to the dataset
- Return type:
- analyze(analysis_step=None)
Apply analysis to dataset.
Same method as analyse(), but for those preferring American English (AE) over British English (BE).
- annotate(annotation_=None)
Add annotation to dataset.
- Parameters:
annotation_ (aspecd.annotation.DatasetAnnotation) – annotation to add to the dataset
- append_history_record(history_record)
Append history record to dataset history.
This method should never be called manually, but only from within classes of the ASpecD framework, at least as long as you are not interested in Orwellian History.
- Parameters:
history_record (aspecd.history.HistoryRecord) – History record (of a processing step) to be appended.
Changed in version 0.2: Converted into a public method, due to needs of aspecd.processing.MultiProcessingStep
- delete_analysis(index=None)
Remove analysis step record from dataset.
- Parameters:
index (int) – Index of the analysis in analyses to delete
- delete_annotation(index=None)
Remove annotation record from dataset.
- Parameters:
index (int) – Index of the annotation in annotations to delete
- delete_representation(index=None)
Remove representation record from dataset.
- Parameters:
index (int) – Index of the representation in representations to delete
- export_to(exporter=None)
Export data and metadata.
This requires initialising an aspecd.io.DatasetExporter object first that is provided as an argument for this method.
Note
The same operation can be performed by calling the export_from() method of an aspecd.io.Exporter object taking an aspecd.dataset.Dataset object as argument.
However, as usually the dataset is already at hand, first creating an instance of a respective exporter and then calling export_to() of the dataset is the preferred way.
- Parameters:
exporter (aspecd.io.DatasetExporter) – Exporter writing data and metadata to specific output format
- from_dict(dict_=None)
Set properties from dictionary.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Note
In conjunction with the aspecd.dataset.to_dict() method, this method allows dataset objects to be serialised and deserialised, which covers all kinds of storage to the persistence layer.
- Parameters:
dict_ (dict) – Dictionary containing properties to set
- import_from(importer=None)
Import data and metadata contained in importer object.
This requires initialising an aspecd.io.Importer object first that is provided as an argument for this method.
Note
The same operation can be performed by calling the import_into() method of an aspecd.io.Importer object taking an aspecd.dataset.Dataset object as argument.
However, as usually one wants to continue working with a dataset, first creating an instance of a dataset and a respective importer and then calling import_from() of the dataset is the preferred way.
- Parameters:
importer (aspecd.io.DatasetImporter) – Importer containing data and metadata read from some source
- load(filename=None)
Load dataset object from persistence layer.
The dataset will be loaded from a file conforming to the ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.
- property package_name
Return package name.
The name of the package the dataset is implemented in is a crucial detail for writing the history. The value is set automatically and is read-only.
- plot(plotter=None)
Perform plot with data of current dataset.
Every plotter is an object of type aspecd.plotting.Plotter and is passed as an argument to plot().
The information necessary to reproduce a plot is stored in the representations attribute as an object of class aspecd.dataset.PlotHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history. Besides being a necessary prerequisite to reproduce a plot, this allows plots requiring different incompatible preprocessing steps to be recreated automatically in arbitrary order.
- Parameters:
plotter (aspecd.plotting.Plotter) – plot to perform with data of current dataset
- Returns:
plotter – plot performed on the current dataset
- Return type:
- Raises:
aspecd.exceptions.MissingPlotterError – Raised when trying to plot without plotter
- process(processing_step=None)
Apply processing step to dataset.
Every processing step is an object of type aspecd.processing.SingleProcessingStep and is passed as an argument to process().
Calling this function ensures that the history record is added to the dataset and that a few basic checks are performed, such as for leading history, meaning that the _history_pointer is not set to the current tip of the history of the dataset. In this case, an error is raised.
Note
If processing_step is undoable, all previous plots stored in the list of representations will be removed, as these plots cannot be reproduced due to a change in _origdata.
- Parameters:
processing_step (aspecd.processing.SingleProcessingStep) – processing step to apply to the dataset
- Returns:
processing_step – processing step applied to the dataset
- Return type:
- Raises:
aspecd.exceptions.ProcessingWithLeadingHistoryError – Raised when trying to process with leading history
- redo()
Reapply previously undone processing step.
- Raises:
aspecd.exceptions.RedoAlreadyAtLatestChangeError – Raised when trying to redo with empty history
- remove_reference(dataset_id=None)
Remove a reference to another dataset from the list of references.
A reference is always an object of type aspecd.dataset.DatasetReference that was automatically created from the respective dataset when adding the reference.
- Parameters:
dataset_id (string) – ID of the dataset the reference should be removed for
- Raises:
aspecd.exceptions.MissingDatasetError – Raised if no dataset ID was provided
- save(filename=None)
Save dataset to persistence layer.
The dataset will be saved in ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.
- strip_history()
Remove leading history, if any.
If a dataset has a leading history, i.e., its history pointer does not point to the last entry of the history, and you want to perform a processing step on this very dataset, you first need to strip its history, as otherwise, a ProcessingWithLeadingHistoryError will be raised.
- tabulate(table=None)
Create table from data of current dataset.
Every table is an object of type aspecd.table.Table and is passed as an argument to tabulate().
The information necessary to reproduce a table is stored in the representations attribute as an object of class aspecd.dataset.TableHistoryRecord.
- Parameters:
table (aspecd.table.Table) – table created from the data of the current dataset
- Returns:
table – table created from the data of the current dataset
- Return type:
- Raises:
TypeError – Raised when trying to tabulate without table
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- undo()
Revert last processing step.
Actually, the history pointer is decremented and, starting from the _origdata, all processing steps are reapplied to the data up to this point in history.
- Raises:
aspecd.exceptions.UndoWithEmptyHistoryError – Raised when trying to undo with empty history
aspecd.exceptions.UndoAtBeginningOfHistoryError – Raised when trying to undo with history pointer at zero
aspecd.exceptions.UndoStepUndoableError – Raised when trying to undo an undoable step of history