This page focuses on the software architecture of AqNWB for implementing data recording and is mainly aimed at software developers. The recording system in AqNWB is built around several key concepts:
- Efficient data recording for individual datasets via BaseRecordingData objects, discussed in Recording datasets with BaseRecordingData
- Consistent multi-dataset recording through convenience methods defined on individual RegisteredType objects (e.g., TimeSeries::writeData), discussed in TimeSeries Convenience Methods for Consistent Recording
- Managing collections of recording objects through RecordingObjects, discussed in RecordingObjects for Managing Collections
Recording datasets with BaseRecordingData
AqNWB records datasets efficiently via BaseRecordingData objects. The main components involved in writing data to an NWB file via AqNWB are:
- DEFINE_DATASET_FIELD Macro
- BaseRecordingData
  - BaseRecordingData is a class that manages the recording process for a dataset.
  - It keeps track of the current position in the dataset where data should be written next via the m_position member.
  - It provides methods for writing data blocks to the dataset, such as writeDataBlock, which can handle different data types and dimensions.
- RegisteredType
  - RegisteredType maintains a cache of BaseRecordingData objects via the m_recordingDataCache member. The cache returns the same BaseRecordingData object each time it is requested, which improves performance and retains the recording position. This is essential for writing to a dataset in a streaming fashion, because each write continues from where the previous write left off. The cache also removes the need to maintain the objects manually and can hold an arbitrary number of BaseRecordingData objects, so individual neurodata_type classes do not need to manage their own recording state.
- BaseIO
- RecordingObjects
The DEFINE_DATASET_FIELD Macro for Recording
The DEFINE_DATASET_FIELD macro defines methods not only for reading datasets but also for recording to them. For each dataset field defined with this macro, a corresponding method is generated that returns a BaseRecordingData object configured for that specific dataset.
For example, consider a TimeSeries class with a 'data' field defined using the DEFINE_DATASET_FIELD macro. The macro is defined in RegisteredType.hpp with the following signature:
#define DEFINE_DATASET_FIELD(readName, writeName, default_type, fieldPath, description)
This generates not only a readData() method for reading the dataset but also a recordData() method that returns a BaseRecordingData object configured for writing to the 'data' dataset.
The generated recordData() method:
- Checks if a BaseRecordingData object for the dataset already exists in the cache
- If it exists and reset is false, returns the cached object
- If it doesn't exist or reset is true, gets a new BaseRecordingData object from the IO backend
- Calls RegisteredType::registerRecordingObject to register the current RegisteredType object for recording with the I/O, so that the object is finalized at the end of the recording.
- Caches the new object and returns it
This caching mechanism is crucial for maintaining the recording state across multiple writes to the same dataset.
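The steps above can be sketched in a few lines of standalone C++. This is a minimal illustration of the cache-or-create pattern, not the code generated by the macro: RecordingDataSketch, CachedTypeSketch, and the registration counter are hypothetical stand-ins for BaseRecordingData, RegisteredType, and RegisteredType::registerRecordingObject, and no real I/O backend is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for BaseRecordingData: it only tracks a write position.
struct RecordingDataSketch
{
  std::size_t position = 0;
};

// Hypothetical stand-in for a RegisteredType that caches recording objects.
class CachedTypeSketch
{
public:
  // Follows the steps listed above for a generated recordData() method.
  std::shared_ptr<RecordingDataSketch> recordData(const std::string& path,
                                                  bool reset = false)
  {
    auto it = m_recordingDataCache.find(path);
    if (!reset && it != m_recordingDataCache.end()) {
      return it->second;  // cached object keeps its recording position
    }
    // Stand-in for requesting a new recording object from the I/O backend.
    auto fresh = std::make_shared<RecordingDataSketch>();
    ++m_registrations;  // stand-in for registering with the I/O
    m_recordingDataCache[path] = fresh;
    return fresh;
  }

  int registrations() const { return m_registrations; }

private:
  std::unordered_map<std::string, std::shared_ptr<RecordingDataSketch>>
      m_recordingDataCache;
  int m_registrations = 0;
};

// Example: repeated requests share one object; reset yields a fresh one.
inline void recordDataExample()
{
  CachedTypeSketch ts;
  auto first = ts.recordData("data");
  first->position = 10;
  assert(ts.recordData("data") == first);
  assert(ts.recordData("data", true) != first);
}
```

Because the second request returns the same object, the position set by the first write is still available to the next write, which is what makes streaming writes continue where they left off.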
BaseRecordingData for Managing Recording
The BaseRecordingData class is responsible for managing the recording process for a dataset. It keeps track of the current position in the dataset where data should be written next, ensuring that data is written efficiently, especially for streaming data where multiple writes occur over time.
Key features of BaseRecordingData include:
- Position Tracking: BaseRecordingData keeps track of the current position in the dataset via the m_position member. This is particularly important for streaming data, where data is written in chunks over time.
- Data Type Handling: BaseRecordingData can handle different data types and dimensions through its writeDataBlock methods, making it flexible for various types of data.
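Position tracking can be illustrated with a small in-memory sketch. StreamingDatasetSketch is a hypothetical one-dimensional stand-in for BaseRecordingData that stores floats in a std::vector; the real class writes through the I/O backend and supports multiple data types and dimensions, which this sketch does not attempt to model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D stand-in for BaseRecordingData backed by an in-memory buffer.
class StreamingDatasetSketch
{
public:
  // Writes a block at the current position, then advances the position.
  void writeDataBlock(const std::vector<float>& block)
  {
    // Grow the storage if the block extends past the current end.
    if (m_storage.size() < m_position + block.size()) {
      m_storage.resize(m_position + block.size());
    }
    for (std::size_t i = 0; i < block.size(); ++i) {
      m_storage[m_position + i] = block[i];
    }
    m_position += block.size();
  }

  std::size_t position() const { return m_position; }
  const std::vector<float>& storage() const { return m_storage; }

private:
  std::vector<float> m_storage;
  std::size_t m_position = 0;  // where the next block will be written
};

// Example: two consecutive writes land back to back in the dataset.
inline void streamingExample()
{
  StreamingDatasetSketch ds;
  ds.writeDataBlock({1.0f, 2.0f});
  ds.writeDataBlock({3.0f});
  assert(ds.position() == 3);
  assert(ds.storage()[2] == 3.0f);
}
```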
TimeSeries Convenience Methods for Consistent Recording
Specific types like TimeSeries provide convenience methods for writing multiple datasets in a consistent manner. This ensures that related datasets (e.g., 'data' and 'timestamps' in a TimeSeries) are written consistently and simplifies the recording process.
The TimeSeries class provides:
- An initialize method that sets up all the necessary datasets and attributes for a time series, including data, timestamps, control, and their attributes (e.g., unit).
- A writeData method that writes data, timestamps, and control information in a single call, ensuring consistency between these related datasets.
These convenience methods handle the details of:
- Dataset Creation: Creating the necessary datasets if they don't exist.
- Data Alignment: Ensuring that related datasets (e.g., data and timestamps) are properly aligned.
- Position Management: Managing the current position in each dataset to ensure consistent writing.
- Error Handling: Handling errors that might occur during the writing process.
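The alignment and error-handling ideas can be sketched as follows. TimeSeriesSketch and StatusSketch are hypothetical names, and the writeData signature shown here is simplified for illustration; it is not the signature of TimeSeries::writeData. The sketch rejects a block whose samples and timestamps differ in length, so the two datasets always advance together.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical status code for this sketch.
enum class StatusSketch { Success, Failure };

// Hypothetical stand-in for a time series with 'data' and 'timestamps'.
class TimeSeriesSketch
{
public:
  // Writes samples and their timestamps together, or writes nothing at all.
  StatusSketch writeData(const std::vector<float>& samples,
                         const std::vector<double>& timestamps)
  {
    if (samples.size() != timestamps.size()) {
      return StatusSketch::Failure;  // misaligned input is rejected
    }
    m_data.insert(m_data.end(), samples.begin(), samples.end());
    m_timestamps.insert(
        m_timestamps.end(), timestamps.begin(), timestamps.end());
    return StatusSketch::Success;
  }

  std::size_t numSamples() const { return m_data.size(); }
  std::size_t numTimestamps() const { return m_timestamps.size(); }

private:
  std::vector<float> m_data;
  std::vector<double> m_timestamps;
};

// Example: a misaligned write fails and leaves both datasets untouched.
inline void alignedWriteExample()
{
  TimeSeriesSketch ts;
  std::vector<float> samples = {1.0f, 2.0f};
  std::vector<double> good = {0.0, 0.1};
  std::vector<double> bad = {0.2};
  assert(ts.writeData(samples, good) == StatusSketch::Success);
  assert(ts.writeData(samples, bad) == StatusSketch::Failure);
  assert(ts.numSamples() == ts.numTimestamps());
}
```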
RecordingObjects for Managing Collections
RecordingObjects provides an additional convenience layer for managing collections of RegisteredType Containers used for recording. This is particularly useful when recording data to multiple related containers, such as multiple TimeSeries objects.
RecordingObjects simplifies the process of:
NWB I/O convenience utilities
The src/io/nwbio_utils.hpp module provides convenience methods that help coordinate the recording process across multiple containers through specialized methods like:
Object and memory management
- The I/O owns a std::shared_ptr<RecordingObjects> m_recording_objects smart pointer to its RecordingObjects
- The RecordingObjects instance in turn tracks the RegisteredType objects used for recording with the I/O object by owning them in std::vector<std::shared_ptr<AQNWB::NWB::RegisteredType>> m_recording_objects
- RegisteredType objects in turn have a std::weak_ptr<IO::BaseIO> m_io weak pointer to the I/O.
- Note: RegisteredType::m_io is a std::weak_ptr to avoid a circular reference from the I/O to RecordingObjects to RegisteredType and back to the I/O. Such a cycle would prevent the reference-counted smart pointers from being cleaned up and would leak memory. It also means that a RegisteredType object does not own or keep the I/O alive, so we must first check that the I/O is still valid before using it.
In this design, the I/O essentially owns the recording process and the user in turn owns the I/O object. To ensure reliable management of the objects, the constructors of all RegisteredType classes are protected, requiring the use of the RegisteredType::create factory methods. This ensures that all instances of RegisteredType and its subclasses are created as std::shared_ptr smart pointers to facilitate reliable memory management.
- Note: Registration with the RecordingObjects of the I/O occurs when a RegisteredType is created via the static RegisteredType::create factory methods.
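The ownership chain described in this section can be sketched with standard smart pointers. The classes IOSketch, RecordingObjectsSketch, and RegisteredTypeSketch are hypothetical, simplified stand-ins for the corresponding AqNWB classes; only the pointer relationships and the protected-constructor factory pattern are modeled, under the assumption that create registers the new object with the I/O as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

class IOSketch;

// Hypothetical stand-in for RegisteredType.
class RegisteredTypeSketch
{
public:
  // Factory: the only public way to construct, and it registers with the I/O.
  static std::shared_ptr<RegisteredTypeSketch> create(
      const std::shared_ptr<IOSketch>& io);

  // True only while the I/O is still alive; check before using the I/O.
  bool hasValidIO() const { return !m_io.expired(); }

protected:
  explicit RegisteredTypeSketch(const std::shared_ptr<IOSketch>& io)
      : m_io(io)
  {
  }

private:
  std::weak_ptr<IOSketch> m_io;  // non-owning, so no reference cycle forms
};

// Hypothetical stand-in for RecordingObjects: owns the registered objects.
class RecordingObjectsSketch
{
public:
  void add(std::shared_ptr<RegisteredTypeSketch> obj)
  {
    m_recording_objects.push_back(std::move(obj));
  }
  std::size_t size() const { return m_recording_objects.size(); }

private:
  std::vector<std::shared_ptr<RegisteredTypeSketch>> m_recording_objects;
};

// Hypothetical stand-in for the I/O: owns its RecordingObjects.
class IOSketch
{
public:
  RecordingObjectsSketch& recordingObjects() { return *m_recording_objects; }

private:
  std::shared_ptr<RecordingObjectsSketch> m_recording_objects =
      std::make_shared<RecordingObjectsSketch>();
};

inline std::shared_ptr<RegisteredTypeSketch> RegisteredTypeSketch::create(
    const std::shared_ptr<IOSketch>& io)
{
  // std::make_shared cannot call a protected constructor, so use new here.
  std::shared_ptr<RegisteredTypeSketch> obj(new RegisteredTypeSketch(io));
  io->recordingObjects().add(obj);
  return obj;
}

// Example: releasing the user's I/O pointer destroys the I/O, because the
// registered object only holds a weak pointer back to it.
inline void ownershipExample()
{
  auto io = std::make_shared<IOSketch>();
  auto ts = RegisteredTypeSketch::create(io);
  assert(io->recordingObjects().size() == 1);
  io.reset();
  assert(!ts->hasValidIO());
}
```

If m_io were a std::shared_ptr instead, the I/O and the registered object would keep each other alive after io.reset(), which is the leak the weak pointer avoids.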
Further Reading