aqnwb 0.1.0
Implementing a new Neurodata Type

New neurodata_types typically inherit from either Container or Data, or from a more specialized subtype of the two. In any case, all classes that represent a neurodata_type defined in the schema should be implemented as a subtype of RegisteredType. Here we focus on how to implement new subclasses of RegisteredType. To learn more about how RegisteredType manages types and implements data read, see Implementation of data read.

How to Implement a RegisteredType

To implement a subclass of RegisteredType, follow these steps:

  1. Include the RegisteredType.hpp header file in your subclass header file (or the header of your more specific parent class that inherits from RegisteredType).
  2. Define your subclass by inheriting from RegisteredType (or one of its child classes). Ensure that your subclass implements a constructor with the arguments (const std::string& path, std::shared_ptr<IO::BaseIO> io), as the RegisteredType::create method expects this constructor signature.
    class MySubClass : public AQNWB::NWB::RegisteredType {
    public:
    MySubClass(const std::string& path, std::shared_ptr<IO::BaseIO> io)
    : RegisteredType(path, io) {}
    // Implement any additional methods or overrides here
    };
  3. Use the REGISTER_SUBCLASS macro to prepare your subclass for registration with the class registry defined by RegisteredType. This should usually appear in the header (hpp) file as part of the class definition:
    REGISTER_SUBCLASS(MySubClass, "my-namespace")
  4. In the corresponding source (cpp) file, initialize the static member to trigger the registration using the REGISTER_SUBCLASS_IMPL macro:
    #include "MySubClass.hpp"
    // Initialize the static member to trigger registration
    REGISTER_SUBCLASS_IMPL(MySubClass)
  5. To define getter methods for lazy read access to datasets and attributes that belong to our type, we can use the DEFINE_FIELD macro. This macro creates a standard method for retrieving a ReadDataWrapper for lazy reading for the field:
    DEFINE_FIELD(getData, DatasetField, float, "data", The main data)
  6. Similarly, we use the DEFINE_REGISTERED_FIELD macro to define getter methods for other RegisteredType objects that we own, such as an ElectrodeTable that owns predefined VectorData columns:
    DEFINE_REGISTERED_FIELD(readGroupNameColumn, VectorData<std::string>, "group_name", "the name of the ElectrodeGroup")
  7. When inheriting from the more specific Container or Data types, we will typically also need to implement an initialize method, which is responsible for creating the relevant Groups, Datasets, and Attributes in the file for data write. Remember to also call the initialize method of the parent class.
Note
DEFINE_FIELD and DEFINE_REGISTERED_FIELD create templated, non-virtual read functions. This means that if we "redefine" a field in a child class by calling DEFINE_FIELD again, the function from the parent class is hidden rather than overridden. This is important to remember when casting a pointer to a base type: in that case the implementation from the base type is used, since the function created by DEFINE_FIELD is not virtual.
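The effect described in this note can be reproduced without AqNWB. The following is a minimal sketch under stated assumptions: BaseSketch, ChildSketch, and fieldPath are hypothetical stand-ins for a parent type, a child type, and a macro-generated non-virtual accessor. They are not part of the AqNWB API.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a base type whose accessor is non-virtual,
// as would be the case for a function generated by DEFINE_FIELD.
struct BaseSketch
{
  std::string fieldPath() const { return "base/data"; }
};

// The child "redefines" the accessor. Because the base function is not
// virtual, the child version hides it instead of overriding it.
struct ChildSketch : BaseSketch
{
  std::string fieldPath() const { return "child/data"; }
};

// Call through a pointer to the base type: the static type of the
// pointer decides which function is used, so BaseSketch wins.
inline std::string pathViaBasePointer()
{
  std::shared_ptr<BaseSketch> asBase = std::make_shared<ChildSketch>();
  return asBase->fieldPath();
}

// Call through a pointer to the child type: ChildSketch is used.
inline std::string pathViaChildPointer()
{
  auto asChild = std::make_shared<ChildSketch>();
  return asChild->fieldPath();
}

// Small self-check that can be called from main().
inline void checkHiding()
{
  assert(pathViaBasePointer() == "base/data");
  assert(pathViaChildPointer() == "child/data");
}
```

If the child's version is needed, the pointer has to be cast back to the child type first (e.g., with std::dynamic_pointer_cast for a polymorphic type) before calling the accessor.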

Example: Implementing a new type

MySubClass.hpp

#pragma once
// Provides RegisteredType and the registration macros (see step 1)
#include "RegisteredType.hpp"

class MySubClass : public AQNWB::NWB::RegisteredType
{
public:
  MySubClass(const std::string& path, std::shared_ptr<IO::BaseIO> io)
      : RegisteredType(path, io) {}

  DEFINE_FIELD(getData, DatasetField, float, "data", The main data)

  REGISTER_SUBCLASS(MySubClass, "my-namespace")
};

MySubClass.cpp

#include "MySubClass.hpp"
// Initialize the static member to trigger registration
REGISTER_SUBCLASS_IMPL(MySubClass)
Warning
To ensure proper function on read, the name of the class should match the name of the neurodata_type as defined in the schema. Similarly, "my-namespace" should match the name of the namespace in the schema (e.g., "core", "hdmf-common"). In this way we can look up the corresponding class for an object in a file based on the neurodata_type and namespace attributes stored in the file. A special version of the REGISTER_SUBCLASS macro, called REGISTER_SUBCLASS_WITH_TYPENAME, allows setting the typename explicitly as a third argument. This is for the special case where the name of the class cannot be the same as the name of the type (e.g., when implementing a class that doesn't have an assigned type in the schema or a class that requires template parameters that are not part of the type name). See How to implement a RegisteredType with a custom type name for details.
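To illustrate why the class name and namespace must match the values stored in the file, the following is a minimal sketch of a self-registering type registry. It assumes a registry keyed by "namespace::typename" that maps to factory functions. TypeSketch, SKETCH_REGISTER_SUBCLASS, and SKETCH_REGISTER_SUBCLASS_IMPL are hypothetical names that only mimic the general pattern; the actual AqNWB macros and RegisteredType internals may differ.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical base class; stands in for RegisteredType in this sketch.
class TypeSketch
{
public:
  using Factory =
      std::function<std::shared_ptr<TypeSketch>(const std::string&)>;

  explicit TypeSketch(const std::string& path) : m_path(path) {}
  virtual ~TypeSketch() = default;
  virtual std::string getTypeName() const = 0;
  const std::string& getPath() const { return m_path; }

  // A function-local static avoids static-initialization-order problems.
  static std::unordered_map<std::string, Factory>& registry()
  {
    static std::unordered_map<std::string, Factory> instance;
    return instance;
  }

  static bool registerFactory(const std::string& key, Factory factory)
  {
    registry()[key] = std::move(factory);
    return true;
  }

  // Look up the class from the namespace and type name read from a file.
  static std::shared_ptr<TypeSketch> create(const std::string& ns,
                                            const std::string& typeName,
                                            const std::string& path)
  {
    auto it = registry().find(ns + "::" + typeName);
    return it == registry().end() ? nullptr : it->second(path);
  }

private:
  std::string m_path;
};

// Header part: declares the static flag and derives the key from the
// class name, which is why the class name should match the type name.
#define SKETCH_REGISTER_SUBCLASS(T, NAMESPACE) \
  static bool registered_; \
  static bool registerSelf() \
  { \
    return TypeSketch::registerFactory( \
        std::string(NAMESPACE) + "::" + #T, \
        [](const std::string& path) { return std::make_shared<T>(path); }); \
  } \
  std::string getTypeName() const override { return #T; }

// Source part: initializing the static member triggers the registration.
#define SKETCH_REGISTER_SUBCLASS_IMPL(T) \
  bool T::registered_ = T::registerSelf();

class MySubClass : public TypeSketch
{
public:
  explicit MySubClass(const std::string& path) : TypeSketch(path) {}
  SKETCH_REGISTER_SUBCLASS(MySubClass, "my-namespace")
};

SKETCH_REGISTER_SUBCLASS_IMPL(MySubClass)

// Small self-check that can be called from main().
inline void checkRegistry()
{
  auto object = TypeSketch::create("my-namespace", "MySubClass", "/my/object");
  assert(object != nullptr);
  assert(object->getTypeName() == "MySubClass");
}
```

In this sketch, a lookup with a namespace or type name that was never registered returns nullptr, which is the situation a mismatched class name would produce on read.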

DEFINE_FIELD: Creating read methods for datasets and attributes

The DEFINE_FIELD macro takes the following main inputs:

  • name: The name of the function to generate.
  • storageObjectType : One of either DatasetField or AttributeField to define the type of storage object used to store the field.
  • default_type : The default data type to use. If not known, we can use std::any.
  • fieldPath : Literal string with the relative path to the field within the schema of the respective neurodata_type. This is automatically expanded at runtime to the full path.
  • description : Description of the field to include in the docstring for the docs

All of these inputs are required. A typical example will look as follows:

DEFINE_FIELD(getData, DatasetField, float, "data", The main data)

The compiler will then expand this definition to create a new method (here called getData) that will return a ReadDataWrapper for lazy reading for the field. The corresponding expanded function will look something like:

template<typename VTYPE = float>
inline std::unique_ptr<IO::ReadDataWrapper<DatasetField, VTYPE>> getData() const
{
  return std::make_unique<IO::ReadDataWrapper<DatasetField, VTYPE>>(
      m_io,
      AQNWB::mergePaths(m_path, fieldPath));
}

See Reading data for an example of how to use such methods (e.g., TimeSeries::readData ) for reading data fields from a file.
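To make the macro mechanics concrete, here is a simplified, self-contained sketch of how a macro can generate a templated getter with a default value type. All names (SKETCH_DEFINE_FIELD, LazyFieldSketch, mergePathsSketch, SeriesSketch) are hypothetical. The sketch omits the storage object type, the I/O object, and the description argument, and it only records the resolved path rather than reading any data.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper mimicking a path merge: it removes trailing "/"
// from the first part and leading "/" from the second, then joins them.
inline std::string mergePathsSketch(const std::string& path1,
                                    const std::string& path2)
{
  std::string left = path1;
  while (!left.empty() && left.back() == '/') {
    left.pop_back();
  }
  std::size_t start = 0;
  while (start < path2.size() && path2[start] == '/') {
    ++start;
  }
  return left + "/" + path2.substr(start);
}

// Hypothetical lazy wrapper: it stores where to read, but reads nothing.
template<typename VTYPE>
struct LazyFieldSketch
{
  std::string path;
};

// Generates a const member function template called `name` whose value
// type defaults to `default_type` and can be changed by the caller.
#define SKETCH_DEFINE_FIELD(name, default_type, fieldPath) \
  template<typename VTYPE = default_type> \
  std::unique_ptr<LazyFieldSketch<VTYPE>> name() const \
  { \
    return std::make_unique<LazyFieldSketch<VTYPE>>( \
        LazyFieldSketch<VTYPE> {mergePathsSketch(m_path, fieldPath)}); \
  }

class SeriesSketch
{
public:
  explicit SeriesSketch(std::string path) : m_path(std::move(path)) {}

  SKETCH_DEFINE_FIELD(getData, float, "data")

private:
  std::string m_path;
};

// Small self-check that can be called from main().
inline void checkDefineField()
{
  SeriesSketch series("/acquisition/ts/");
  assert(series.getData()->path == "/acquisition/ts/data");
}
```

Calling series.getData() uses the default type float, while series.getData<int>() selects a different value type for the same field path.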

DEFINE_REGISTERED_FIELD: Defining read methods for neurodata_type objects

The DEFINE_REGISTERED_FIELD macro works much like the DEFINE_FIELD macro but returns instances of specific subtypes of RegisteredType, rather than ReadDataWrapper. As such, the main inputs for DEFINE_REGISTERED_FIELD are as follows:

  • name: The name of the function to generate.
  • registeredType : The specific subclass of RegisteredType to use
  • fieldPath : Literal string with the relative path to the field within the schema of the respective neurodata_type. This is automatically expanded at runtime to the full path.
  • description : Description of the field to include in the docstring for the docs

All of these inputs are required. A typical example will look as follows:

DEFINE_REGISTERED_FIELD(getMyTable, DynamicTable, "my_table", My data table)

The compiler will then expand this definition to create a new read method, in this case called getMyTable, that returns a DynamicTable for reading "my_table". The corresponding expanded function will look something like:

template<typename RTYPE = DynamicTable>
inline std::shared_ptr<RTYPE> getMyTable() const
{
  std::string objectPath = AQNWB::mergePaths(m_path, fieldPath);
  if (m_io->objectExists(objectPath)) {
    return RegisteredType::create<RTYPE>(objectPath, m_io);
  }
  return nullptr;
}

DEFINE_REFERENCED_REGISTERED_FIELD: Defining read methods for references to neurodata_type objects

The DEFINE_REFERENCED_REGISTERED_FIELD macro works exactly like the DEFINE_REGISTERED_FIELD macro, but the underlying data is an attribute that stores a reference to an instance of a specific subtype of RegisteredType rather than the instance of the object directly. That is, fieldPath here is the relative path to the attribute that stores the reference, rather than the relative path of the object itself. The generated read method first resolves the reference and then returns the instance of the object that is being referenced.
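The difference between reading an owned object and reading a referenced object can be sketched with an in-memory stand-in for a file. Everything here is hypothetical (FileSketch, ObjectSketch, readOwned, readReferenced), and returning nullptr when the reference attribute is missing is an assumption of this sketch, not a statement about the behavior of the AqNWB macro.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>

// Hypothetical in-memory stand-in for a file: a set of object paths plus
// attributes whose value is the path of another object (a reference).
struct FileSketch
{
  std::set<std::string> objects;
  std::map<std::string, std::string> referenceAttributes;
};

// Hypothetical stand-in for an instance of a RegisteredType subclass.
struct ObjectSketch
{
  std::string path;
};

// Pattern of DEFINE_REGISTERED_FIELD: the path names the object itself.
inline std::shared_ptr<ObjectSketch> readOwned(const FileSketch& file,
                                               const std::string& objectPath)
{
  if (file.objects.count(objectPath) == 0) {
    return nullptr;  // the object does not exist in the file
  }
  return std::make_shared<ObjectSketch>(ObjectSketch {objectPath});
}

// Pattern of DEFINE_REFERENCED_REGISTERED_FIELD: the path names an
// attribute; the reference is resolved first, then the target is read.
inline std::shared_ptr<ObjectSketch> readReferenced(
    const FileSketch& file, const std::string& attributePath)
{
  auto it = file.referenceAttributes.find(attributePath);
  if (it == file.referenceAttributes.end()) {
    return nullptr;  // assumption: a missing reference yields nullptr
  }
  return readOwned(file, it->second);
}

// Small self-check that can be called from main().
inline void checkReferencedRead()
{
  FileSketch file;
  file.objects.insert("/general/electrodes");
  file.referenceAttributes["/acquisition/es/table"] = "/general/electrodes";
  auto target = readReferenced(file, "/acquisition/es/table");
  assert(target != nullptr);
  assert(target->path == "/general/electrodes");
}
```

Note that the object returned by readReferenced lives at the path of the referenced object, not at the path of the attribute that was passed in.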

How to implement a RegisteredType with a custom type name

In most cases, the name of our RegisteredType class should be the same as the neurodata_type. However, in some cases this may not be possible. In this case, we need to use the REGISTER_SUBCLASS_WITH_TYPENAME macro instead of REGISTER_SUBCLASS. For example, with REGISTER_SUBCLASS_WITH_TYPENAME(ElectrodeTable, "core", "DynamicTable"), the class is registered in the registry under the core::ElectrodeTable key but with "DynamicTable" as the typename value, and the automatically generated ElectrodeTable::getTypeName override returns the indicated typename instead of the class name. The main use cases for this are to implement:

  1. Templated child classes of RegisteredType where the template parameters required in C++ are not part of the neurodata_type name in NWB. An example is VectorData, which uses a template parameter to define the data type of the data that it manages.
  2. A class for a modified type that does not have its own neurodata_type in the NWB schema. An example is ElectrodesTable in NWB <v2.7, which did not have an assigned neurodata_type, but was implemented as a regular DynamicTable. To allow us to define a class ElectrodeTable to help with writing the table we can then use REGISTER_SUBCLASS_WITH_TYPENAME(ElectrodeTable, "core", "DynamicTable") in the ElectrodeTable class. This ensures that the neurodata_type attribute is set correctly to DynamicTable on write instead of ElectrodeTable.
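The separation between the class name used for the registry key and the type name reported on write can be sketched as follows. This is an illustration under stated assumptions: TypeBaseSketch, SKETCH_REGISTER_WITH_TYPENAME, and ElectrodeTableSketch are hypothetical, and the sketch only builds the key and the getters; it does not perform any registration.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class standing in for RegisteredType.
class TypeBaseSketch
{
public:
  virtual ~TypeBaseSketch() = default;
  virtual std::string getTypeName() const = 0;
  virtual std::string getNamespace() const = 0;
};

// Hypothetical macro: the registry key is built from the class name (#T),
// while the reported type name comes from the explicit third argument.
#define SKETCH_REGISTER_WITH_TYPENAME(T, NAMESPACE, TYPENAME) \
  static std::string registryKey() \
  { \
    return std::string(NAMESPACE) + "::" + #T; \
  } \
  std::string getTypeName() const override { return TYPENAME; } \
  std::string getNamespace() const override { return NAMESPACE; }

// A helper class that has no neurodata_type of its own in the schema.
class ElectrodeTableSketch : public TypeBaseSketch
{
public:
  SKETCH_REGISTER_WITH_TYPENAME(ElectrodeTableSketch, "core", "DynamicTable")
};

// Value that would be written to the neurodata_type attribute.
inline std::string typeNameOnWrite(const TypeBaseSketch& object)
{
  return object.getTypeName();
}

// Small self-check that can be called from main().
inline void checkCustomTypeName()
{
  ElectrodeTableSketch table;
  assert(ElectrodeTableSketch::registryKey() == "core::ElectrodeTableSketch");
  assert(typeNameOnWrite(table) == "DynamicTable");
}
```

The key keeps the C++ class identifiable in the registry, while files written through the class carry the schema's type name.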

Templated RegisteredType Classes

In some cases, we may want to use templated classes to handle data types in a type-safe way. AqNWB uses templated neurodata_type classes for VectorDataTyped<DTYPE> and DataTyped<DTYPE>, as these classes manage a particular dataset. Using this approach, we specify the data types to use with the class directly as part of the DEFINE_REGISTERED_FIELD macro so that the user doesn't need to manually specify the data type for read. To implement the use of templated classes for read, we can take two main approaches: Using a base class and templated child class or Using a single templated class.

Using a base class and templated child class

For the VectorData type (and Data type), AqNWB implements the VectorData class, which exposes the data as std::any via the VectorData::readData method for read. To simplify read, VectorDataTyped<DTYPE> inherits from VectorData but allows the data type to be fixed at compile time via the class template, such that VectorDataTyped<DTYPE>::readData can expose the data with the type already set at compile time.

Using this approach where we have a non-templated base class VectorData with a templated child class VectorDataTyped<DTYPE>, only the base type VectorData is being registered with the RegisteredType registry via the REGISTER_SUBCLASS_IMPL macro. This is because on read RegisteredType::create can only determine the base type based on the namespace and neurodata_type attribute stored in the file.

However, even though VectorDataTyped<DTYPE> is not being added to the RegisteredType registry, it does inherit from VectorData and, as such, a user may choose to use VectorDataTyped<DTYPE> anywhere VectorData is being used. In particular, by using VectorDataTyped<DTYPE> as part of the DEFINE_REGISTERED_FIELD macro, we can set the data type for read at compile time, simplifying read.

Note
The std::unique_ptr<..> template type is not covariant, i.e., std::unique_ptr<DerivedClass> is a distinct type from std::unique_ptr<BaseClass>, and a named std::unique_ptr<DerivedClass> is not accepted as-is where a std::unique_ptr<BaseClass> is required. That is, while VectorDataTyped<DTYPE> can be used anywhere VectorData is being used, when using std::unique_ptr<..> we cannot rely on the compiler to upcast for us automatically. Instead, we need to transfer ownership explicitly (e.g., release and upcast, or std::move) from std::unique_ptr<VectorDataTyped<DTYPE>> if std::unique_ptr<VectorData> is required. Since defining the DTYPE is primarily useful for read, we therefore typically use VectorDataTyped<DTYPE> on read while using VectorData otherwise.
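The following minimal sketch shows two ways of handing a std::unique_ptr of a derived type to code that expects a std::unique_ptr of the base type. BaseData, TypedData, and consume are hypothetical stand-ins (e.g., for VectorData and VectorDataTyped<DTYPE>); only the standard library calls are real.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical base and derived types.
struct BaseData
{
  virtual ~BaseData() = default;
  virtual int kind() const { return 0; }
};

struct TypedData : BaseData
{
  int kind() const override { return 1; }
};

// A function that takes ownership of an object via the base type.
inline int consume(std::unique_ptr<BaseData> data)
{
  return data->kind();
}

// Upcast by moving: ownership is transferred explicitly with std::move.
inline int upcastByMove()
{
  std::unique_ptr<TypedData> typed = std::make_unique<TypedData>();
  // consume(typed);  // would not compile: a named unique_ptr cannot be copied
  std::unique_ptr<BaseData> base = std::move(typed);
  assert(typed == nullptr);  // the derived pointer no longer owns the object
  return consume(std::move(base));
}

// Upcast by releasing the raw pointer and wrapping it in the base type.
inline int upcastByRelease()
{
  auto typed = std::make_unique<TypedData>();
  std::unique_ptr<BaseData> base(typed.release());
  return consume(std::move(base));
}
```

Both variants rely on BaseData having a virtual destructor, so that the object is destroyed correctly through the base pointer.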

Using a single templated class

Alternatively to the above approach, where we have two classes VectorData and VectorDataTyped<DTYPE>, we could also just use a single templated class VectorData<DTYPE> and then register only the generic version VectorData<std::any> with the type registry via the REGISTER_SUBCLASS_IMPL macro. However, there are a few additional considerations to keep in mind with this approach, which is why in AqNWB we generally recommend the above approach using two classes instead.

  1. Preparing for Registration
    REGISTER_SUBCLASS_WITH_TYPENAME(VectorData<DTYPE>, "hdmf-common", "VectorData")
    We use VectorData<DTYPE> with the template parameter because we want to prepare all possible instantiations (e.g., VectorData<int>, VectorData<double>, etc.) for registration.
  2. Actual Registration

    template<> REGISTER_SUBCLASS_IMPL(VectorData<std::any>)

    This performs the actual registration in the type system. We only register the most generic type (std::any) because in the NWB file, we only store namespace=hdmf-common and neurodata_type=VectorData (i.e., the NWB file doesn't have the notion of templates) and the generic type serves as the default registration. Note that we need to use template<> as this is a template specialization.

    Note
    In C++ the implementation of templated classes is not easily separated into .hpp and .cpp files. However, the template<> REGISTER_SUBCLASS_IMPL(VectorData<std::any>) cannot be part of the .hpp file where the class is being defined. Also, the compiler will only include the call if VectorData<std::any> is actually being instantiated. A simple instantiation of template class VectorData<std::any>; in VectorData.cpp may not be sufficient for this. As a workaround, the template<> REGISTER_SUBCLASS_IMPL(VectorData<std::any>) may need to be placed in a different .cpp file that we know is going to be built (the need for this workaround is one reason why we recommend the two-class approach in AqNWB).
  3. Template Instantiation
    In VectorData.cpp:
    template class VectorData<std::any>;
    template class VectorData<uint8_t>;
    template class VectorData<int16_t>;
    // ... other types ...
    This is an optimization that pre-instantiates all the types we expect to use and makes these instantiations part of the AqNWB library. This allows users to use these types directly and prevents the compiler from having to generate these as part of the user's code build.
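The three steps above can be sketched in a single, self-contained translation unit. This is a sketch under stated assumptions: VectorDataSketch, registrySketch, and registerKey are hypothetical, the registry is reduced to a set of keys, and the static member is written out by hand instead of being generated by the AqNWB macros. It requires C++17 for std::any.

```cpp
#include <any>
#include <cassert>
#include <cstdint>
#include <set>
#include <string>

// Hypothetical registry of "namespace::typename" keys. The function-local
// static avoids static-initialization-order problems.
inline std::set<std::string>& registrySketch()
{
  static std::set<std::string> instance;
  return instance;
}

inline bool registerKey(const std::string& key)
{
  registrySketch().insert(key);
  return true;
}

// Step 1 (prepare): a single templated class. Every instantiation reports
// the same type name, because the file has no notion of templates.
template<typename DTYPE>
class VectorDataSketch
{
public:
  static std::string typeNamespace() { return "hdmf-common"; }
  static std::string typeName() { return "VectorData"; }
  static bool registered_;  // declared for every instantiation
};

// Step 2 (register): define the static member only for the generic
// std::any specialization. template<> marks an explicit specialization.
template<>
bool VectorDataSketch<std::any>::registered_ =
    registerKey(VectorDataSketch<std::any>::typeNamespace() + "::"
                + VectorDataSketch<std::any>::typeName());

// Step 3 (instantiate): pre-instantiate the types we expect to use.
template class VectorDataSketch<std::any>;
template class VectorDataSketch<std::uint8_t>;
template class VectorDataSketch<std::int16_t>;

// Helper to query the hypothetical registry.
inline bool isRegistered(const std::string& key)
{
  return registrySketch().count(key) == 1;
}

// Small self-check that can be called from main().
inline void checkTemplatedRegistration()
{
  assert(isRegistered("hdmf-common::VectorData"));
}
```

Because everything lives in one translation unit here, the sketch does not run into the build-placement issue described in the note above; that issue only appears once the class definition and the registration are split across header and source files.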

Limitations of REGISTER_SUBCLASS_WITH_TYPENAME

The main limitation of the REGISTER_SUBCLASS_WITH_TYPENAME approach is that on read, AqNWB will use the default class associated with the neurodata_type. E.g., in the case of the ElectrodeTable class, by default the regular DynamicTable class will be used since that is what the schema indicates. Similarly, for VectorData the default VectorData<std::any> will be used on read. To support reading using the more specific types, we can use the DEFINE_REGISTERED_FIELD macro to define read methods that will return the appropriate type, e.g.:

DEFINE_REGISTERED_FIELD(readElectrodeTable,
ElectrodeTable,
ElectrodeTable::electrodeTablePath,
"table with the extracellular electrodes")

in NWBFile to read the ElectrodeTable, or

DEFINE_REGISTERED_FIELD(readGroupNameColumn,
VectorData<std::string>,
"group_name",
"the name of the ElectrodeGroup this electrode is a part of")

in the ElectrodeTable to read the group_name column as VectorData<std::string> with the data type already specified as std::string at compile time.

Testing RegisteredTypes

As with all code, it is good practice to create appropriate unit tests to validate that our new class is functioning correctly. In the case of subclasses of RegisteredType we should pay attention to:

  • Test that the registration is being executed correctly and the type has been registered with the RegisteredType type registry, e.g. via:
    SECTION("test VectorData is registered as a subclass of RegisteredType")
    {
    // check that hdmf-common::VectorData is in the registry
    REQUIRE(registry.find("hdmf-common::VectorData") != registry.end());
    }
  • Test that RegisteredType::create works as expected for reading our new type, e.g. via:
    auto readDataUntyped = NWB::RegisteredType::create(dataPath, io);
    auto readVectorData =
    std::dynamic_pointer_cast<NWB::VectorData>(readDataUntyped);
    REQUIRE(readVectorData != nullptr);
  • Test that every read method created via the DEFINE_FIELD and DEFINE_REGISTERED_FIELD macros is working as expected, e.g. via:
    // Read the "namespace" attribute via the readNamespace field
    auto namespaceData = readVectorData->readNamespace();
    std::string namespaceStr = namespaceData->values().data[0];
    REQUIRE(namespaceStr == "hdmf-common");
    // Read the "neurodata_type" attribute via the readNeurodataType field
    auto neurodataTypeData = readVectorData->readNeurodataType();
    std::string neurodataTypeStr = neurodataTypeData->values().data[0];
    REQUIRE(neurodataTypeStr == "VectorData");
    // Read the "description" attribute via the readDescription field
    auto descriptionData = readVectorData->readDescription();
    std::string descriptionStr = descriptionData->values().data[0];
    REQUIRE(descriptionStr == description);
  • Test that the initialize method is working as expected (if included).