Details
Type: New Feature
Status: Resolved
Priority: Minor
Resolution: Fixed
Description
Using JCas, it has been possible to include arbitrary Java objects in the Java instance. The problem has been that there was no architected way for these objects to participate in broader UIMA interoperability concepts such as serialization and remote annotators. Furthermore, JCas objects were optional and might not be used.
UIMA V3 implements Feature Structures in the CAS as JCas objects directly, so these are now always present and reliable. This means that when an implementation adds arbitrary Java objects (e.g., a special HashSet containing Feature Structures) to a JCas class definition, they are reliably present.
Here's how we could make this all work in v3.
A user would first pick some Java class to emulate in the CAS. A requirement would be that the data in the emulated class would need to support having a serialized form representing a "snapshot" of the data at a particular moment, that could be put into the CAS using a fixed number of UIMA features of normal UIMA data types, including Feature Structures. For example, an ArrayList<FeatureStructure> could be put into the CAS as an FSArray instance of the current size; a Map<Integer, FeatureStructure> could be put into the CAS as an IntegerArray and an FSArray, etc. The snapshot would be produced whenever needed, for example, during serialization. A corresponding transformation (used, for instance, during deserialization) would convert the snapshot data back into the emulated Java class instance.
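The snapshot idea for a Map<Integer, FeatureStructure> can be sketched in plain Java. This is only an illustration under stated assumptions: MapSnapshot is a hypothetical name, String stands in for a Feature Structure, int[] stands in for a UIMA IntegerArray, and String[] stands in for an FSArray. No UIMA classes are used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not UIMA code.
// int[] stands in for an IntegerArray, String[] for an FSArray,
// and String for a Feature Structure.
final class MapSnapshot {
    final int[] keys;       // snapshot of the map keys
    final String[] values;  // snapshot of the map values, parallel to keys

    private MapSnapshot(int[] keys, String[] values) {
        this.keys = keys;
        this.values = values;
    }

    // Produce a snapshot of the map as two parallel arrays of its current size.
    static MapSnapshot save(Map<Integer, String> map) {
        int[] keys = new int[map.size()];
        String[] values = new String[map.size()];
        int i = 0;
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            keys[i] = e.getKey();
            values[i] = e.getValue();
            i++;
        }
        return new MapSnapshot(keys, values);
    }

    // Convert the snapshot back into a map, preserving the saved order.
    Map<Integer, String> restore() {
        Map<Integer, String> map = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], values[i]);
        }
        return map;
    }
}
```

The two arrays always have the same length, which is what lets a fixed number of features (here, two) represent a collection of any size.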
This new kind of hybrid object would be implemented with a custom JCas cover class which wrapped the emulated Java class instance. It would also have as features those needed for the "snapshot" representation.
The user would need to:
- define a UIMA type; this type would include the feature definitions needed for the snapshot
- create the corresponding JCas cover class for that type
- add 3 extra methods in the cover class, all defined by a new UIMA interface "UimaSerializable":
  - _init_from_cas_data()
  - _save_to_cas_data()
  - clone
The _init_from_cas_data method would use the CAS data in this Feature Structure to initialize the emulated Java class instance.
The framework would call this method whenever it makes a new instance with non-empty Feature Structure data, typically from routines such as the CAS copier and deserialization.
Similarly, the _save_to_cas_data method would be called by the framework as part of serialization, and would extract data from the emulated Java class and save it as CAS features.
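A minimal sketch of how such a cover class might pair the two methods is shown here. It makes several assumptions: the UimaSerializable interface is declared locally as a stand-in for the proposed one, FsArrayListSketch is a hypothetical class name, String stands in for a Feature Structure, and a String[] field stands in for an FSArray feature. The clone method is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Local stand-in for the proposed interface (assumption: the real one
// would be supplied by UIMA; clone is left out of this sketch).
interface UimaSerializable {
    void _init_from_cas_data();
    void _save_to_cas_data();
}

// Hypothetical cover class wrapping an emulated ArrayList.
class FsArrayListSketch implements UimaSerializable {
    // Stand-in for the FSArray feature that holds the snapshot.
    private String[] fsArray = new String[0];
    // The emulated Java object that user code works with.
    private final List<String> list = new ArrayList<>();

    // Rebuild the emulated list from the snapshot feature,
    // as the framework would request after deserialization or a CAS copy.
    @Override
    public void _init_from_cas_data() {
        list.clear();
        list.addAll(Arrays.asList(fsArray));
    }

    // Store the list's current contents in the snapshot feature,
    // as the framework would request before serialization.
    @Override
    public void _save_to_cas_data() {
        fsArray = list.toArray(new String[0]);
    }

    List<String> asList() { return list; }
    String[] getFsArray() { return fsArray; }
    void setFsArray(String[] value) { fsArray = value; }
}
```

In this sketch, user code only touches asList(); the snapshot feature is refreshed on demand, so it may be stale between calls to _save_to_cas_data.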
This Jira adds support for this approach; other Jiras will add some likely popular new types (example: FSArrayList - like ArrayList<TOP>). Users can (easily?) add types of their own, for instance, if they need a peculiar kind of Set of Feature Structures, perhaps built on top of ConcurrentSkipListSet using a special definition of set-member-equals.
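As a rough illustration of a "special definition of set-member-equals", ConcurrentSkipListSet decides membership with its comparator rather than with equals(), so a custom comparator is one way to get that behavior. Span and SpanSets are hypothetical names used only for this sketch; Span stands in for a Feature Structure with begin and end offsets.

```java
import java.util.Comparator;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical stand-in for a Feature Structure with offsets.
final class Span {
    final int begin;
    final int end;
    final String label;

    Span(int begin, int end, String label) {
        this.begin = begin;
        this.end = end;
        this.label = label;
    }
}

final class SpanSets {
    // Two spans count as the same set member when their offsets match,
    // regardless of label, because the comparator ignores the label.
    static ConcurrentSkipListSet<Span> newOffsetSet() {
        Comparator<Span> byOffsets =
            Comparator.comparingInt((Span s) -> s.begin)
                      .thenComparingInt(s -> s.end);
        return new ConcurrentSkipListSet<>(byOffsets);
    }
}
```

A cover class for such a set could then snapshot its members into an FSArray in the same way as the list case.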