In April 2009 Google announced AppEngine for Java (GAE/J), providing support for JDO and JPA. Since then people have had the chance to try it out and identify its shortcomings. This has led to many misconceptions about JDO and JPA (though mainly JDO), because people assume that what Google provides is a full and true representation of these standards; it isn’t. This blog entry attempts to correct some of these misconceptions, and to suggest some areas where Google could remedy the situation. All of the following items would help in providing a true reflection of these persistence standards and would aid GAE/J users in having (much more) portable applications.
JDO/JPA : Lack of support for many methods and operators in JDOQL/JPQL
GAE/J only supports a relatively small subset of the available methods and operators of JDOQL/JPQL. This is because the underlying datastore lacks certain query capabilities. The problem is that this gives the impression that JDOQL/JPQL are somehow weak. It would make far more sense to evaluate everything that can be evaluated in the datastore, and to set a flag at compile time recording whether the query contains any feature that the datastore query cannot handle. If the flag is set, the DataNucleus in-memory query evaluator is run on the resultant instances. This would provide a transparent interface to JDOQL/JPQL and mean that people don’t have to build custom queries just to get around GAE/J shortcomings.
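The split described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class HybridQuery, its methods, and the datastoreQuery callback are hypothetical names invented for this sketch, and none of it is the actual DataNucleus or GAE/J plugin code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch; not DataNucleus or GAE/J code.
final class HybridQuery<T> {
    private final List<Predicate<T>> pushedDown = new ArrayList<>(); // datastore can evaluate these
    private final List<Predicate<T>> residual = new ArrayList<>();   // must be evaluated in memory

    // "Compile" step: record each clause and whether the datastore supports it.
    HybridQuery<T> where(Predicate<T> clause, boolean datastoreSupported) {
        if (datastoreSupported) {
            pushedDown.add(clause);
        } else {
            residual.add(clause);
        }
        return this;
    }

    // The flag from the text: true if any clause is beyond the datastore.
    boolean needsInMemoryEvaluation() {
        return !residual.isEmpty();
    }

    // datastoreQuery stands in for the native query; it receives only the supported clauses.
    List<T> execute(Function<List<Predicate<T>>, List<T>> datastoreQuery) {
        List<T> candidates = datastoreQuery.apply(pushedDown);
        if (!needsInMemoryEvaluation()) {
            return candidates;
        }
        // Second pass: apply the unsupported clauses to the instances already fetched.
        List<T> result = new ArrayList<>();
        for (T candidate : candidates) {
            boolean matches = true;
            for (Predicate<T> p : residual) {
                if (!p.test(candidate)) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

The caller sees a single query; only the cost differs, since the residual clauses are applied to instances that have already been fetched.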
JDO : No support for input candidate collection for JDOQL
GAE/J does not support supplying a candidate collection and querying over that collection. This is a trivial thing to support, particularly since the code necessary to do it was contributed some time ago, yet it isn’t in the current plugin.
JDO : pm.getExtent doesn’t handle subclasses
The current pm.getExtent() implementation in GAE/J doesn’t support the subclasses flag. The root cause is that the underlying datastore doesn’t support a single query to retrieve a class and its subclasses. The simple solution would be to run “n” queries on the datastore, one for each of the possible subclass types, and merge the results. This would be simple to implement, and would provide correct JDO behaviour so users don’t see any shortcoming.
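A minimal sketch of the “n queries, then merge” idea follows. The class SubclassExtent and the queryOneKind callback are hypothetical stand-ins for whatever single-kind query the datastore offers; this is not the plugin’s code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; not DataNucleus or GAE/J code.
final class SubclassExtent {
    // kinds: the candidate class plus every known persistable subclass.
    // queryOneKind: stand-in for a datastore query that can only fetch one kind at a time.
    static <T> List<T> getExtent(Collection<Class<? extends T>> kinds,
                                 Function<Class<? extends T>, List<? extends T>> queryOneKind) {
        List<T> merged = new ArrayList<>();
        for (Class<? extends T> kind : kinds) {
            // One datastore query per concrete type; results are appended in order.
            merged.addAll(queryOneKind.apply(kind));
        }
        return merged;
    }
}
```

Since each instance has exactly one concrete class, the merge needs no de-duplication.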
JDO/JPA : exposing Google-specific id classes
With GAE/J there is some flexibility in what is allowable as a PK field: Long, String, and “Key”. The first two are standard classes and standard JDO. The last is environment-specific. Firstly, GAE/J ought to allow short/int/Short/Integer/long for true JDO/JPA operation so that users see no difference. Secondly, the “id” exposed to the user should be a JDO or JPA id and should be portable. When we implemented support for persisting to db4o we didn’t expose db4o’s id; instead we wrapped it internally, so the user has no unexpected classes popping up that prevent migration elsewhere later on.
JDO/JPA : one entity group per transaction
With GAE/J you have a restriction on the number of “entity groups” that can be enlisted in any transaction. OK, but why expose this to the user and restrict what they do? The logical way to do it would be to have multiple “internal” transactions for a JDO/JPA transaction, each handling a particular entity group. Since the underlying datastore doesn’t provide ACID transactions anyway there is little impact in doing this. It would then mean that users are not forced to split their persistence code apart just to get it to run, and hence that their code is portable.
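The “internal transactions” idea could look roughly like the sketch below. Everything here is hypothetical: GroupedTransaction, enlist, and the commitOneGroup callback are illustrative names, and operations are modelled as plain strings. Note that this gives no atomicity across groups: if a later group fails to commit, earlier groups stay committed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch; not DataNucleus or GAE/J code.
final class GroupedTransaction {
    // Pending operations, bucketed by the entity group they touch (insertion order kept).
    private final Map<String, List<String>> opsByGroup = new LinkedHashMap<>();

    // Called by the persistence layer whenever the user's transaction touches an object.
    void enlist(String entityGroup, String operation) {
        opsByGroup.computeIfAbsent(entityGroup, g -> new ArrayList<>()).add(operation);
    }

    // One user-level commit becomes one internal datastore transaction per entity group.
    // Returns the number of internal transactions that were committed.
    int commit(BiConsumer<String, List<String>> commitOneGroup) {
        int committed = 0;
        for (Map.Entry<String, List<String>> entry : opsByGroup.entrySet()) {
            commitOneGroup.accept(entry.getKey(), entry.getValue());
            committed++;
        }
        opsByGroup.clear();
        return committed;
    }
}
```

The user’s code touches only the single outer transaction; the bucketing by entity group is hidden inside the persistence layer.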
This term is being used in GAE/J seemingly where you have a Collection and so no real relation, although the “id” values relate to other objects. This is perfectly representable in JDO/JPA as a Collection, or Collection or even Collection. With one of these “relations” the onus is on the user to manage the relation.
Support for types persistable as String
DataNucleus has, for some time, provided a mechanism for defining how to persist a type as a String (and retrieve its value from the String) – see ObjectStringConverter in DataNucleus “core” code. GAE/J could easily provide support for this in their plugin (if a type is not natively supported then check if there is an ObjectStringConverter and use that) and this would mean that many more Java types are persistable using AppEngine.
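The fallback described above (“if a type is not natively supported then check for a converter”) might be sketched as follows. The StringConverter interface here is a local stand-in written for this example; it is not the DataNucleus ObjectStringConverter interface, and the set of “native” types is an assumption chosen only to make the sketch runnable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not DataNucleus or GAE/J code.
final class StringConverterRegistry {
    // Local stand-in for a type-to-String converter (assumed shape, not the real interface).
    interface StringConverter<T> {
        String toStoredString(T value);
        T toObject(String stored);
    }

    private final Map<Class<?>, StringConverter<?>> converters = new HashMap<>();

    <T> void register(Class<T> type, StringConverter<T> converter) {
        converters.put(type, converter);
    }

    // Assumed set of natively storable types, for illustration only.
    private static boolean isNative(Object value) {
        return value instanceof String || value instanceof Long
                || value instanceof Integer || value instanceof Boolean;
    }

    // Native values pass through; anything else is stored as a String if a converter exists.
    @SuppressWarnings("unchecked")
    Object toDatastoreValue(Object value) {
        if (isNative(value)) {
            return value;
        }
        StringConverter<Object> c = (StringConverter<Object>) converters.get(value.getClass());
        if (c == null) {
            throw new IllegalArgumentException("Unsupported type: " + value.getClass().getName());
        }
        return c.toStoredString(value);
    }

    // Reverse direction: rebuild the original type from the stored String.
    @SuppressWarnings("unchecked")
    <T> T fromDatastoreValue(Class<T> type, Object stored) {
        StringConverter<T> c = (StringConverter<T>) converters.get(type);
        if (c != null && stored instanceof String) {
            return c.toObject((String) stored);
        }
        return type.cast(stored);
    }
}
```

Registering one converter per extra type is all that is needed to make that type storable, without any change to the datastore’s own set of supported types.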
Documentation : @Persistent
In the GAE/J docs, every field has @Persistent marked against it. This is totally unnecessary; you only need @Persistent for a non-standard field type. It leads people to believe that you must specify this to get something persisted, and so when they want a field not to be persisted they just remove the annotation. Please update the docs to reflect the minimal configuration required so that we give a fair reflection of JDO and its spec. For example:
public class MyClass
Package naming “org.datanucleus.*”
This plugin is provided by Google, not DataNucleus. It’s currently packaged as “org.datanucleus.store.appengine”. This leads people to believe that DataNucleus itself is at fault for its shortcomings. This is unacceptable; we own the domains datanucleus.org/datanucleus.com. Please rename your packages ASAP.
Nowhere have we seen any attribute of the GAE/J BigTable datastore that cannot be handled by the JDO or JPA APIs. The JDO API (and metadata) in particular was designed as generic, and there is nothing in a “NoSQL” datastore that should cause it any problems with representation. We challenge anyone to define where there is such a problem area so that it can then be addressed (there’s a JIRA open on the Apache JDO project for just this situation); if you really can come up with a problem area then it’s in all of our interests to understand it and tackle it.