DataNucleus performance through the releases

Performance isn’t always the primary motivation when we are developing DataNucleus, but it always remains something that we bear in mind when introducing features or rationalising the APIs. To show how performance has changed since version 3.x, here are the results of 3 “performance” tests that we have in the DataNucleus test suite.

Test 1 : Persist 2000 simple objects, and then start the timer. Then in a single thread, do the following 200000 times : get a PM, call pm.getObjectById on an id, 5 times per transaction, and close the PM. Stop the timer.

Test 2 : Persist 2000 simple objects, and then start the timer. Then in each of 10 threads, do the following 60000 times : get a PM, call pm.getObjectById on an id, 5 times per transaction, and close the PM. Stop the timer.

Test 3 : Start the timer. Get a PM, and start a transaction. Call pm.makePersistent on an A object, with a List containing 1 B object, and a single relation to a C object. Repeat the operation for 100000 times. Every 10000 objects call flush. Commit the transaction and close the PM. Stop the timer.
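The shape of Tests 1 and 2 can be sketched as a small timing harness. This is not the code from the DataNucleus test suite: LookupTimer and timeSeconds are hypothetical names, and the Runnable stands in for the "get a PM, call pm.getObjectById 5 times in a transaction, close the PM" step, which you would supply yourself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness (not the DataNucleus test code): runs unitOfWork
// `iterations` times in each of `threads` threads and returns elapsed seconds.
final class LookupTimer
{
    static double timeSeconds(int threads, int iterations, Runnable unitOfWork)
    {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try
        {
            // Start the timer only once any setup data has been persisted
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (int t = 0; t < threads; t++)
            {
                pending.add(pool.submit(() -> {
                    for (int i = 0; i < iterations; i++)
                    {
                        unitOfWork.run(); // e.g. get PM, 5 x getObjectById, close PM
                    }
                }));
            }
            for (Future<?> f : pending)
            {
                f.get(); // wait for each worker, surfacing any failure
            }
            return (System.nanoTime() - start) / 1_000_000_000.0;
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        catch (ExecutionException e)
        {
            throw new IllegalStateException("A worker failed", e.getCause());
        }
        finally
        {
            pool.shutdown();
        }
    }
}
```

Test 1 would correspond to timeSeconds(1, 200000, work) and Test 2 to timeSeconds(10, 60000, work), where work is your own lookup step.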

v5.1 : Test 1 – 10.0 secs, Test 2 – 10.0 secs, Test 3 – 10.5 secs

v5.0 : Test 1 – 10.0 secs, Test 2 – 10.0 secs, Test 3 – 13.0 secs

v4.1 : Test 1 – 10.0 secs, Test 2 – 10.0 secs, Test 3 – 13.0 secs

v4.0 : Test 1 – 10.0 secs, Test 2 – 10.0 secs, Test 3 – 13.0 secs

v3.2 : Test 1 – 16.5 secs, Test 2 – 20.5 secs, Test 3 – 13.0 secs

Versions before 3.2 were slower still, but running them would mean going back to earlier JREs, so no data is available for comparison. The timings were taken using an embedded H2 database, on an Intel Core i5 with 8GB RAM, running Linux.

So we have a significant improvement in going to v4.x in terms of allocation of PersistenceManager objects, as well as access to some common properties used by the PersistenceManager. And a further improvement in going to v5.1 in terms of persistence operations.

Clearly this is not a precise science and, as we have said in previous blog entries, a benchmark has to be representative of the operations that your application will be doing for it to mean something to you. If anyone has some simple tests that they want to see in our internal performance tests then you can easily contribute them.

 


DN v5.1 : Meta Annotations

With normal JDO or JPA usage you may have annotated a class like this

@Entity
@DatastoreId
public class Person { ... }

So we have a class that is JPA persistable, and is using the DataNucleus “datastore-identity” extension, providing a surrogate identity column in the datastore. This is fine, and easy enough. But what if you needed to put these 2 annotations on many classes? It would be simpler if you could define your own “composite” annotation that provided them more concisely. This is where we introduce meta-annotations.

If we define our own annotation like this

@Target(TYPE)
@Retention(RUNTIME)
@Entity
@DatastoreId
public @interface DatastoreIdEntity {}

You see that this annotation @DatastoreIdEntity carries both of the annotations we were using. So we can now annotate our JPA class like this

@DatastoreIdEntity
public class Person { ... }

Much simpler!

You can do the same thing with JDO annotations, as well as for annotations on fields/methods.

Note that this is new in DataNucleus v5.1. To use meta-annotations with the field/method level JDO/JPA annotations you will have to use the updated (javax.) API jars that will be provided with v5.1.
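To illustrate the idea, here is a minimal sketch of how a persistence layer could discover annotations that arrive through a composite annotation. It is not DataNucleus's actual implementation. The Entity and DatastoreId annotations below are local stand-ins declared only so the example compiles on its own, and MetaAnnotations.isPresent is a hypothetical helper.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashSet;
import java.util.Set;

// Stand-ins for the real persistence annotations (assumption, for illustration only).
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@interface Entity {}

@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@interface DatastoreId {}

// The composite annotation from the text.
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@Entity
@DatastoreId
@interface DatastoreIdEntity {}

@DatastoreIdEntity
class Person {}

final class MetaAnnotations
{
    // True if `wanted` is on `type` directly, or on any annotation that `type` carries (recursively).
    static boolean isPresent(Class<?> type, Class<? extends Annotation> wanted)
    {
        return search(type.getAnnotations(), wanted, new HashSet<>());
    }

    private static boolean search(Annotation[] annotations, Class<? extends Annotation> wanted,
        Set<Class<? extends Annotation>> visited)
    {
        for (Annotation ann : annotations)
        {
            Class<? extends Annotation> annType = ann.annotationType();
            if (annType.equals(wanted))
            {
                return true;
            }
            // `visited` guards against cycles, e.g. @Retention is itself annotated with @Retention
            if (visited.add(annType) && search(annType.getAnnotations(), wanted, visited))
            {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, MetaAnnotations.isPresent(Person.class, Entity.class) finds @Entity even though Person only declares @DatastoreIdEntity.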


DN v5.1 : Find by unique key

With JDO and JPA APIs you have the ability to find individual objects using their “identity”. This “identity” may be a field value for a single field that defines the primary key, or may be an identity object representing a composite primary key made up of multiple fields.

Some classes have other field(s) that are known to be unique for the particular class (termed in some places as “natural id”, or “unique key”), and so it makes sense to allow the user to find objects using these unique key(s). In the latest release (5.1.0 M2) we provide access to this mechanism. Be aware that this is a vendor extension, so you have to make use of DataNucleus classes, and hence your code is not portable until this is included in the JDO / JPA specs.

Let’s take an example: we have a class representing a driving license. We represent this with an identity that is a unique number. We also have a unique key that is what the driver is provided with.

With JDO the class is

@PersistenceCapable
public class DrivingLicense
{
    @PrimaryKey
    long id;

    String driverName;

    @Unique
    String number;
}

Consequently we can do as follows to get an object using its identity.

DrivingLicense license = pm.getObjectById(DrivingLicense.class, 1);

retrieving the license with id 1. If however we want to use the new retrieval via unique key we do this

JDOPersistenceManager jdopm = (JDOPersistenceManager)pm;
DrivingLicense license = jdopm.getObjectByUnique(DrivingLicense.class, 
    new String[] {"number"}, new Object[] {"ABCD-1234"});

retrieving the license with number set to “ABCD-1234”. See the JDO docs.

 

Using JPA for the same example we have

@Entity
public class DrivingLicense
{
    @Id
    long id;

    String driverName;

    @Column(unique=true)
    String number;
}

To get an object using its identity we do

DrivingLicense license = em.find(DrivingLicense.class, 1);

and to get an object using its unique key we do

JPAEntityManager jpaem = (JPAEntityManager)em;
DrivingLicense license = jpaem.findByUnique(DrivingLicense.class,
    new String[] {"number"}, new Object[] {"ABCD-1234"});

See the JPA docs.

Notes:

  1. You can have as many unique keys as you want on a class, and this mechanism will support it, unlike with Hibernate “NaturalId” where you can only have 1 per entity.
  2. You use standard JDO/JPA annotations/XML to define the unique key(s), unlike with Hibernate “NaturalId” where you have to use a vendor specific annotation.
  3. If your unique key is made up of multiple fields then you simply pass the field names as the second argument in the call, and the corresponding field values as the third argument.

@Repeatable annotations for JDO and JPA

DataNucleus now provides access to Java 8 @Repeatable annotations for use with JDO and JPA. Previously, if you wanted to specify, for example, multiple indexes for a class using annotations, you would have to do it like this (for JDO) using a container annotation

@Indices({
    @Index(name="MYINDEX_1", members={"field1","field2"}), 
    @Index(name="MYINDEX_2", members={"field3"})})
public class Person
{
    ...
}

The JDO 3.2 annotations have now been upgraded (in javax.jdo v3.2.0-m6) to support the Java 8 @Repeatable setting, meaning that you can now do

@Index(name="MYINDEX_1", members={"field1","field2"})
@Index(name="MYINDEX_2", members={"field3"})
public class Person
{
    ...
}

The same applies to all standard JDO annotations that have a container annotation.
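As a sketch of the underlying Java 8 mechanism, the example below declares a repeatable annotation together with its container and reads both styles back via reflection. The Index and Indices annotations here are local stand-ins for the JDO ones (an assumption so that the example compiles on its own), and RepeatableDemo is a hypothetical helper.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-in for the JDO @Index annotation (assumption, for illustration only).
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@Repeatable(Indices.class)
@interface Index
{
    String name();
    String[] members();
}

// The container annotation that the compiler fills in when @Index is repeated.
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@interface Indices
{
    Index[] value();
}

@Index(name="MYINDEX_1", members={"field1","field2"})
@Index(name="MYINDEX_2", members={"field3"})
class RepeatedStyle {}

@Indices({
    @Index(name="MYINDEX_1", members={"field1","field2"}),
    @Index(name="MYINDEX_2", members={"field3"})})
class ContainerStyle {}

final class RepeatableDemo
{
    // getAnnotationsByType looks through the container, so both styles give the same result.
    static int indexCount(Class<?> type)
    {
        return type.getAnnotationsByType(Index.class).length;
    }
}
```

Both RepeatedStyle and ContainerStyle report 2 indexes here, which is why metadata readers can treat the two forms identically.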

 

For JPA, the same is also now true (though since Oracle seemingly doesn’t care one iota about pushing the JPA spec forward, this is not in the official JPA spec yet). However, since DataNucleus provides its own “standard” javax.persistence jar, we have now published version v2.2.0-m1 of this jar, adding support for @Repeatable just like with the JDO 3.2 annotations. So any annotation that has a container annotation can now be repeated on a class/field/method; you just have to use v2.2.0-m1 of the DataNucleus javax.persistence jar. For example

@Entity
@NamedNativeQuery(name="AllPeople", 
    query="SELECT * FROM PERSON WHERE SURNAME = 'Smith'")
@NamedNativeQuery(name="PeopleCalledJones",
    query="SELECT * FROM PERSON WHERE SURNAME = 'Jones'")
public class Person
{
    ...
}

 


DN v5 : Multi-tenancy improvements

JDO and JPA APIs don’t define any support for multi-tenancy, other than where you want to have 1 PMF/EMF per tenant and they have their own database or schema. DataNucleus introduced support for multi-tenancy using the same schema back in v4, whereby the tables that are shared will have an extra discriminator column which specifies the tenant the row applies to, and you specify a persistence property datanucleus.tenantId for the PMF/EMF you are using, defining the tenant it is for. This is fine as far as it goes, but requires that each tenant have their own PMF/EMF. DataNucleus v5 makes this more flexible.

The first change is that you can now specify that same persistence property on the PM/EM (pm.setProperty(…), em.setProperty(…)), so you can now potentially have a PM/EM for each tenant, and the data is separated that way via the tenancy discriminator as in v4. The use-case for this is where you have a web based system and each request has a user, so you create a PM/EM, set the tenant id based on the user, and then each database access will use the appropriate tenant.

PersistenceManager pm1 = pmf.getPersistenceManager();
pm1.setProperty("datanucleus.tenantId", "John");
... // All operations under tenant "John"
pm1.close();

PersistenceManager pm2 = pmf.getPersistenceManager();
pm2.setProperty("datanucleus.tenantId", "Gary");
... // All operations under tenant "Gary"
pm2.close();

The second change is that you can optionally also specify a MultiTenancyProvider, implementing this interface

public interface MultiTenancyProvider
{
     String getTenantId(ExecutionContext ec);
}

and specify the persistence property datanucleus.tenantProvider to point to an instance of your MultiTenancyProvider class. This means that, for example, if you have some session variable that identifies the user and want to share the PM/EM across your users, then you can use this provider instance to define the tenant id for each call. This feature is likely going to be much less useful than the different tenant per PM/EM but it is there for your convenience.
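A provider for the "session variable" case might look like the sketch below. This is only one possible approach. ExecutionContext and MultiTenancyProvider are redeclared locally as stand-ins so that the example compiles on its own (in real use they come from DataNucleus), and ThreadLocalTenantProvider with its setCurrentTenant/clear methods is a hypothetical class.

```java
// Stand-ins so this sketch compiles alone; in real use these types come from DataNucleus.
interface ExecutionContext {}

interface MultiTenancyProvider
{
    String getTenantId(ExecutionContext ec);
}

// Hypothetical provider: returns whichever tenant the current thread has registered,
// e.g. set by a web filter at the start of each request.
final class ThreadLocalTenantProvider implements MultiTenancyProvider
{
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void setCurrentTenant(String tenantId)
    {
        CURRENT.set(tenantId);
    }

    static void clear()
    {
        CURRENT.remove();
    }

    @Override
    public String getTenantId(ExecutionContext ec)
    {
        String tenant = CURRENT.get();
        if (tenant == null)
        {
            // Fail loudly rather than read or write rows under the wrong tenant
            throw new IllegalStateException("No tenant set for this thread");
        }
        return tenant;
    }
}
```

The request handling code would call setCurrentTenant(...) before using the shared PM/EM and clear() afterwards, so that a pooled thread never carries over a previous user's tenant.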


DN v5 : Improved support for Enum persistence

With standard JDO and JPA you can persist a Java enum field as either the name (String-based column) or the ordinal (Number-based column). This is great as far as it goes, so we can easily persist fields using this enum.

public enum Colour{RED, GREEN, BLUE;}

In some situations however you want to configure what is stored. Say, we want to store a numeric value for each enum constant but not the default values (for some reason). In DataNucleus up to and including v4 (for RDBMS only!) we allowed persistence of

public enum Colour
{
    RED((short)1), 
    GREEN((short)3), 
    BLUE((short)5);

    private final short code;
    private Colour(short val) {this.code = val;}
    public short getCode() {return this.code;}
    public static Colour getEnumByCode(short val)
    {
        switch (val)
        {
            case 1: return RED;
            case 3: return GREEN;
            case 5: return BLUE;
            default: return null; // unknown code
        }
    }
}

and then you set metadata for each enum field of this type to specify that you want to persist the “code” value, like this

@Extensions({
    @Extension(vendorName="datanucleus", key="enum-getter-by-value", value="getEnumByCode"),
    @Extension(vendorName="datanucleus", key="enum-value-getter", value="getCode")
   })
Colour colour;

This worked, but required too much of the user.

 

DataNucleus v5 simplifies it, and you can now do

public enum Colour
{
    RED((short)1), 
    GREEN((short)3), 
    BLUE((short)5);

    private final short code;
    private Colour(short val) {this.code = val;}
    public short getCode() {return this.code;}
}

and mark each enum field like this

@Extension(vendorName="datanucleus", key="enum-value-getter", value="getCode")
Colour colour;

We don’t stop there however, because the value that is persisted using this extended method can now be either short, int or String. It is also now available for use on RDBMS, Cassandra, MongoDB, HBase, Neo4j, ODF, Excel, and JSON, so your code is more portable too!
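One way to see why a single getter can be enough: the reverse lookup can be derived by calling that getter on every constant of the enum. The sketch below shows this under that assumption. It is not necessarily how DataNucleus does it internally, and EnumValueCodec with its toDatastore/fromDatastore methods is a hypothetical helper.

```java
import java.lang.reflect.Method;
import java.util.Objects;

enum Colour
{
    RED((short) 1),
    GREEN((short) 3),
    BLUE((short) 5);

    private final short code;
    private Colour(short val) {this.code = val;}
    public short getCode() {return this.code;}
}

// Hypothetical helper, for illustration only.
final class EnumValueCodec
{
    // Value to store: the result of calling the named getter on the constant.
    static Object toDatastore(Enum<?> constant, String getterName)
    {
        try
        {
            Method getter = constant.getDeclaringClass().getMethod(getterName);
            return getter.invoke(constant);
        }
        catch (ReflectiveOperationException e)
        {
            throw new IllegalStateException("Cannot call " + getterName, e);
        }
    }

    // Reverse lookup: find the constant whose getter returns the stored value.
    static <E extends Enum<E>> E fromDatastore(Class<E> type, String getterName, Object stored)
    {
        for (E constant : type.getEnumConstants())
        {
            if (sameValue(toDatastore(constant, getterName), stored))
            {
                return constant;
            }
        }
        throw new IllegalArgumentException("No " + type.getSimpleName() + " for value " + stored);
    }

    // Numbers are compared by value, since a datastore may hand back a different numeric type.
    private static boolean sameValue(Object a, Object b)
    {
        if (a instanceof Number && b instanceof Number)
        {
            return ((Number) a).longValue() == ((Number) b).longValue();
        }
        return Objects.equals(a, b);
    }
}
```

Here EnumValueCodec.fromDatastore(Colour.class, "getCode", 5) resolves to Colour.BLUE without any getEnumByCode method on the enum.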


DN v5 : Support for Java8 “Optional”

Java 8 introduces a class (java.util.Optional) that represents either a value or null. DataNucleus v5 has a requirement of Java 8, and adds support for persisting and querying fields of this type. Let’s suppose we have a class like this

public class MyClass
{
    private Optional<String> description;
    ...
}

By default DataNucleus will persist this as if it was of the generic type of the Optional (String in this example). Consequently on an RDBMS we will have a column in the datastore of type VARCHAR(255) NULL. If the description field holds an empty Optional then it will be persisted as NULL, otherwise as the value contained in the Optional. The generic type can be a normal basic type (though not a collection, map or array), or a persistable object.
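The mapping described here amounts to unwrapping on write and wrapping on read. The following is a minimal sketch of that rule only; OptionalColumn and its two methods are hypothetical names, not DataNucleus API.

```java
import java.util.Optional;

// Hypothetical helper showing the Optional <-> nullable column rule, for illustration only.
final class OptionalColumn
{
    // Write side: an empty Optional becomes a NULL column value.
    static <T> T toColumnValue(Optional<T> field)
    {
        return field.orElse(null);
    }

    // Read side: a NULL column value comes back as an empty Optional.
    static <T> Optional<T> fromColumnValue(T columnValue)
    {
        return Optional.ofNullable(columnValue);
    }
}
```

So a description of Optional.of("hello") would be stored as the string itself, and Optional.empty() as NULL.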

JDOQL in JDO 3.2 adds support for querying the Optional field, like this

Query q = pm.newQuery(
    "SELECT FROM mydomain.MyClass WHERE description.isPresent()");

So this will return all instances of the class where the description field represents a value. Similarly we can return the value represented by the Optional field, like this

Query q = pm.newQuery(
    "SELECT description.get() FROM mydomain.MyClass");

As you can see, JDOQL makes use of standard Java method namings for accessing the Optional field.

For JPA querying, you can simply refer to the Optional field as if it was of the generic type represented.

This is part of the JDO 3.2 spec and you can make use of it in DataNucleus v5+. It is not currently part of the JPA 2.1 spec, but you can still make use of it in DataNucleus v5+ when using JPA.
