Saturday, 12 July 2008

Primary Keys that mean something in Rails

I like Rails, I really do, but you, like me, may need to migrate a data model that doesn't use locally generated integers as the primary keys (PKs) of entities.

Rails, by default, seeds model (entity) data with unique identifiers that it uses as PKs, drawn from database-specific mechanisms (say serial columns in PostgreSQL).

Personally, I think locally generated integers used as identifiers for "things" are a poor choice, for two main reasons:

  1. They don't scale.  Your local PostgreSQL instance may just be part of a wider data architecture.  If you use locally generated PKs, as Rails will get you to, they'll clash with PKs of the same value generated locally in other DBs when synchronization with the "master" takes place.  If you don't know what I'm talking about then go look at scaling databases; it's not just a PostgreSQL issue.  If you really, really have/want/need to use an identifier, then there is a good article on using Universally Unique Identifiers (UUIDs) for IDs at GUID-as-Primary-in-Rails, although the article doesn't follow through on the how in precise detail, so we'll cover that in a later blog.
  2. They don't mean anything.  Say, for example, you have an entity type HillType, whose PK should really be a unique and meaningful name.  If you wanted to find out the details of a HillType named gnarly, you'd want to enter a RESTful-like URL of http://localhost:3000/hilltypes/gnarly.  You really wouldn't want to know the mapping between "gnarly" and some internal ID that Rails generated for you via a DB sequence.  Yes, I know there are ways of RESTfulizing the internal ID to some other attribute on the model/table, but avoid the complexity in the first place, and populate your database only with the data it actually needs, readable by people not frameworks.
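
To make the first point concrete, here is a minimal sketch in plain Ruby (no Rails, no database).  The SerialKeys class is a made-up stand-in for a DB sequence, and the two "sites" are hypothetical local databases; only SecureRandom.uuid is a real library call.

```ruby
require 'securerandom'

# Hypothetical stand-in for a database sequence: each local DB counts from 1.
class SerialKeys
  def initialize
    @last = 0
  end

  def next_id
    @last += 1
  end
end

# Two independent local databases, each generating its own integer PKs
site_a = SerialKeys.new
site_b = SerialKeys.new
serial_a = Array.new(3) { site_a.next_id }
serial_b = Array.new(3) { site_b.next_id }

# The same sites generating UUIDs instead
uuid_a = Array.new(3) { SecureRandom.uuid }
uuid_b = Array.new(3) { SecureRandom.uuid }

# Keys present on both sites would collide when synchronized to a master
serial_clashes = serial_a & serial_b
uuid_clashes   = uuid_a & uuid_b

puts "serial clashes: #{serial_clashes.inspect}"
puts "uuid clashes:   #{uuid_clashes.inspect}"
```

Every serial key appears on both sites, while the UUID lists have nothing in common, which is the whole argument for not letting each local DB count from 1.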

So ..........  After those contentious points, how do we do it?

Three steps to Rails CRUD with meaningful PKs

1) Unfortunately, I've not found a way of avoiding an integer PK in Rails, unless either you specify your migration without a PK, or you avoid the Rails meta-DDL completely and go native in your migration, as here:-

class CreateHillTypes < ActiveRecord::Migration
  def self.up
      # -------------------------------
      # This is postgreSQL specific DDL
      # -------------------------------
      execute <<-EOF
        create table public.hill_types (
            typename varchar(255) not null unique,
            description varchar(2000),
            primary key (typename)
        );
      EOF
  end

  def self.down
    drop_table "hill_types"
  end
end

2) We also need to ensure Rails "knows" we've provided a non-standard PK, so we need to amend our model:-

class HillType < ActiveRecord::Base
    set_primary_key "typename"
end

3) Next we need to amend the templated controller that script/generate scaffold HillType generated for us, to avoid attempting to mass-assign what is now a protected attribute (typename) on our model.  So, in the create method Rails makes for you by default in your model's controller, you'll need something like:

def create
  # Assign attributes one by one: typename is now the PK, so it is
  # protected and mass assignment via HillType.new(params) would skip it
  @hill_type = HillType.new
  got_details = params[:hill_type]
  @hill_type.typename = got_details["typename"]
  @hill_type.description = got_details["description"]
  respond_to do |format|
    if @hill_type.save
      flash[:notice] = 'HillType was successfully created.'
      format.html { redirect_to(@hill_type) }
      format.xml  { render :xml => @hill_type, :status => :created, :location => @hill_type }
    else
      format.html { render :action => "new" }
      format.xml  { render :xml => @hill_type.errors, :status => :unprocessable_entity }
    end
  end
end
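
If you are wondering why the explicit assignment is needed, the following is a rough illustration in plain Ruby.  ToyHillType is a made-up class, not ActiveRecord; it only mimics the one rule that matters here, namely that a protected attribute is skipped during mass assignment.

```ruby
# A toy stand-in for a model, NOT ActiveRecord itself.  It mimics one rule:
# attributes listed as protected are skipped during mass assignment.
class ToyHillType
  PROTECTED_ATTRIBUTES = ['typename']

  attr_accessor :typename, :description

  def initialize(attributes = {})
    attributes.each do |name, value|
      next if PROTECTED_ATTRIBUTES.include?(name.to_s)
      send("#{name}=", value)
    end
  end
end

params = { 'typename' => 'gnarly', 'description' => 'Steep, rocky and unforgiving' }

# Mass assignment: the protected key is silently dropped
mass_assigned = ToyHillType.new(params)
puts mass_assigned.typename.inspect
puts mass_assigned.description

# Explicit assignment, as in the create method: the key is set
explicit = ToyHillType.new
explicit.typename = params['typename']
explicit.description = params['description']
puts explicit.typename
```

With mass assignment the description arrives but the typename does not, so the save would have no PK to work with; assigning each attribute by hand avoids that.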

That is it.  Go gambol in the fields of meaningful URLs, and leave meaningless framework convenience identifiers out of your models; they are embarrassing.

Thursday, 3 July 2008

Rails, XHTML and using your own CSS styles

Got to migrate your CSS and your standard layouts to Rails? Read on ......

A Ruby on Rails (RoR) app holds three key locations you'll need to know about to re-use your imagery, your Cascading Style Sheet (CSS) look and feel, and your standard layouts.

  1. <project_name>/public/images: Put your logos and your bits and pieces in terms of iconography in here.
  2. <project_name>/public/stylesheets: Put your CSS files (holding all your standard page styling) in here.
  3. <project_name>/app/views/layouts/application.html.erb: This is where we can define our standard layout to apply for our pages.

Standard Layout

So, for example, my application.html.erb looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
    <html>
    <!-- This is a standard wrapper for all view content -->
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
        <title>Sample Rails common layout wrapper</title>
        <!-- Include all our styles as pulled from styles.css -->
        <%= stylesheet_link_tag 'styles' -%>
        <!-- Include all the standard Rails javascript includes -->
        <%= javascript_include_tag :defaults -%>
    </head>
    <body>
        <div id="wrapper">
            <!-- The page content -->
            <div id="content">
                <!-- Standard header to page -->
                <div id="header">   
                    <div style="float:left; margin-top:15px;">
                        <a href="http://timepoorprogrammer.blogspot.com">
                            <img alt="" src="/images/author.png" style="border-style: none"/>
                        </a>
                    </div>
                </div>
                <!-- The content that we'll change dynamically later -->
                <div id="dynamicPanel">
                    <!-- Yield to whatever local content Rails expects -->
                    <%= yield -%>
                </div>
                <!-- Standard footer to page -->
                <div id="footer">Copyright &copy; 2008 me</div>
            </div>
        </div>
    </body>
</html>

There are a few things here:

  1. The doc type is XHTML.  If you open this up in Firebug or in HTML Tidy from your Firefox browser (get it if you haven't already done so), they'll both tell you this is a full-on dynamic HTML page.
  2. The javascript_include_tag is embedded Ruby that ensures your pages include the standard JS files that come with RoR 2.1.
  3. The stylesheet_link_tag points off to the public/stylesheets location where you put your CSS.
  4. The example img tag points off to the public/images location, yet doesn't need the public in the actual definition.
  5. The <%= yield -%> marker tells RoR where to insert the content of the views you have defined or will be defining (under app/views/<controller_name> within the standard RoR structure).

Now go restart your Rails server, have a look at your views, and bask in the fact that they'll be using your styles and common layout.

Note: If you are puzzling over what app/views/controller_name means, and aren't sure what the names of RoR files should be for, say, controllers, views, and models, and how they relate to RoR naming conventions and URLs, go look at http://peepcode.com/products/rails-from-scratch-part-i for a particularly good introduction.

Rails and PostgreSQL for beginners

I've read a few blogs out there on using Ruby on Rails with the very excellent database PostgreSQL.  But I'd not found more than a "do the installations and off you go" kind of thing.  Given most of the online tutorials assume you click on "About your application's environment" to ensure all your Rails malarkey is working okay, I thought it sensible to tell you how.

Note: This blog assumes you have the free Aptana Studio for RadRails installed, Ruby installed on Windows, both the rails and the postgres-pr gems installed for Ruby, and PostgreSQL up and running on your local box.  Honestly, there are a load of online examples of installing Ruby, Rails, Aptana Studio, and PostgreSQL for Windows.  I'll not cover them here.

1) Start your postgresql server.

2) Create a directory.

3) Use Aptana Studio to point to this directory location, choosing postgresql as the DB of choice, and Aptana will create the basic sub-directory setup for rails you'll need. 

4) It will also start the rails server. 

5) About this time you'll be choosing to view "About your application's environment" to find out what you've got.  This will fail.

Why?  Well, when you create a default Rails project in Aptana Studio, or any other tool that creates the Rails structures, you get a default database.yml in your project's config directory, which you can use to set up your DBs with this project.

But it holds only default content, and the databases it names don't exist yet.

So ................

a) Shutdown your rails server from the Aptana Studio server's tab, or however else you do it.

b) Create a file called newdb.bat in your top level directory of your new project

Its contents should look a lot like:-

    psql -h localhost -U postgres <db\create.sql
    call rake db:migrate

This allows you to log on to postgres from the Windows command line to create the databases the project requires, as defined in an SQL batch file.

c) This SQL batch file is called create.sql and lives in your local project db directory.  It should look a lot like this, in PostgreSQL format (substitute the <your_project>, <your_username>, and <your_password> markers with your own details):-

    /* Drop and re-create the development database */
    drop database if exists <your_project>_development;
    create database <your_project>_development
        with owner = postgres
        encoding = 'UTF8'
        tablespace = pg_default;

    /* Drop and re-create the test database */
    drop database if exists <your_project>_test;
    create database <your_project>_test
        with owner = postgres
        encoding = 'UTF8'
        tablespace = pg_default;

    /* Drop and re-create the production database */
    drop database if exists <your_project>_production;
    create database <your_project>_production
        with owner = postgres
        encoding = 'UTF8'
        tablespace = pg_default;

    /* Drop the user if they exist, and re-create them */
    drop user if exists <your_username>;
    create user <your_username> with password '<your_password>';

    /* Grant the user all privileges on the databases */
    grant all privileges on database <your_project>_development to <your_username>;
    grant all privileges on database <your_project>_test to <your_username>;
    grant all privileges on database <your_project>_production to <your_username>;
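
For rake db:migrate to reach these databases, your database.yml presumably has to name the same databases and user.  The following is only a sketch of how the entries might look, held in a Ruby string so it can be parsed and checked; myproject, myuser, and mypassword are placeholders I've made up, and the host setting is an assumption about a local install.

```ruby
require 'yaml'

# Hypothetical database.yml content.  myproject, myuser and mypassword are
# placeholders; use the names you substituted into create.sql.
database_yml = <<YAML
development:
  adapter: postgresql
  database: myproject_development
  username: myuser
  password: mypassword
  host: localhost
test:
  adapter: postgresql
  database: myproject_test
  username: myuser
  password: mypassword
  host: localhost
production:
  adapter: postgresql
  database: myproject_production
  username: myuser
  password: mypassword
  host: localhost
YAML

config = YAML.load(database_yml)

# Each environment should name the database that create.sql made for it
config.each do |environment, settings|
  puts "#{environment}: #{settings['database']} (#{settings['adapter']}) as #{settings['username']}"
end
```

Parsing the text like this is just a quick way of catching indentation slips before Rails trips over them; the YAML itself is what would go in the file.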

6) Go to the Windows command line.  Navigate to where your newdb.bat file lives, and run it from there.  This will set up three project databases for you in PostgreSQL, one for development, one for test, and one for production.

7) If you have installed PostgreSQL properly, you can check the DB content from your PgAdmin console.  With Rails 2.1.0 this will create a table called schema_migrations in each of your three databases.  We'll come back to this soon once we've got tables to make.

8) Refresh your project in Aptana Studio or whatever.  Now restart your server from the server list tab, or however you do it, and THEN click (or in my case double click) on "About your application's environment".

Hey Presto, you will now be presented with the environment, including the development database in which your project resides.

You are now up and running with postgresql and rails.

Monday, 19 May 2008

Finalising WebSVN setup on Windows XP with Cygwin

You've got your WebSVN up and running, but you've not got all the features.  Why not? This is because the UI depends on a few programs that usually come with Linux/Unix distributions, and you'll need to make these available and turn these on for Windows XP.

So, the final step on Windows XP with WebSVN is to ensure all the tools are available, and here Cygwin will help you out.

I reckon Cygwin is something you should probably have on your Windows box anyway, as it allows you to jump between a Unix-like environment and the Windows environment, doing the meaningful stuff as the problem dictates.  Any decent programmer should not need to choose between either flavour of OS tools; just set your box up for both.

You can pull Cygwin from http://www.cygwin.com/, and the key to setting it up so it has the tools WebSVN needs is making sure you know how to use the setup installer to pick the right packages.

WebSVN relies (for its full features) on the following programs being available to the web app users via its file websvn/include/config.php:

  1. diff: With this we can enable the full diff functionality for versions of files.
  2. tar and GZip: With these two we can enable the tarball functionality that allows web users to get a ZIP/TAR of your code.
  3. enscript and sed: With these two we can enable syntax highlighting on the code in the repositories.  The type of files is defined in websvn/include/setup.php.  You might (like I did) add a couple of entries here for JSP and XML files (say fix them to type HTML), as otherwise these types will be interpreted as plain text, which is not great, although this is not something you have to do to get this working.  It's a finishing touch.

Okay. From the cygwin setup program you get when you download and install it, you specify the set of packages/programs you'll want to include or add to your new or current Cygwin setup.  You'll need to include:

  1. The package diffutils you'll find in the utils section of the cygwin setup program once launched.
  2. The packages sed, tar, and gzip you'll find in the base section of the cygwin setup.
  3. The package enscript you'll find in the text section of the cygwin setup.

Okay, that's them done.  Clicking thru the cygwin setup will ensure these are installed.

Once done, you'll need to configure WebSVN to use these tools.

Go edit websvn/include/config.php under your htdocs location, and uncomment the appropriate lines to point to where these tools now reside.  My local file looks like this (it depends on where you installed cygwin too of course):

// We are not a linux box so we need to use
// cygwin toolkit exes
 
$config->setDiffPath("C:\\cygwin\\bin");
$config->setTarPath("C:\\cygwin\\bin");
$config->setGZipPath("C:\\cygwin\\bin");

// For syntax colouring, if option enabled...
$config->setEnscriptPath("C:\\cygwin\\bin");
$config->setSedPath("C:\\cygwin\\bin");

and make sure you enable the download feature by uncommenting:

$config->allowDownload();

and enable the colourisation/syntax highlighting of files by uncommenting:

 $config->useEnscript();

Now go startup the svn service, and startup apache from your XAMPP console (see previous blogs).

Go view the results, as now you'll have syntax highlighted, downloadable, and diff-able code at your local repo home http://localhost/websvn.

Done.

Wednesday, 14 May 2008

Pretty-up your SVN repositories via the web with XAMPP

Note: Much of this is taken from the great blog entry at http://turnleft.inetsolution.com/2007/07/how_to_setup_subversion_apache_1.html, but with an XAMPP slant, so apologies for any repetition here.

No time? Read on:

1) Get and install XAMPP 1.6.3a as it comes with Apache 2.2.x.  This is big enough to be the topic of a separate thread.  But you shouldn't go far wrong if you follow the guide at http://www.apachefriends.org/en/xampp-windows.html.

2) Augment the XAMPP version of apache 2.2 with the Subversion specific libraries needed, available in the file svn-win32-1.4.6.zip you can get at http://subversion.tigris.org/servlets/ProjectDocumentList?folderID=91&expandFolder=91&folderID=74.  Do the following:

    a) Stop apache from your XAMPP control panel

    b) XAMPP comes with invalid mod_dav_svn.so and mod_authz_svn.so modules in xampp\apache\modules; replace them with the correct ones you'll find in the zip.

    c) XAMPP may come with invalid DLLs for svn.  Just in case, replace xampp\apache\bin\libdb44.dll and xampp\apache\bin\intl3_svn.dll with the ones you'll find in the zip.

3) Now, configure your repository for web access via Apache.

    a) Edit C:\xampp\apache\conf\httpd.conf and add:

        Include conf/extra/httpd-subversion.conf

    b) Create httpd-subversion.conf in the extra subdirectory.
    c) Populate it with details of your repository, and ensure the amended modules get loaded:

        LoadModule authz_svn_module modules/mod_authz_svn.so
        LoadModule dav_svn_module modules/mod_dav_svn.so

        <Location /svn/prototypes>
            DAV svn
            SVNPath c:/svn/prototypes
            AuthType Basic
            Options FollowSymLinks
            order allow,deny
            allow from all
            AuthName "prototypes"
            AuthUserFile c:/svn/passwords
            Require valid-user
        </Location>

4) Create the common web access password file for the repositories, and add a user.

    <path_to_htpasswd_under_xampp> -cb <path_to_svn_root_directory_less_drive_handle> <username> <password>

    e.g.

    c:\xampp\apache\bin\htpasswd -cb \svn\passwords whoever weRst194UUd

5) Check how we are doing by viewing the repositories over WebDAV: start Apache again from the XAMPP control panel, and view the repository at http://localhost/svn/<repository_name>.  You will now need to use the user name and password to access your repository.

Note: So we are talking two password files: one under C:\<svn_root>\<repository_name>\conf\passwd for configuring SVN repository users who do checkouts etc. on a particular repository, and one in C:\<svn_root>\passwords, generated with htpasswd as above, for all the online users who can browse repositories.

6) But wait, what about the pretty-up bit you promised? For this you'll need to stop using straight WebDAV to display your repositories, as it's ugly.

    a) Note the "extra" stuff we added to the httpd-subversion.conf:-

        Options FollowSymLinks
        order allow,deny
        allow from all

    This is for our choice of repository web front end, WebSVN 2.0. 

    b) Ensure XAMPP Apache has the full PHP support needed for our choice of repository web front end.

        i) You can get PHP from http://www.php.net/downloads.php; go for the 5.2.6 Windows installer, as it's got the fixes missing in the build a few days earlier.

        ii) During install, select the Apache 2.2.x Module that allows the installer to update your httpd.conf file for XAMPP with the appropriate settings.

    c) Download the most recent ZIP package of WebSVN 2.0 from http://websvn.tigris.org/servlets/ProjectDocumentList. Unpack the files into xampp\htdocs and rename to websvn

    d) Finish the job by configuring WebSVN.  Rename xampp\htdocs\websvn\include\distconfig.php to config.php, and tell WebSVN it's dealing with a Windows host, and where the original svn "root" location under which all your repositories live happens to be, so uncomment and amend the entries:

        $config->setServerIsWindows();
        $config->parentPath("c:\\<path_to_your_svn_root>");

    e) Restart Apache, and you should now be able to access http://localhost/websvn to see all the repositories under your svn root using the nice display from the WebSVN people.

Done.

Tuesday, 13 May 2008

Subversion on Windows XP

This entry is if you need to setup a subversion code repository for managing your code on a Windows XP box:


1) Get the EXE to install Subversion (svn-1.4.6-setup.exe) and run it.

2) Issue the repository creation commands from the Windows command line wherever you want to make your repository:
a) mkdir svn
b) svnadmin create C:\<path to svn>\repos to create the fundamental repository structure

3) Create the subversion Windows service

sc create svn binPath= "\"C:\Program Files\Subversion\bin\svnserve.exe\" --service -r C:\<path to svn>" DisplayName= "Subversion Server" depend= Tcpip start= auto

4) Start the subversion Windows service, either from the command line with "net start svn" or from the services listing in Control Panel.

5) Go and remove the anonymous access from the repository conf directory file C:\<path to svn>\repos\conf\svnserve.conf, uncomment these lines and change anon-access to none:

anon-access = none
auth-access = write
password-db = passwd
realm = My Subversion Repository

6) Add some users who can get at the repository, by editing C:\<path to svn>\repos\conf\passwd, in the format:

username = userpassword 

7) Check the service is working from some directory from the windows command line tool using the command:

svn checkout svn://localhost/repos

It should ask you for your password, and will create your local copy of repos.

Done.

Friday, 9 May 2008

Easy Blogging

Need a quick way to post your blog?

No time to write custom HTML that fits the blog provider templates?  No time to battle with, say, the blogger.com posting editor's occasional habit of inserting HTML where you don't want it, or saving when you don't want it to?  No time to figure out what fonts, layout, and setup you need?

No problem.

Download the latest version of Windows Live Writer. It will walk you through the install process, asking you where your blog is, picking up the styles and layout it needs from blogger.com.

This will configure the Weblog settings.

Next open Live Writer, write your entry, and hit Publish.

Done.

Tuesday, 6 May 2008

How to JPA outside your JEE container: Part 1: The Basics

1) Introduction

Over the past 10 years, numerous persistence frameworks have been developed for Java.  Each of these frameworks has led to various implementations, each of which has addressed particular issues well, but this stew of activity has resulted in many different and divergent paths to solve what is a common problem: persisting data to back-end storage.

The Java Persistence API (JPA) attempts to standardize this multiplicity. 

Implementations of the JPA standard are already available from a variety of providers, including Sun (Glassfish JPA), BEA (Kodo), Oracle (TopLink), RedHat/JBoss (Hibernate EntityManager), IBM Websphere 6.1 (EJB3 service pack since December 2007), Tangosol Coherence (Cache Store), OpenJPA, and JPOX, to name a few.

This blog entry is the first in a series.  It covers the basics of JPA, ready for the next entries, which will cover implementation of JPA outside a JEE container, and the final entry, which will cover caching and database HA with the JPA.

2) Key Concepts

There are several key concepts to cover when using JPA to persist and retrieve data.  This section briefly discusses each in turn.

Skip this blog entry if you know your JPA already.

All of the concepts identified here can be read up on in detail in EJB3 in Action chapters 6 to 11, and in extreme detail in the whole of Java Persistence with Hibernate.

The Entity Manager

A Java application using the JPA does not interact with data storage directly via JDBC connections; instead, the main interface any layer above the JPA uses is the EntityManager, as defined at http://java.sun.com/javaee/5/docs/api/javax/persistence/EntityManager.html.

The EntityManager interface is used to support Create, Retrieve, Update, and Delete (CRUD) data actions and custom query actions on object data.

Probably the clearest description of the purpose of the methods on an EntityManager instance is in Chapter 9 Manipulating Entities with EntityManager of EJB3 in Action.

Here, we’ll use example code to clarify the concepts this reference discusses.

Here is example usage of an (out-of-container, namely outside JBoss) EntityManager instance.  Here an EntityManager is used to save the details of a Book:

    void addBook(Book a_Book) throws MyException
    {
        EntityTransaction currentTrans = null;
        final EntityManager em = super.getEntityManager();
        try
        {
            // Persist the new Book inside a resource-local transaction
            currentTrans = em.getTransaction();
            currentTrans.begin();
            em.persist(a_Book);
            currentTrans.commit();
        }
        catch (Exception anyE)
        {
            // currentTrans is still null if getTransaction() itself failed
            if (currentTrans != null && currentTrans.isActive())
            {
                currentTrans.rollback();
            }
            // errorMessage is assumed to be defined by the enclosing class
            throw new MyException(errorMessage, anyE);
        }
    }

Within the example, the call to the method persist is used when adding new data via the EntityManager instance.

Which EntityManager method is applicable for what purpose needs to be explained by reference to the life-cycle of a typical JEE Entity.  This is identified in the next section.

Note: For JavaDoc for the persistence API look under the package javax.persistence at http://java.sun.com/javaee/5/docs/api/ .  

Entities

An application views or traverses data in data storage indirectly, via abstractions known as JEE Entities.

Annotations

JEE Entities and the relationships between them are defined by annotations applied within Java classes, and these Entities are used to map to the relational schema of the data storage.

From the example code in the previous section, the book entity (Book.java) is defined by the following code, which exhibits common annotations which you will see repeated throughout the entity definitions within the example application:

package com.whatever.example.dao.entities;

import com.whatever.dao.UniqueIDGenerator;
import java.util.Date;
import java.util.HashSet;
import java.util.Set;
import javax.persistence.CascadeType;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;
import javax.persistence.Table;

/**
* Book Entity
*
* @author jball
*/
@Entity
@Table(name = "book", schema = "public")
public class Book implements java.io.Serializable
{
    /** Size of DATE field */
    private static final int SIZE_OF_DATE_FIELD = 29;
    /** Size of BIG field */
    private static final int SIZE_OF_BIG_FIELD = 2000;
    /** Book UUID */
    private String m_Bookuuid;
    /** Category */
    private Category m_Category;
    /** Library */
    private Library m_Library;
    /** title */
    private String m_Title;
    /** author */
    private String m_Author;
    /** published */
    private Date m_Published;
    /** description */
    private String m_Description;
    /** on loan records */
    private Set<OnLoan> m_OnLoans = new HashSet<OnLoan>(0);

    /**
     * Default empty constructor
     */
    public Book()
    {
    }
    /**
     * Kitchen sink constructor.  Note the entity makes its own
     * identifier which is a distributed storage safe UUID.
     *
     * @param a_Title title
     * @param a_Author author
     * @param a_Published published date
     * @param a_Description description
     * @param a_Category category
     * @param a_Library library
     */
    public Book(String a_Title,
                String a_Author,
                Date a_Published,
                String a_Description,
                Category a_Category,
                Library a_Library)
    {
        this.m_Bookuuid    = UniqueIDGenerator.getID();
        this.m_Title       = a_Title;
        this.m_Author      = a_Author;
        this.m_Published   = a_Published;
        this.m_Description = a_Description;
        this.m_Category    = a_Category;
        this.m_Library     = a_Library;
    }
    /**
     * Get the PK
     *
     * @return PK
     */
    @Id
    @Column(name = "bookuuid", unique = true, nullable = false)
    public String getBookuuid()
    {
        return this.m_Bookuuid;
    }
    /**
     * Set the PK
     *
     * @param a_Bookuuid PK
     */
    public void setBookuuid(String a_Bookuuid)
    {
        this.m_Bookuuid = a_Bookuuid;
    }
    /**
     * Get the book category
     *
     * @return Book Category
     */
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "categoryname", nullable = false)
    public Category getCategory()
    {
        return this.m_Category;
    }
    /**
     * Set the Book Category
     *
     * @param a_Category Book Category
     */
    public void setCategory(Category a_Category)
    {
        this.m_Category = a_Category;
    }
    /**
     * Get the book's library
     *
     * @return Library that holds book
     */
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "libraryuuid", nullable = false)
    public Library getLibrary()
    {
        return this.m_Library;
    }
    /**
     * Set the book's library
     *
     * @param a_Library Library
     */
    public void setLibrary(Library a_Library)
    {
        this.m_Library = a_Library;
    }
    /**
     * Get the book title
     *
     * @return Book title
     */
    @Column(name = "title", nullable = false)
    public String getTitle()
    {
        return this.m_Title;
    }
    /**
     * Set the book title
     *
     * @param a_Title book title
     */
    public void setTitle(String a_Title)
    {
        this.m_Title = a_Title;
    }
    /**
     * Get the book author
     *
     * @return book author
     */
    @Column(name = "author", nullable = false)
    public String getAuthor()
    {
        return this.m_Author;
    }
    /**
     * Set the book author
     *
     * @param a_Author Book Author
     */
    public void setAuthor(String a_Author)
    {
        this.m_Author = a_Author;
    }
    /**
     * Get the published date
     *
     * @return published date
     */
    @Column(name = "published", nullable = false,
            length = SIZE_OF_DATE_FIELD)
    public Date getPublished()
    {
        return this.m_Published;
    }
    /**
     * Set published date
     *
     * @param a_Published published date
     */
    public void setPublished(Date a_Published)
    {
        this.m_Published = a_Published;
    }
    /**
     * Get the book description
     *
     * @return book description
     */
    @Column(name = "description", nullable = false,
            length = SIZE_OF_BIG_FIELD)
    public String getDescription()
    {
        return this.m_Description;
    }
    /**
     * Set the book description
     *
     * @param a_Description book description
     */
    public void setDescription(String a_Description)
    {
        this.m_Description = a_Description;
    }
    /**
     * Get the on loan details for the book
     *
     * @return on loan details
     */
    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY,
               mappedBy = "book")
    public Set<OnLoan> getOnLoans()
    {
        return this.m_OnLoans;
    }
    /**
     * Set the on loan details
     *
     * @param a_OnLoans on loan details
     */
    public void setOnLoans(Set<OnLoan> a_OnLoans)
    {
        this.m_OnLoans = a_OnLoans;
    }
}

There are several key features in this code:

  • The @Entity annotation marks the class definition as a JEE Entity which can be managed via the JPA EntityManager.
  • The @Table annotation is not mandatory.  It is suggested you use it, as it ensures any RDBMS table creation actions the JPA EntityManagerFactory may perform on your behalf (see the persistence unit section below) result in the correct mapping between the object layer and the relational layer.
  • The @Id annotation is used to identify the unique identifier or primary key for an instance of entity type Book.  In our example code here, the code uses a custom UUID generator for the primary key.  This is a by-product of our example also being used for a multi-database instance example illustrating HA-JDBC failover, which will be described in a later blog.  There are many strategies that can be used to generate the unique ID: write your own custom code, get the JPA provider to do it for you, or rely on the background data storage to do it for you (say via a database sequence).

Note: The annotations that can be used for ID generation are discussed in some detail in section 4.2.3 Database Primary Keys of Java Persistence with Hibernate.  Make sure you choose an ID generation strategy that scales and de-couples the JPA from the DB implementation properly; we reckon the less dependent the method is on a particular database implementation the better, as otherwise you can end up in danger of producing a JPA persistence layer that only works with one type of database, which misses the point of the JPA.

  • Every attribute you want stored against an entity needs a setter and a getter method.  These follow the usual Java naming convention, namely get<Attribute> and set<Attribute>, where <Attribute> is the attribute name starting with an upper case letter.  You can leave it to the JPA to decide what the actual database column names should be but, in our pedantic example, the @Column annotation spells out every database column characteristic we consider important.
  • The relationship between one entity and another can be expressed via the annotations @OneToOne, @OneToMany, @ManyToOne, and @ManyToMany.  The @OneToMany annotation in the example code above indicates there is a relationship between a Book entity and an OnLoan entity, where a book may have been on loan to many different members over any particular time frame.  Be advised that there are many different argument permutations allowable for these relationship annotations.  In the example here, the arguments to the annotation indicate the cascade type is ALL, and the fetch type is LAZY.  ALL means every operation on a Book (persist, merge, remove, and refresh) is cascaded to its OnLoan records, so if we delete a Book entity, the JPA will cascade delete any OnLoan records that refer to the Book.  LAZY means the JPA will only issue the SQL required to retrieve OnLoan records when an application holding the Book object actually invokes the getOnLoans method.
  • All the JPA entities should implement Serializable to avoid “between-tier” transport issues in multi-tier multi-box applications.
  • All the JPA entities should contain a default empty constructor.  We’ve discovered that this constructor is used behind the scenes by the Hibernate JPA implementation to help maintain the correct entity content.  Without it, entities do not get properly hydrated by the Hibernate JPA provider on data retrieval.
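
The points in the list above can be pulled together in a small sketch.  It is plain Java with the JPA annotations left as comments, so that it compiles without a provider on the class path.  The class name BookSketch and its two attributes are hypothetical stand-ins for the real Book entity, and java.util.UUID is an assumption standing in for the custom UUID generator mentioned above, which is not shown in this post.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.UUID;

// Hypothetical stand-in for an entity class.  The JPA annotations appear only
// as comments so that the sketch compiles without a provider on the class path.
// @Entity
class BookSketch implements Serializable
{
    private static final long serialVersionUID = 1L;

    private String m_BookUuid;
    private String m_Title;

    // Default empty constructor, which the provider uses behind the scenes.
    public BookSketch()
    {
    }

    // Convenience constructor: the key is generated in the application,
    // here with java.util.UUID (an assumption, see the text above).
    public BookSketch(String a_Title)
    {
        this.m_BookUuid = UUID.randomUUID().toString();
        this.m_Title = a_Title;
    }

    // @Id
    public String getBookUuid()
    {
        return this.m_BookUuid;
    }

    public void setBookUuid(String a_BookUuid)
    {
        this.m_BookUuid = a_BookUuid;
    }

    // @Column(name = "title", nullable = false)
    public String getTitle()
    {
        return this.m_Title;
    }

    public void setTitle(String a_Title)
    {
        this.m_Title = a_Title;
    }

    // Copies an instance through Java serialization, roughly what happens
    // when an entity is passed between tiers.
    static BookSketch copyViaSerialization(BookSketch a_Book)
    {
        try
        {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes))
            {
                out.writeObject(a_Book);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())))
            {
                return (BookSketch) in.readObject();
            }
        }
        catch (IOException | ClassNotFoundException anyE)
        {
            throw new IllegalStateException("Serialization failed", anyE);
        }
    }

    public static void main(String[] args)
    {
        BookSketch book = new BookSketch("An example title");
        BookSketch copy = copyViaSerialization(book);
        System.out.println(copy.getBookUuid().equals(book.getBookUuid()));
    }
}
```

Because the key is generated in the application rather than by a database sequence, the same code behaves identically whichever database sits underneath, which is the de-coupling the note above argues for.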

Lifecycle

All Entities within the JPA architecture have a life-cycle, best described in section 9.1: The persistence lifecycle in EJB3 in Action.  This life-cycle is what explains the methods on an EntityManager instance.  We'll cover it really briefly here, but please refer to the reference documentation for detail.

Entities start in a New state, and this state represents their condition just after construction.
When an entity is passed to the persist method on an EntityManager it moves to the Managed state, and the data representation of it is scheduled to be added to data storage. Managed entities are also returned from an EntityManager that executes either a query or the find operation.

In this managed state, an application can navigate the relationships expressed in the entity method signatures, so for example the call getOnLoans on an object of type Book would result in the JPA retrieving the OnLoan records for the book.
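
What deferred retrieval amounts to can be sketched in plain Java.  This is not how Hibernate implements it (the provider substitutes its own collection wrappers); LazySet, LazyFetchDemo and the loan values below are hypothetical, and the counter merely stands in for the SQL that would be sent to the database.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyFetchDemo
{
    // Counts how often the simulated "SQL" is run.
    static final AtomicInteger QUERY_COUNT = new AtomicInteger();

    // Builds a lazy holder for the loans of one book; nothing is loaded yet.
    static LazySet<String> onLoansForBook()
    {
        return new LazySet<String>(() ->
        {
            QUERY_COUNT.incrementAndGet();
            return new HashSet<String>(Arrays.asList("loan-1", "loan-2"));
        });
    }

    public static void main(String[] args)
    {
        LazySet<String> onLoans = onLoansForBook();
        System.out.println("Queries before access: " + QUERY_COUNT.get());
        onLoans.get();
        onLoans.get();
        System.out.println("Queries after two accesses: " + QUERY_COUNT.get());
    }
}

// Hypothetical holder that defers loading until the data is first requested.
class LazySet<T>
{
    private final Supplier<Set<T>> m_Loader;
    private Set<T> m_Loaded;   // stays null until the first call to get()

    LazySet(Supplier<Set<T>> a_Loader)
    {
        this.m_Loader = a_Loader;
    }

    Set<T> get()
    {
        if (this.m_Loaded == null)
        {
            this.m_Loaded = this.m_Loader.get();
        }
        return this.m_Loaded;
    }
}
```

The loader runs on the first call to get() and never again, which mirrors the behaviour described above: the OnLoan records are only fetched when getOnLoans is actually used.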

When an entity moves outside of the scope of the EntityManager that is looking after it (which can occur if the EntityManager has been closed, the entity has been passed between tiers in a multi-tier application, or the entity has been passed between threads) the entity will move to the Detached state. 

You can amend the content of a detached entity by calling its setters, but the changes then need to be merged back into back-end storage.  Our example code always uses the merge operation when amended entity data is posted back to data storage, which covers the detachment scenarios above.  It is suggested you follow the same pattern to avoid "detached entity passed to persist" errors from the JPA provider layer.

The merge operation schedules the content of an entity to be updated in data storage.
An Entity can be removed using the remove method, which schedules the persistence provider to delete the representation in the data storage.
Note: The phrase "scheduled to" in the descriptions of add, update, and delete is deliberate.  The actual operations are performed by the Hibernate implementation of the JPA when the persistence context is flushed, which happens when you commit a transaction on the EntityManager or when the provider decides to flush.
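
The transitions described in this section can be summarised as a tiny state machine.  The sketch below is a hypothetical model for illustration only: the JPA does not expose these states as a type, the REMOVED name is ours, and the real rules have more cases than the four transitions covered in the text (merge, for instance, actually returns a managed copy rather than changing the instance passed in).

```java
// Hypothetical model of the lifecycle described above.  The JPA does not
// expose these states as a type, and the real rules have more cases than the
// four transitions sketched here.
enum EntityState
{
    NEW, MANAGED, DETACHED, REMOVED;

    // EntityManager.persist: a new entity becomes managed.
    EntityState persist()
    {
        return move(NEW, MANAGED, "persist");
    }

    // Leaving the scope of the EntityManager: managed becomes detached.
    EntityState detach()
    {
        return move(MANAGED, DETACHED, "detach");
    }

    // EntityManager.merge: detached content is brought back under management.
    EntityState merge()
    {
        return move(DETACHED, MANAGED, "merge");
    }

    // EntityManager.remove: a managed entity is scheduled for deletion.
    EntityState remove()
    {
        return move(MANAGED, REMOVED, "remove");
    }

    // Allows exactly one source state per action; anything else is rejected.
    private EntityState move(EntityState a_From, EntityState a_To,
                             String a_Action)
    {
        if (this != a_From)
        {
            throw new IllegalStateException(
                "Cannot " + a_Action + " an entity in state " + this);
        }
        return a_To;
    }

    public static void main(String[] args)
    {
        EntityState state = NEW.persist().detach().merge().remove();
        System.out.println(state);   // prints REMOVED
    }
}
```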

The Persistence Unit

The Java classes defining JEE Entities need to be grouped together into their own JAR, which is known as a persistence unit.  An application can have many persistence units, and each persistence unit can be configured independently to use different back end storage.

It is sensible to organize your persistence unit to cover a particular relational data-model or object domain model that expresses the data scope of your solution.  There is usually no point in breaking up a single logical data model into multiple persistence units.

Each persistence unit JAR keeps its classes at the top level of the JAR.

A persistence unit also needs a configuration file called persistence.xml that defines the entity content of the persistence unit and the way in which the persistence unit relates to the relational data model.  In our example code, the hibernate.hbm2ddl.auto setting of update in META-INF/persistence.xml ensures that the application creates the database schema it needs from the JPA entity annotations if said schema is missing, or updates and uses the one already there.  So JPA gives you a way of creating schema in any database without writing the DDL.

Here is the file for our sample persistence unit:

<persistence>
    <persistence-unit name="libraryDAO"
                      transaction-type="RESOURCE_LOCAL">
        <!-- Add one entry per Entity class in the JAR -->
        <class>com.example.dao.entities.Book</class>
        <class>com.example.dao.entities.Category</class>
        <class>com.example.dao.entities.Library</class>
        <class>com.example.dao.entities.Member</class>
        <class>com.example.dao.entities.OnLoan</class>
        <class>com.example.dao.entities.OnLoanId</class>
        <!-- Hibernate properties -->
        <properties>
            <property name="hibernate.ejb.cfgfile"
                      value="hibernate.cfg.xml"/>
            <property name="hibernate.hbm2ddl.auto" value="update"/>
        </properties>
    </persistence-unit>
</persistence>

The transaction-type attribute defines how transactions are managed for the unit.  Ours is RESOURCE_LOCAL because it does not make use of the Java Transaction API (JTA), which could have been used if this persistence unit had been deployed under JBoss.
Our persistence unit doesn’t need a JEE container to run, so doesn’t make use of JTA.  RESOURCE_LOCAL tells the JPA provider that the application manages transactions itself through the EntityTransaction interface, on top of the default JDBC transaction model.

The hibernate.ejb.cfgfile setting allows us to get the JPA provider to obtain the variable part of the configuration of the persistence unit (say the connection details) from a file that lives in the class path of the application outside the persistence unit JAR boundary.  This file is called hibernate.cfg.xml and is discussed later.

Transactions

All entity adding/deleting/altering actions need to be encapsulated in transactions within the JPA.  Within our outside-a-container implementation in the example code, the transactions bound all the operations performed by an EntityManager instance.  So, repeating a code snippet previously given, the calls on currentTrans (begin, commit, and rollback) represent the transaction management for an add operation:


    public void addBook(Book a_Book) throws DAOException
    {
        EntityTransaction currentTrans = null;
        final EntityManager em = super.getEntityManager();
        try
        {
            currentTrans = em.getTransaction();
            currentTrans.begin();
            em.persist(a_Book);
            currentTrans.commit();     
        }
        catch (Exception anyE)
        {
            // currentTrans is still null if getTransaction() itself failed
            if (currentTrans != null && currentTrans.isActive())
            {
                currentTrans.rollback();
            }
            throw new DAOException(errorMessage, anyE);
        }
    }

As can be seen, this transaction management code represents a significant part of the code, which unfortunately is the case when you use the JPA outside a JEE container. 

Outside a JEE container you are responsible in your own code for identifying the beginning of a unit of work with a transaction.begin statement, and indicating when this unit of work is complete with a transaction.commit or transaction.rollback statement.
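
One way to keep that boilerplate in a single place is a small helper that wraps any unit of work.  The sketch below is plain Java and is not part of our example code: the Tx interface is a hypothetical stand-in exposing the same four methods the snippet above calls on EntityTransaction, FakeTx exists only so the sketch runs without a database, and TxTemplate is an invented name.

```java
class TxTemplate
{
    // Runs the unit of work inside a transaction; rolls back on any failure
    // and rethrows, so the caller still sees the original exception.
    static void inTransaction(Tx a_Tx, Runnable a_Work)
    {
        try
        {
            a_Tx.begin();
            a_Work.run();
            a_Tx.commit();
        }
        catch (RuntimeException anyE)
        {
            if (a_Tx.isActive())
            {
                a_Tx.rollback();
            }
            throw anyE;
        }
    }

    public static void main(String[] args)
    {
        FakeTx tx = new FakeTx();
        inTransaction(tx, () -> System.out.println("unit of work runs here"));
        System.out.println("commits=" + tx.m_Commits
                           + ", rollbacks=" + tx.m_Rollbacks);
    }
}

// Hypothetical stand-in for the begin/commit/rollback/isActive methods of
// javax.persistence.EntityTransaction, so the sketch runs without a JPA jar.
interface Tx
{
    void begin();
    void commit();
    void rollback();
    boolean isActive();
}

// In-memory fake that records what happened to it.
class FakeTx implements Tx
{
    boolean m_Active;
    int m_Commits;
    int m_Rollbacks;

    public void begin()
    {
        this.m_Active = true;
    }

    public void commit()
    {
        this.m_Active = false;
        this.m_Commits++;
    }

    public void rollback()
    {
        this.m_Active = false;
        this.m_Rollbacks++;
    }

    public boolean isActive()
    {
        return this.m_Active;
    }
}
```

With the real API the same shape would take the result of em.getTransaction() in place of the Tx argument, and each DAO method would shrink to the one line of work it actually performs.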

Complex Queries

In most cases, you can navigate the object model of the data returned from an EntityManager.find operation to get hold of the data you may need from the object’s get methods. 

Sometimes you’ll need more power than this interface offers.  To get more power, you need to create custom queries, which can be defined in the query language JPQL.

These custom queries can form part of the entity classes themselves, or can be in a file.  We chose the file route as it then means we know where to go to alter our queries without altering code.

Custom queries are scoped to a particular persistence unit by defining an orm.xml file embedded inside the persistence unit at the same directory location as the persistence.xml.

Here is the orm.xml for our example code:

<?xml version="1.0" encoding="UTF-8"?>
<entity-mappings xmlns="http://java.sun.com/xml/ns/persistence/orm"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://java.sun.com/xml/ns/persistence/orm http://java.sun.com/xml/ns/persistence/orm_1_0.xsd"
                 version="1.0">
    <named-query name="getAllBooks">
        <query>
            <![CDATA[
                from Book b order by b.title asc
            ]]>
        </query>
    </named-query>
    <named-query name="getBookByTitle">
        <query>
            <![CDATA[
                from Book b
                where b.title = :bookTitle
            ]]>
        </query>
    </named-query>
    <named-query name="getAllCategories">
        <query>
            <![CDATA[
                from Category c order by c.categoryname asc
            ]]>
        </query>
    </named-query>
    <named-query name="getAllLibraries">
        <query>
            <![CDATA[
                from Library l order by l.name asc
            ]]>
        </query>
    </named-query>
    <named-query name="getAllMembers">
        <query>
            <![CDATA[
                from Member m order by m.fullname asc
            ]]>
        </query>
    </named-query>
    <named-query name="getMemberByName">
        <query>
            <![CDATA[
                from Member m
                where m.fullname = :memberName
            ]]>
        </query>
    </named-query>
    <named-query name="getAllLoans">
        <query>
            <![CDATA[
                from OnLoan ol order by ol.member.fullname asc
            ]]>
        </query>
    </named-query>
    <named-query name="getLoansForMember">
        <query>
            <![CDATA[
                from OnLoan ol
                where ol.id.memberuuid = :memberID
            ]]>
        </query>
    </named-query>   
    <named-query name="getLoansForLibrary">
        <query>
            <![CDATA[
                from OnLoan ol
                where ol.member.library.libraryuuid = :libraryID
            ]]>
        </query>
    </named-query>      
</entity-mappings>

Each query identified within the file is of type named-query.  This is useful insofar as you can amend the caching properties of these queries to prevent them from hitting the database each time they are run, and because they are presented as named queries, the EntityManagerFactory will pre-compile them on initial startup for speedy usage.

The CDATA section allows us to embed unruly characters (such as < and &) in the query statements if we need to, without causing XML processing errors.

Each query statement uses the Entity class names rather than the database table names, and uses the attribute names defined by get<Attribute> and set<Attribute> on the Entity classes instead of table columns.  Make sure the first character of the attribute names used in the orm.xml is lower case, unlike in the access methods on the Entity class.  Note that the queries above start at the from clause, which Hibernate accepts as a shorthand from its own HQL; strict JPQL requires an explicit select clause (for example select b from Book b), so add one if you want the queries to be portable across JPA providers.
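
The mapping from accessor name to the attribute name used in a query can be sketched in a few lines of plain Java.  The class and method names below are hypothetical; the rule itself follows the usual JavaBeans convention (lower-case the first character, unless the first two characters are both upper case), which we assume is what the provider applies.

```java
class AttributeNames
{
    // Derives the attribute name used in a query from a getter name,
    // e.g. getTitle -> title.
    static String fromGetter(String a_MethodName)
    {
        if (!a_MethodName.startsWith("get") || a_MethodName.length() == 3)
        {
            throw new IllegalArgumentException(
                "Not a getter name: " + a_MethodName);
        }
        String name = a_MethodName.substring(3);
        // JavaBeans convention: names starting with two capitals are kept.
        if (name.length() > 1
            && Character.isUpperCase(name.charAt(0))
            && Character.isUpperCase(name.charAt(1)))
        {
            return name;
        }
        return Character.toLowerCase(name.charAt(0)) + name.substring(1);
    }

    public static void main(String[] args)
    {
        System.out.println(fromGetter("getTitle"));     // prints title
        System.out.println(fromGetter("getOnLoans"));   // prints onLoans
    }
}
```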

A query statement can use placeholders for variable data, an example placeholder being :libraryID in the last named query in the file.  Your DAL code should take these placeholders as arguments where it needs to.

Both Java Persistence with Hibernate and EJB3 in Action have whole chapters specifically on the JPQL language.  For an online guide try the complete and excellent example provided by BEA at http://edocs.bea.com/kodo/docs41/full/html/ejb3_langref.html

External Configuration

Any persistence unit should be configurable via external configuration, to prevent us from having to rebuild a persistence unit when it needs to be tweaked to a new database location or new database flavour.  In our example code the file hibernate.cfg.xml performs this task. 

When the EntityManagerFactory for a particular Data Access Layer (DAL) is started up, the persistence.xml refers out to the hibernate.cfg.xml.  Within our sample application, the following hibernate.cfg.xml content is used:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-configuration PUBLIC
    "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
    <hibernate-configuration>
        <session-factory>
            <property name="hibernate.show_sql">true</property>
            <property name="hibernate.format_sql">true</property>
            <property name="hibernate.connection.driver_class">
                org.postgresql.Driver
            </property>
            <property name="hibernate.connection.url">
                jdbc:postgresql://localhost:5432/libraries
            </property>
            <property name="hibernate.connection.username">postgres</property>
            <property name="hibernate.connection.password">hello123</property>
            <property name="hibernate.dialect">
                org.hibernate.dialect.PostgreSQLDialect
            </property>
            <property name="hibernate.c3p0.min_size">5</property>
            <property name="hibernate.c3p0.max_size">20</property>
            <property name="hibernate.c3p0.max_statements">50</property>
            <property name="hibernate.c3p0.timeout">1800</property>
            <property name="hibernate.cache.provider_class">
                org.hibernate.cache.EhCacheProvider
            </property>
            <property name="hibernate.cache.use_query_cache">true</property>
        </session-factory>
    </hibernate-configuration>

The show_sql attribute can be set to false for production systems and true for development systems.  With this set to true, you will see the SQL sent to the database in response to JPA queries and actions on the console of the JVM; format_sql pretty-prints that output.

The connection attributes specify the URL, username, password, and driver to use to connect.  These can be JBoss data source definitions, straightforward JDBC definitions, or HA-JDBC clusters for client-side DB HA, which we’ll cover in a future blog.

The c3p0 attributes are only used for connection pooling outside a JEE container.  When the unit is configured for JBoss execution, no pool definition is needed here, because the pooling is done within JBoss via its data-source configuration.  Outside a container, the stand-alone Hibernate JPA implementation uses c3p0 for pooling.

The cache.provider_class attribute is used to inform Hibernate of a second-level caching implementation at the JVM process level.  This technique is used to stop the JPA implementation from hitting the database unless the data has changed, or is no longer in the second-level cache.
The cache.use_query_cache attribute informs Hibernate to turn on named-query caching for any suitably annotated query.

We’ll discuss caching as a feature in a follow-on blog entry to this one.

Conclusions

For those of you without an in-depth knowledge of the JPA, we’re hoping this blog entry has given you enough basic information and confidence to start investigating the JPA for suitability on your projects.

In the next blog entry in the series we’ll cover implementing the ideas covered here for out-of-container usage of the JPA, specifying tools, tips, and hopefully pointing you in the right direction.

In the final blog in the series we’ll look at caching and database HA with the JPA.

Jim