XML to POJO via Groovy

Transforming XML data into Java objects is a pretty common task, and many technologies support it. JAXB is one example. It is also not uncommon for the structure of the original XML to differ from that of the Java objects. The obvious choice for XML transformation in such a situation would be XSLT.
There is another approach that uses Groovy to transform XML directly into a POJO structure in one step. I will show you what this looks like.

Let's assume we have an XML file like this:

<data>
  <person>
    <firstname>Otto</firstname>
    <lastname>Mueller</lastname>
    <street>Lange Strasse</street>
    <city>Berlin</city>
  </person>
  <financial>
    <netincome>120000</netincome>
  </financial>
</data>

The target structure consists of two POJOs that look like this:
[Figure: Target Objects]

The two structures differ, and some small conversions are needed. The XML contains the yearly income, whereas the POJOs need the monthly income. The POJOs also contain a field for the zip code, which the XML does not deliver.
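
As a rough sketch, the two target POJOs might look like the following in Java. This is a hypothetical reconstruction inferred from the properties that the transformation code further down sets; the field types are assumptions (String for the text fields, BigInteger for monthlyIncome, following the listing).

```java
import java.math.BigInteger;

// Hypothetical sketch of the address POJO; field types are assumptions.
class Address {
    private String street;
    private String city;
    private String zipCode; // not delivered by the XML, has to be looked up

    public String getStreet() { return street; }
    public void setStreet(String street) { this.street = street; }

    public String getCity() { return city; }
    public void setCity(String city) { this.city = city; }

    public String getZipCode() { return zipCode; }
    public void setZipCode(String zipCode) { this.zipCode = zipCode; }
}

// Hypothetical sketch of the person POJO; field types are assumptions.
class Person {
    private String firstName;
    private String lastName;
    private Address address;
    private BigInteger monthlyIncome; // the XML carries the yearly value

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }

    public Address getAddress() { return address; }
    public void setAddress(Address address) { this.address = address; }

    public BigInteger getMonthlyIncome() { return monthlyIncome; }
    public void setMonthlyIncome(BigInteger monthlyIncome) { this.monthlyIncome = monthlyIncome; }
}
```

Plain getters and setters are all Groovy needs: an assignment such as person.firstName = ... is routed to the matching setter.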

So, here is the Groovy code that does the whole transformation:

public class XmlTransformer
{

  Person transform(File xml)  //1
  {
    def xmlData = new XmlSlurper().parse(xml)   //2

    Person person = new Person()

    person.firstName = xmlData.person.firstname   //3
    person.lastName = xmlData.person.lastname

    Address address = new Address()
    address.city = xmlData.person.city
    address.street = xmlData.person.street
    address.zipCode = ZipCodeFinder.find(xmlData.person.city.toString(),   //4
            xmlData.person.street.toString())
    person.address = address

    BigInteger yearlyIncome = new BigInteger(xmlData.financial.netincome.toString())
    person.monthlyIncome = yearlyIncome.divide(12)  //5

    return person
  }

}

What happens in detail (numbers on the list match the numbers in code comments):

  1. This is a simple Groovy class that contains one method called transform. This method takes the XML file as input and returns a Person object. The Person class itself is defined in Java.
  2. XmlSlurper is the Groovy XML parser that allows easy access to all XML elements. For more details, see the XmlSlurper documentation.
  3. This uses some Groovy magic. On the left side we assign the value to the property firstName of the person object, which is a shortcut for calling person.setFirstName(). On the right side we see the slurper at work, fetching the XML value for us.
  4. We can call any Java class from within the transformation code. Here we call an example helper class that would return a zip code for a given city and street.
  5. Finally, we also do a calculation within the transformation to convert the yearly income to a monthly value.
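
Steps 4 and 5 can be sketched in plain Java. The listing only names ZipCodeFinder.find(city, street), so the implementation below (a small lookup table with a placeholder zip code) is purely hypothetical, and IncomeConverter is an illustrative name that does not appear in the original code.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: only the call ZipCodeFinder.find(city, street) is known.
class ZipCodeFinder {
    private static final Map<String, String> ZIP_CODES = new HashMap<>();

    static {
        // Placeholder entry for illustration, not a verified postal code.
        ZIP_CODES.put(key("Berlin", "Lange Strasse"), "00000");
    }

    private static String key(String city, String street) {
        return city + "|" + street;
    }

    // Returns the zip code for a city and street, or null if it is unknown.
    static String find(String city, String street) {
        return ZIP_CODES.get(key(city, street));
    }
}

// Illustrative name; mirrors the calculation in step 5.
class IncomeConverter {
    private static final BigInteger MONTHS_PER_YEAR = BigInteger.valueOf(12);

    // Integer division: any remainder is discarded.
    static BigInteger toMonthly(BigInteger yearlyIncome) {
        return yearlyIncome.divide(MONTHS_PER_YEAR);
    }
}

class TransformationStepsDemo {
    public static void main(String[] args) {
        System.out.println(ZipCodeFinder.find("Berlin", "Lange Strasse"));
        System.out.println(IncomeConverter.toMonthly(new BigInteger("120000"))); // 10000
    }
}
```

Note that BigInteger.divide truncates, so a yearly income that is not a multiple of 12 loses its remainder.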

Groovy can be called from Java in several ways. If you do not need to change the transformation often, I recommend simply compiling the Groovy code. In that case it is called from Java as if it were Java code:

XmlTransformer transformer = new XmlTransformer();
Person person = transformer.transform(inputFile);

This kind of transformation code can easily be embedded into a Java project. The Groovy approach has several advantages:

  • Just one step for transformation including conversions.
  • Easy debugging when using an IDE like IntelliJ IDEA.
  • Java-like syntax, no learning of XSLT required.
  • Easy code-structuring. The transformation can use all features of a dynamic object-oriented language.
  • Simple unit-testing. If you split the code into several methods, you can test each logical unit using established frameworks like JUnit or TestNG.

So what is your opinion? What are your current methods of transforming XML to POJOs?

  1. Rayk says:

    Off topic, but: It’s great you rearranged your tags, seems helpful for the usability of this site. POJO baby, behave!

  2. strug says:

    hi jörg,
    nice to see that you use 2 space indent as well :) sorry, off topic.

    do you know if there is something similar for exporting our objects into xml?

    do you know if this solution works with very large documents as well? it looks like the xml is loaded into memory as a whole.

    one alternative we currently use: we take an xsd and generate the java files with jaxb. the xml is then automatically mapped into these interface objects. after that, you need to map from interface object to domain object.

  3. Joerg says:

    @Rayk
    Yes, WordPress behaves differently from Delicious. I just had to learn it :)

    @strug
    The mechanism for export is the XML builder. Pretty elegant too, but maybe that is for another article.

    As for the memory consumption, I have not tried it myself, but XmlSlurper is supposed to be more memory efficient than normal XML parsers. It uses some lazy initialization and reads only what the GPath expressions ask for.

  4. Rayk says:

    Me again, now on topic: It’s a pity that I need to work with ABAP, where nothing like this is available or even possible. XML transformation is done the old-school way, but on single object level. So if there is any nesting we need to resolve it on our own. What happens beforehand is that some black box converts an XML message into a deep structure, which we convert into rows for tables. No objects there.

    I have a little problem with the sketched solution. From a modeling perspective I would create a second transform method which creates address objects, so I could maybe create alternate delivery addresses for orders etc. But how to pass the file content to avoid re-reading the whole thing? Another method to encapsulate file opening from file processing? I agree with strug that one could run into memory issues here.

    Regards
    Rayk

    PS: I really L O V E the free yWorks editor

  5. Joerg says:

    @Rayk,

    Yes, I would split the whole method too. I just did not do it for demo purposes. If you want to pass the xml, just pass the xmlData Object. This is just a pointer to the slurper and passing it will not cause any re-read.

    The memory issue is interesting. I will have to try it.

  6. Rayk says:

    @Jörg: That it’s slim for demo purposes, right. But what is the implication if you do it properly? Have a file-open wrapper around every object transformation method? I’m off programming, but could you let me know a smart and simple solution to that without producing masses of code?

  7. Joerg says:

    @Rayk

    To move the address code to another method I would simply refactor it like (Pseudocode follows)

    —————————————————
    Person transform(File xml)
    {
      def xmlData = new XmlSlurper().parse(xml) // only done once

      Person person = new Person()
      // do the person stuff

      Address address = doAddressStuff(xmlData) // pass the parsed data, not the file
      person.address = address
      // do the rest of the person stuff
      return person
    }

    Address doAddressStuff(xmlData)
    {
      Address address = new Address()
      // do the address stuff like in source example
      return address
    }
    —————————————————

    In that example, Address is part of the object tree that needs to be returned. If the transformation ends in more than one root object, we could let transform return a Map instead of the root object.

    The parse call to the slurper happens only once in any case.

  8. By the way, a very minor comment, you could do yearlyIncome / 12 instead of divide(12) :-)

  9. Joerg says:

    Hi Guillaume,

    Damn. You found it. I thought I could trick everybody :)
    In fact I did try the approach with yearlyIncome / 12. Unfortunately this resulted in a BigDecimal of a different scale, which means they are not equal and my test failed. When using .divide(12) the scale was correct. So I did it this way and hoped that nobody would find out.

    Do you have any idea why .divide(12) and /12 behave differently?
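
On the question of why .divide(12) and / 12 can give results that do not compare equal, one possible explanation can be sketched in Java, under the assumption that the test compared BigDecimal values with equals. BigDecimal.equals treats two values with different scales as different, while compareTo ignores the scale. The class and method names are only illustrative, and the sketch does not reproduce Groovy's / operator itself.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class ScaleDemo {
    // Integer division on BigInteger, as in the article's listing.
    static BigInteger monthly(BigInteger yearly) {
        return yearly.divide(BigInteger.valueOf(12));
    }

    // equals() requires the same numeric value AND the same scale.
    static boolean sameValueAndScale(BigDecimal a, BigDecimal b) {
        return a.equals(b);
    }

    // compareTo() looks at the numeric value only.
    static boolean sameValue(BigDecimal a, BigDecimal b) {
        return a.compareTo(b) == 0;
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("10000");    // scale 0
        BigDecimal b = new BigDecimal("10000.00"); // scale 2

        System.out.println(sameValueAndScale(a, b)); // false
        System.out.println(sameValue(a, b));         // true
        System.out.println(monthly(new BigInteger("120000"))); // 10000
    }
}
```

If the scale is the only difference, comparing with compareTo instead of equals would let such a test pass regardless of which division is used.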
