WAMOnline Blog


From The Top Then

Posted in Pentaho Business Intelligence Suite by bugg_tb on the August 7th, 2007

I am going to assume you’ve at least got Pentaho installed and started, if you haven’t pop over the site and pick up a copy of the installer to get you started. The first thing to make everyone aware of because it will be your life saver, is Mondrian the MDX server. Now I have to admit that when I first installed Pentaho I knew absolutely nothing about MDX, Mondrian or anything of that nature, hell I hadn’t used SQL since I was at uni a few years ago. So my boss gave me the task of bringing the 100’s of reports we create every month together into a more centralised and automated fashion, so I started looking at ways of doing this. I started by scripting a website, creating the tables and code straight in PHP, HTML and an excel generator for some proper output, I got the first few reports done but there had to be a better solution.. and on a budget. I played around with a few CMS systems to see if they fit the bill but they still involved a lot of coding, the eventual output looked better but creating the reports was going to be a chore. Then I stumbled across Pentaho, it look just like what I wanted, an open source(I’ve been using Linux for years), web based, business intelligence suite. So I got it installed and looked at the sample data, bingo I thought, but how does this all work then? I uploaded some of my data to the database and set to work trying to figure out how it all worked. I played for days trying to generate useful data just with SQL and various report types, but never getting close to what I wanted, I had picked one of the harder reports to start with but what the hell. Then I thought what about this MDX stuff I’ve been seeing whist playing around and my quest for the best reporting system began.

MDX history

MDX stands for Multidimensional Expressions it is a query language for Online Analytical Processes(OLAP) Databases and cleverly it allows relatively easy calculations using formula’s similar to spreadsheets. To my surprise it was actually invented by a guy at Microsoft, they’ve come up with a useful specification and actually seemed to keep it pretty quiet. So unsurprisingly it runs on MSSQL, MYSQL, Oracle, DB2, the usual culprits.

How it works

Well MDX with Pentaho works by creating a Cube which references tables in a main database. In this Cube you will define ‘Dimensions’, ‘Hierarchies’, ‘Levels’ and ‘Measures’ these create the levels of information that the MDX server uses to reference. So for example you may have:

<Dimension name="TIME" foreignKey="DATE_ID">
<Hierarchy name="TIME" hasAll="true" allMemberName="All time" primaryKey="DATE_ID">
<Table name="TIME"/> <Level name="YEAR" table="TIME" column="YEAR_ID" uniqueMembers="false"/>
<Level name="QUARTER" table="TIME" column="QUARTER_ID" uniqueMembers="false"/>
<Level name="MONTH" table="TIME" column="MONTH_ID" uniqueMembers="false"/>
</Hierarchy>
</Dimension>

So as you can see first we define the dimension which in this case is called ‘Time’ which references the foreign key ‘Date_ID’.

Next we define a hierarchy which in this case is also called ‘Time’.

Then we define the levels which are reasonably self expanatory these decend in diminsihing size as it quite obvious with this date example.

<Measure name="TOTAL_SALES" column="TURNOVER_EUR" aggregator="sum" datatype="Numeric" formatString="# ##0,###"/>

Lastly we can create Measures which do pretty much do as the name suggests and we use them as the core data in our tables. The name is the name you want to see in the cube, the column if the column name from the database, there are several different aggregators your can set as with datatypes and formatStrings.

But don’t worry this doesn’t all have to be coded by hand Pentaho have developed a Cube Designer which you can download and use to create and upload the cubes to your server. This is much easier than creating by hand, the only problem being at the moment your can’t reopen a saved cube for editing, so once you’ve published your cube check it to make sure its as you expect before you close the designer, because you can always make adjustments and re-upload. Once the cube is published you can always edit the xml file directly, it will be called cubename.mondrian.xml. Take a look and check out its structure.

Anyway, I’ll be back later with the basics of starting query writing.

Leave a Reply