One of the most exciting software releases towards the end of 2013 is Pentaho 5.0 CE (Community Edition), which was rolled out on November 18th, 2013. While the EE (Enterprise Edition) was released a couple of months prior, the CE version has always been my favorite, both to work on and especially for the community around it, which is always full of new (and wonderful) ideas and actually has the brain power to realize them. Truly one of the most interesting Open Source communities.
As a BI Consultant, I had several requests to review this new release, so without further ado, let’s take a look.
As usual, I got the zip files for each of the Pentaho BI Suite components from here. This is what my folder looked like when I was done downloading:
- pad-ce-5.0.1-stable.zip – Pentaho Aggregation Designer (missing as of the time of this review, no idea where it went)
- psw-ce-3.6.1.zip – Pentaho Schema Workbench
- biserver-ce-5.0.1-stable.zip – Pentaho BI Server
- pdi-ce-5.0.1-stable.zip – Pentaho Data Integration (Kettle and Spoon)
- pme-ce-5.0.1-stable.zip – Pentaho Metadata Editor
- prd-ce-5.0.1-stable.zip – Pentaho Report Designer
Unzipping any of these zip files will “install” the component. Simple as that.
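If you have several of these archives to unpack, the step can be scripted. Here is a minimal sketch in Python, assuming all the downloaded zip files sit in one folder. The function name install_archives and both directory arguments are my own illustration and are not part of any Pentaho tooling.

```python
import zipfile
from pathlib import Path


def install_archives(download_dir, install_dir):
    """Extract every *.zip found in download_dir into install_dir.

    Returns the names of the archives that were extracted.
    Both directories are hypothetical; point them wherever you like.
    """
    install_dir = Path(install_dir)
    install_dir.mkdir(parents=True, exist_ok=True)

    extracted = []
    for archive in sorted(Path(download_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(install_dir)
        extracted.append(archive.name)

    # Python's zipfile does not restore Unix permission bits, so mark
    # the launcher scripts (*.sh) as executable again.
    for script in install_dir.rglob("*.sh"):
        script.chmod(script.stat().st_mode | 0o111)

    return extracted
```

Calling install_archives("downloads", "pentaho5") would unpack everything under pentaho5. A plain unzip on the command line does the same job; the sketch is only handy when you repeat this on several machines.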
I haven’t had the time to look at PAD or PME, so we’ll review these in the future. For now let’s start with PRD.
The new Pentaho Report Designer has a very convenient and useful item on the Wizard, which you see when you start the report-designer.sh (or .bat on Windows) script. It’s called “What’s New”.
It’s basically a report that we can Preview, and it lists all the new features in this 5.0 release; very handy for reading about the improvements. What especially piqued my interest is that they seem to have improved the creation of interactive HTML reports, which can now serve links to other reports within Pentaho. Maybe a new way to serve content that sits somewhere between reports, dashboards, and wizard pages.
The popular Kettle (or Spoon, or PDI) increases the number of steps, including one that I have been waiting for: OpenERP Input and Output. Speaking of OpenERP, I need to contribute the custom OpenERP step that we developed last year to the community.
I’m also eager to try out the MongoDB steps, as I have started to use MongoDB for our projects. I’ll have more to say about these two wonderful tools in upcoming articles. These two are big enough to have their own reviews.
Pentaho BI Server
But the biggest changes are truly visible in the Pentaho BI Server itself. After unzipping biserver-ce-5.0.1-stable.zip, go into the biserver-ce directory and run ./start-pentaho.sh if you are on UNIX or start-pentaho.bat if you are on Windows.
When you start the BI Server from this location, the start scripts set the memory allocation and other environment parameters to more reasonable values than the ones that come by default with Apache Tomcat.
After starting the server, wait for a while. If you are familiar with Tomcat's logging features, on UNIX you can follow the log with:
tail -f biserver-ce/tomcat/logs/catalina.out
This lets you see whether the server started correctly or failed with errors. On Windows, use the Tomcat Start/Shutdown application to see the logs. When the log stops scrolling, bring up a browser (on the same computer) and go to http://localhost:8080/pentaho if you use the default settings. You should see:
Yes, that’s what our version of the Pentaho User Console (PUC) looks like after a couple of customization steps:
- Change the login image:
– user@server:~/pentaho5/biserver-ce/pentaho-solutions/system/common-ui/resources/themes/crystal/images$ mv ~/your-own-similarly-sized-image.jpg ./login-crystal-bg.jpeg
- Change the Pentaho logo to our own (nextCoder):
– user@server:~/pentaho5/biserver-ce/pentaho-solutions/system/common-ui/resources/themes/images$ mv ~/your_logo.png puc-login-logo.png
- Change the wording of the Login page:
– user@server:~/pentaho5/biserver-ce$ vi tomcat/webapps/pentaho/jsp/PUCLogin.jsp
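The mv commands above overwrite the stock files. If you would rather keep the originals around so you can roll back, here is a small Python sketch of one way to do it. The helper name replace_with_backup and the ".orig" suffix are my own choices for illustration, not a Pentaho convention, and unlike mv it copies your file instead of moving it.

```python
import shutil
from pathlib import Path


def replace_with_backup(new_file, target):
    """Copy new_file over target, keeping the original as <target>.orig.

    The backup is written only once, so running this repeatedly never
    overwrites the stock file with an already customized one.
    Returns the path of the backup file.
    """
    new_file, target = Path(new_file), Path(target)
    backup = target.with_name(target.name + ".orig")

    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)

    shutil.copy2(new_file, target)
    return backup
```

For example, replace_with_backup("your_logo.png", "puc-login-logo.png"), run from the images folder shown above, would leave a puc-login-logo.png.orig next to the new logo.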
Gone is the usual ‘joe’ user, replaced by ‘admin’ with the same default password, ‘password’. Use these to log in and you’ll be greeted by the Home screen:
Again, with some modifications, you can tailor the Home screen to suit your purposes. In this case the customization step is:
- Change the content of Home
– beruin@yamato:~/pentaho5$ vi ./biserver-ce/tomcat/webapps/pentaho/mantle/home/content/welcome/index.html
As you may notice, gone are the compartmentalized panes of the old PUC, replaced by a much better-flowing, minimalistic layout with plenty of white space.
Another paradigm switch is the central navigation (it says ‘Home’ in the above screenshot). When you click on it, a dropdown is displayed showing the available modes. The Home-mode is what you see above; next is the Browse File-mode, which looks like this:
This is another departure: from the file-based pentaho-solutions repository to a JCR-based one. What is JCR? Java Content Repository is a database-based content (files) repository specification that is implemented by, among others, the Apache Jackrabbit project, which is the one Pentaho uses here.
What does this all mean to users? Being a database-based repository has its advantages: better control of metadata and versioning of the files without sacrificing ease of use. It also means that we have to use a plugin if we want to synchronize this repository with a file system so that we can version-control our files on our own. It remains to be seen whether this switch will bear fruit down the line.
One question for the Pentaho team: Why can’t I select multiple files and do some actions with them?
Speaking of plugins, which are a source of productivity in the platform, the next mode we’ll talk about is the Marketplace-mode:
In version 4.8 and before, we had to install plugins such as Saiku, CDE, CDA, CDF, etc. manually, either by using the ctools-installer.sh script or by unzipping files into the right folders and hoping it would work.
The new Marketplace-mode provides a more organized way to manage plugins and their versions. Although you still have to restart the server manually after installing or upgrading a plugin, it is miles ahead of the old way. More importantly, just a month or two after the release, we have started to see plugins written by developers outside of Pentaho, which is wonderful and in line with the spirit of the community.
Next up is the Opened-mode, which keeps at hand all of the files we are working on (whether editing or just viewing). This mode is somewhat similar to the new Microsoft Office paradigm (starting with Office 2010).
The Scheduled-mode is an improved user interface for scheduling ETL runs:
A new feature is the ability to define block-out times, during which scheduled ETL will not run. This is useful for scheduled downtime or maintenance of the host servers.
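To make the block-out idea concrete, here is a small Python sketch of the logic as I understand it. This is not Pentaho's scheduler code or API; the function names and the representation of a window as a (start, end) pair are assumptions of mine, purely for illustration.

```python
from datetime import datetime


def is_blocked_out(run_time, windows):
    """Return True if run_time falls inside any [start, end) window.

    windows is a list of hypothetical (start, end) datetime pairs.
    """
    return any(start <= run_time < end for start, end in windows)


def runnable_times(scheduled, windows):
    """Keep only the scheduled times that fall outside every window."""
    return [t for t in scheduled if not is_blocked_out(t, windows)]


# Example: a maintenance window from Saturday 22:00 to Sunday 02:00.
maintenance = [(datetime(2014, 1, 4, 22, 0), datetime(2014, 1, 5, 2, 0))]
nightly = [datetime(2014, 1, 4, 23, 0), datetime(2014, 1, 5, 3, 0)]

# The 23:00 run lands inside the window and is dropped; 03:00 survives.
print(runnable_times(nightly, maintenance))
```

The window is treated as half-open, so a job scheduled exactly at the end of the maintenance window is allowed to run. How the real scheduler treats that boundary is something I have not checked.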
The last mode is the Administration-mode:
This is the answer to “Where is PAC?” The old Pentaho Administration Console is gone; it is now reborn as this mode. I can’t tell you how many times (with the previous version) I got raised eyebrows or dumbfounded looks when I had to explain that you have to run another server just to create a new user or assign roles. This is definitely a very welcome improvement!
Now, how about some real work? The plugins now take center stage as Pentaho CE matures into a real platform. Old favorites like CDE are here:
It is improved with the much more professional-looking “Crystal” theme as the default. You can still switch to the old “Onyx” theme if you like.
Another good tool, Saiku Analytics, has thankfully returned as well:
A promising newcomer among the analytics tools, Pivot4J, is also available to install through the Marketplace-mode:
Pivot4J has one thing that has been missing in all of the Pentaho analytic tools: the ability to render aggregates in the last row or column. You have no idea how many times my clients have asked for this little feature. Yes, business people love their totals; those help them make better decisions. So good job, Pivot4J team!
Are there any negatives? Yes, the charting in Pivot4J is not intuitive to me. Take a look at the above screenshot; you see four columns. When you click the control that generates the bar chart representation of the table, what would you expect? I expect one bar chart, with four bars, one for each column. What did Pivot4J give me? Four bar charts. Why?? And I don’t see any way to merge them or change them.
In summary, I couldn’t be happier with this new 5.0 release of Pentaho CE. There are enough new features here to warrant companies considering an upgrade of their Data Warehouses. The most exciting trend for me is the third-party plugins that are starting to become available through the Marketplace. This can signal real growth in the quantity and quality of what is already one of the most useful BI suites on the market.
So to Mr. Pedro Alves and his team, big kudos, thank you, and good job. 2014 is looking like another stellar year for Open Source BI, starting with Pentaho 5.0 CE.