SSO With Pentaho Community Edition

Posted by admin on September 03, 2010  /   Posted in BI and Custom Development

Introduction

Pentaho is an amazing system. Built upon countless man hours from all over the world, it is a testament to the effectiveness of the open-source SDLC paradigm.

But here’s the rub: those of us who use the Community Edition, and who for various reasons cannot use the Enterprise Edition, are left on our own when it comes to the more “advanced” features, such as SSO integration.

After messing with this for the last week, with a lot of help from my colleague and from one of the most useful and fun (yes, fun!) online user communities that I’ve dealt with (the #pentaho IRC channel), I finally cracked the proverbial nut.

So with my boss’ blessing, I decided to document what I had to do to make this work, in the spirit of giving back to the community. Plus, with the rising awareness of the benefits of BI even for small to medium corporations, I have no doubt that this information will be useful to someone somewhere.

SSO setups vary, so I am not implying that the way I set mine up will work for yours. That’s all for the obligatory mini-DISCLAIMER.

The Need

If your organization does not have Single Sign-On implemented for its enterprise applications, then this writeup is irrelevant. The fact is, SSO is a useful, productivity-boosting feature for users (and developers too), and while it is almost always a major pain to set up, the payback is usually worth the hassle.

In this writeup, my scenario revolves around trying to fit Pentaho (Community Edition), which I’ll refer to as PCE from here on, into an existing SSO implementation.

Version Information:
Pentaho Business Intelligence Server – 3.6.0-stable
Microsoft IIS – version 7.0.6000

Now, for those who are familiar with setting up an SSO system, the next question will be a basic one:

Which SSO implementation did you use?

The SSO Setup

The one that we set up at work utilizes Microsoft Active Directory to authenticate users coming from the website.

While there is some documentation on the Pentaho Wiki on plugging Pentaho into SiteMinder or CAS, less can be found when you search for Microsoft Active Directory. Which is a shame, because despite being a Linux guy myself, I have to admit that when configured correctly, their implementation of SSO, from the users’ perspective, works fairly well.

With MSAD, once you have been authenticated via the usual login screen, IIS pushes the information over the AJP protocol (version 1.3) to the Tomcat server that hosts the Pentaho biserver-ce.

For the sake of brevity and clarity, we won’t be discussing how to set up AJP to work with IIS. Suffice it to say that it uses an ISAPI filter extension called isapi_redirect.dll to accomplish this.

The first thing to do is to modify conf/workers.properties and conf/uriworkermap.properties. These files route requests by URL pattern. They should be self-explanatory to modify; contact me if you need more info.

Gotcha(TM) #1: On my Ubuntu development server, the Tomcat AJP listener somehow wasn’t listening for requests coming in via tcp (that’s TCP for IPv4 clients); it only waited on tcp6. This is not obvious either, especially when you use netstat: netstat will show tcp6 whether Tomcat is listening on tcp *and* tcp6 OR just on tcp6.

So now what to do? After searching for answers online, I came across a tip to specify the actual address the server listens on. Somehow this forces Tomcat to listen on both tcp and tcp6 on port 8009 (the AJP port). To be specific: add an attribute address="your.server.ip.address" to the <Connector> tag that configures port 8009 in Tomcat’s server.xml.
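
As a rough sketch, the AJP connector entry in server.xml might look something like this once the attribute is added. The IP address is a placeholder, and the other attributes are the usual ones for an AJP connector, so yours may differ:

```xml
<!-- conf/server.xml: AJP 1.3 connector that receives requests from IIS.
     192.0.2.10 is a placeholder; use the address your server listens on. -->
<Connector port="8009"
           protocol="AJP/1.3"
           address="192.0.2.10"
           redirectPort="8443" />
```

Restart Tomcat after the change so the connector binds to the new address.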

After being sidetracked by this, I was finally able to get AJP requests from IIS to Tomcat, which in turn dutifully routes them to /pentaho where the PCE lives.

The Switch

At this point, all the PCE can do is throw a fit, because it does not know what to do with the AJP request coming from IIS (i.e. the user), and it has no idea that there is authenticated user information within the request.

So we need to initiate the switch from the pre-installed JDBC-based authentication/authorization to the one based on LDAP, of which the MS Active Directory is an implementation.

To do this, you can follow the information in the link that I’m about to give you. But come back here after you read it, because while it lists the steps, it does not give you a clue as to what those modifications are really for. Well, unless you are an LDAP and Spring Security expert.

Here’s the link.

In summary, here is the list of files you need to touch and modify:

Under biserver-ce/pentaho-solutions/system directory:

  • pentaho-spring-beans.xml – the big switch. This is where you tell Pentaho to use the LDAP instead of the JDBC authentication/authorization system.
  • applicationContext-security-ldap.properties – this file is the center of the modification. We will talk about it in depth in the next section.
  • applicationContext-spring-security.xml – this is where the ACL (Access Control List) is set up at the URL level. Search for <property name="objectDefinitionSource">. Scarily, the actual URL patterns and permitted roles are defined within a hardcoded CDATA block (!!). What’s wrong with another .properties file, guys? All you need to do here is to substitute the default Pentaho roles such as Admin, Authenticated, etc. with the new ones from LDAP.
  • applicationContext-spring-security-ldap.xml – this is where the majority of the values in the above .properties file are used. As far as I recall, I didn’t change this file at all, which is always a good thing.
  • applicationContext-pentaho-security-ldap.xml – this file contains the two queries that populate the Pentaho UI when we assign permissions to Users or Roles. See “The Exception” section below.
  • pentaho.xml – this file governs who can do what to the solutions in the repositories. All you need to modify in this file is to replace the default roles with the ones you define in LDAP. IMPORTANT: Anytime you modify the default settings in this file, always drop these tables from the hibernate database: pro_acls_list and pro_files (in that order), then restart the biserver. This will rebuild the two tables with the new default permissions for the solutions.
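
To make the “big switch” in pentaho-spring-beans.xml a bit more concrete, here is a minimal sketch of the change. The two LDAP file names are the ones listed above. The file names on the commented-out lines are an assumption about what the stock 3.x file imports by default, so check them against your own copy before editing:

```xml
<!-- pentaho-spring-beans.xml (excerpt): swap the default security imports
     for the LDAP ones. Other <import> lines stay untouched. -->
<beans>
  <import resource="applicationContext-spring-security.xml" />

  <!-- Assumed defaults, commented out:
  <import resource="applicationContext-spring-security-hibernate.xml" />
  <import resource="applicationContext-pentaho-security-hibernate.xml" />
  -->

  <!-- LDAP-backed authentication and user/role lookup -->
  <import resource="applicationContext-spring-security-ldap.xml" />
  <import resource="applicationContext-pentaho-security-ldap.xml" />
</beans>
```

Commenting the old imports out rather than deleting them makes it easy to switch back if the LDAP configuration refuses to start.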

Under biserver-ce/pentaho-solutions/system/data-access-plugin directory:

  • settings.xml – this file governs who can do what to the defined data sources (you know, the all-important source of data for your ad-hoc reports and cubes). All you need to modify in this file is to replace the default roles with the ones you define in LDAP. A #pentaho IRC channel community member (pstoellberger) helped me out on this one. Without his quick source sleuthing, there’s no telling how many more hours I’d have spent figuring this out.
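
For illustration only, the role entries in that settings.xml might end up looking roughly like the sketch below. The element names are an assumption about what the data-access plugin uses, and BI_ADMINS and BI_USERS are hypothetical LDAP roles, so match both against what you find in your own file:

```xml
<!-- data-access-plugin/settings.xml (excerpt); element names assumed,
     BI_ADMINS and BI_USERS are hypothetical LDAP roles. -->
<settings>
  <!-- roles allowed to create and edit data sources -->
  <data-access-roles>BI_ADMINS</data-access-roles>
  <!-- roles allowed to view and use data sources -->
  <data-access-view-roles>BI_USERS</data-access-view-roles>
</settings>
```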

Under biserver-ce/tomcat/webapps/pentaho/WEB-INF directory:

  • web.xml – this file configures the pentaho web application. Since we are plugging Pentaho into an existing enterprise application, we need to configure it to reflect this. All you need to do is to make sure that this section of the file is properly defined:
      <context-param>
        <param-name>base-url</param-name>
        <param-value>http://www.yourwebsite.entry.point.com/pentaho</param-value>
      </context-param>

Now let’s talk about some of these modifications in more depth.

The Property File

applicationContext-security-ldap.properties to be exact.  This is the only property file that you need to modify for this purpose.

The values in this file are used in three Spring-Security bean definition files. To clarify, Pentaho uses Spring-Security to implement its authentication/authorization layer. A wise decision that pays off rather handsomely, as you will see later in this article.

Let’s walk through these values:

  • contextSource.providerUrl – this should contain your ldap: URL. Between you and your sysadmin, this should be a piece of cake to get. Example: ldap://ldapsrv.acme.com:389
  • contextSource.userDn – set the value to the distinguishedName (DN) of the read-only user that you use for accessing the LDAP tree. Example: CN=LDAP Searchdog,CN=Users,DC=acme,DC=com
  • contextSource.password – the password of the read-only user above

This section takes care of the basic connection to the LDAP server. It is used in various places, as expected.

Next, we’ll fill out info for single user search (for authentication):

  • userSearch.searchBase – this value points to the root of the LDAP tree where you want the search to commence from. Example: DC=acme,DC=com
  • userSearch.searchFilter – this is the LDAP filter that matches an attribute against the supplied parameter (typically the user name). Example: (sAMAccountName={0}), where {0} is replaced by the parameter.

Next, we’ll specify how to fetch the roles of a given user (set up on LDAP):

  • populator.convertToUpperCase – when this value is ‘true’ the roles coming from LDAP will be converted to all upper case. Not sure what this buys us, but it’s important to be consistent. Don’t set this to true and then forget to capitalize the roles wherever they are defined.
  • populator.groupRoleAttribute – which LDAP attribute holds the roles. Example: cn
  • populator.groupSearchBase – same as the userSearch.searchBase above
  • populator.groupSearchFilter – specifies the condition for the search, that is, using the username to get the roles he/she is associated with
  • populator.rolePrefix – a prefix to add to each role name, if you need one. I haven’t found out why I would need one.
  • populator.searchSubtree – another boolean value that indicates whether to search into the LDAP subtrees or not.

Lastly, we give the proper info for searching available roles in LDAP.  This is an important query that will actually populate the Pentaho UI where we select Roles to assign permissions to certain Reports or Cubes (or ‘Solutions’ if we use Pentaho’s lingo).

  • allAuthoritiesSearch.roleAttribute – which LDAP attribute holds the value for the roles. Example: cn
  • allAuthoritiesSearch.searchBase – where you want the search to begin. IMPORTANT: the way my LDAP server is organized, when this property is set to the root of the tree (DC=acme,DC=com), the subsequent pentaho code failed to populate the UI control that allows us to select these roles. Only when I specified a subtree that has only the roles would this work. Example: OU=Some Subgroup,DC=acme,DC=com
  • allAuthoritiesSearch.searchFilter – this is the criterion shared by all the roles we want to pull from the LDAP server. Example: (objectClass=group)

The Exception

One LDAP query that you may want to disable is the allUsernamesSearch. This query is defined in one of the modified xml files: applicationContext-pentaho-security-ldap.xml.

The reason it is a good idea to disable this is just common security/access control practice: you do not assign permissions at the user level; you define permissions in association with roles instead.

So let’s disable the query. The way to do it is to make sure that the definition of the Spring bean points to a class that has been programmed to do nothing. It will look something like this:

<bean id="allUsernamesSearch" class="org.pentaho.platform.plugin.services.security.userrole.ldap.search.NoOpLdapSearch" />

As a result, when the UI that allows the admin to assign permissions is displayed, the users selection box is empty, thanks to the NoOpLdapSearch class defined above. You can’t assign permissions to an individual user, which in 99% of cases is what you want.

The Usage

The last step, after all the configuration above, is to actually use the roles defined in LDAP at the appropriate places.

‘Consistency’ is the keyword here. Once you have defined a set of new roles in MS Active Directory to be used with Pentaho, you *must* substitute the default Pentaho roles (Admin, Authenticated, etc.) in the aforementioned configuration files with the appropriate new roles.
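
As an illustration of that substitution, here is a sketch of how a couple of entries in the objectDefinitionSource block of applicationContext-spring-security.xml might look afterwards. The URL patterns are abbreviated from memory rather than copied from the stock file, and BI_ADMINS and BI_USERS are hypothetical Active Directory group names. If populator.convertToUpperCase is set to true, the role names have to be spelled in upper case here too:

```xml
<!-- applicationContext-spring-security.xml (excerpt, abbreviated).
     Previously these two patterns pointed at Admin and Authenticated;
     BI_ADMINS and BI_USERS are hypothetical LDAP roles. -->
<property name="objectDefinitionSource">
  <value>
    <![CDATA[
CONVERT_URL_TO_LOWERCASE_BEFORE_COMPARISON
\A/admin.*\Z=BI_ADMINS
\A/.*\Z=BI_USERS
]]>
  </value>
</property>
```

The same role names then need to appear, spelled identically, in pentaho.xml and in the data-access-plugin settings.xml.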

I don’t see the point of belaboring this, as the application will be unique to your own authentication/authorization needs. Just be aware that a single typo will bring the system to a halt. Using some kind of version control is highly recommended when modifying these files.

The most unexpected and quite amazing fact in this whole thing is that Spring-Security automatically handles the authenticated user information that was sent from IIS to Tomcat without any intervention on my side.

Lesser security libraries would probably require some property tweaking or custom-written filters to do this seemingly trivial but important step. To me, this is one of the signs of the maturity of Spring as one of the few Java frameworks that is truly enterprise-worthy.

The Loose Ends

Some miscellaneous random bits of info that would have saved me some time and effort had I known them before I started on this task:

  • The log file for Pentaho is located in: biserver-ce/tomcat/bin/pentaho.log
  • To find out about problems with your ISAPI Filters, view the log files located where the extension .dll file is.  In my case it’s called isapi_redirect.log
  • Turn the log4j.xml logging level for spring-security to INFO or even DEBUG to follow what’s happening if the modifications do not seem to take effect.  This is quite obvious, but when you’re busy pulling your hair out, it’s easy to forget.
  • Don’t forget to turn it back to WARN or ERROR when your modifications *do* work.
  • Oh, and the Pentaho Administrator Console is useless once you switch to LDAP; it is only configured to work with the JDBC user/role management.
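
For the logging tip above, a minimal sketch of the log4j.xml entry could look like this. It assumes the bundled Spring Security classes live under the org.springframework.security package and that your log4j.xml uses the category/priority style of entries, so adapt it to whatever your file already contains:

```xml
<!-- log4j.xml: verbose Spring Security logging while troubleshooting.
     Set the value back to WARN or ERROR afterwards. -->
<category name="org.springframework.security">
  <priority value="DEBUG" />
</category>
```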
