Pentaho is slow for servers with too many home directories

Over the course of two years, browsing solutions on our Pentaho 5.4 server became progressively slower. It got to the point where you had to wait 2-3 minutes to see the list of solutions in the Pentaho User Console.

The catalina log didn’t say much and we didn’t have that many solutions (around 200), so I thought it was perhaps a database bottleneck. It all came to a screeching halt on a Friday afternoon (as usual) when, after a restart, the Pentaho User Console simply stopped responding.

I turned all logging on and found that Pentaho was complaining about a lot of invalid users. Googling around, I found that 5.4 runs a user permission check on first login, calling UserDetailsService for the owner of each folder under the Home directory. Judging by the logs, we had over 4000 folders in there, accumulated over two changes of authentication scheme. I could not even open the Home folder in the user console.

Pentaho versions 6.1 and later have a config flag to skip this user verification. It’s called skipUserVerificationOnPrincipalCreation, and it lives in pentaho-solutions/system/jackrabbit/security.properties.

More info in the Pentaho documentation page "Jackrabbit Repository Performance Tuning".

 

All fine and dandy, but what do you do with a Pentaho 5.4 server? Or even, how do you fix this once your PUC has become unresponsive?

I thought that the Pentaho REST API might help and, sure enough, we can delete folders with it. In our case, our users don’t save anything in their home folders, so all we needed to do was to delete these 4000+ folders.
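Before deleting anything, it may help to see how many folders are actually involved. The following is a minimal dry-run sketch, not part of the original fix: summarizeHome is a hypothetical helper, and the sample object only imitates the shape the deletion script in this post expects from the /children endpoint (a list of entries under one top-level key, assumed here to be repositoryFileDto).

```javascript
// Sketch only: summarizeHome and the sample data are hypothetical.
// It counts the entries in a /children response without touching the server.
function summarizeHome(data) {
    // Gather every array found under the top-level keys of the response
    const nodes = Object.values(data || {}).filter(Array.isArray).flat();
    return { count: nodes.length, paths: nodes.map(function (n) { return n.path; }) };
}

// Made-up response, shaped like the one the deletion script iterates over
const sample = {
    repositoryFileDto: [
        { id: "a1", path: "/home/alice" },
        { id: "b2", path: "/home/bob" }
    ]
};

console.log(summarizeHome(sample).count); // → 2
```

In the browser console, the same helper could be fed from the live endpoint with $.getJSON(url, function (data) { console.log(summarizeHome(data)); }), which only reads and deletes nothing.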

This is a nuclear option, so don’t run this unless you know what you are doing. If your users have solutions saved in their home folders, you need to extend the following script to check for that and back up the solutions and/or skip the deletion.

Log in to Pentaho with an administrator account, open the browser's JavaScript console on a page that has jQuery loaded, replace <server_url> with your Pentaho URL and run:

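// List everything directly under /home
$.getJSON("http://<server_url>:8080/pentaho/api/repo/files/:home/children", function (data) {
    // The folder list sits under a top-level key of the response
    $.each(data, function (i, nodes) {
        nodes.forEach(function (node) {
            console.log(node.path, node.id);
            // Permanently delete the folder, identified by its id
            $.ajax({
                async: false, // synchronous on purpose: one deletion at a time
                type: "PUT",
                url: "http://<server_url>:8080/pentaho/api/repo/files/deletepermanent",
                data: node.id
            });
        });
    });
});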

You can fork the gist for this at https://gist.github.com/danielpradilla/72a603a5d0de71771e0b5836bde05479
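For servers where some users do keep content in their home folders, the check mentioned earlier could look something like the sketch below. It is not the author's script: the helper names are hypothetical, it assumes that asking the same /children endpoint about a home folder returns that folder's contents, and it assumes paths are addressed with colons in place of slashes (as in :home). It only finds empty folders; it deletes nothing.

```javascript
// Sketch only: helper names are hypothetical; the endpoint is the one used in the script above.
const FILES_API = "http://<server_url>:8080/pentaho/api/repo/files";

// "/home/jdoe" -> ":home:jdoe" (colons instead of slashes, each segment URL-encoded)
function toPathId(path) {
    return path.split("/").map(encodeURIComponent).join(":");
}

// Collect the entries of a /children response, whatever top-level key holds them.
// A lone entry is accepted as a bare object too, in case the server does not
// wrap a single child in an array. A null or empty response means "no children".
function childrenOf(data) {
    return Object.values(data || {}).flatMap(function (v) {
        if (Array.isArray(v)) return v;
        return v && typeof v === "object" ? [v] : [];
    });
}

// getJson(url) must return a Promise of the parsed response (null for an empty body)
// and must reject on HTTP errors, so that a failed request never looks like "empty".
async function findEmptyHomeFolders(getJson) {
    const homes = childrenOf(await getJson(FILES_API + "/:home/children"));
    const empty = [];
    for (const node of homes) {
        // One request at a time, to keep the load on a struggling server low
        const kids = childrenOf(await getJson(FILES_API + "/" + toPathId(node.path) + "/children"));
        if (kids.length === 0) empty.push(node);
    }
    return empty;
}

// Possible browser console usage (assumption: you are logged in on the same origin):
// const getJson = url => fetch(url, { credentials: "same-origin", headers: { Accept: "application/json" } })
//     .then(r => { if (!r.ok) throw new Error(r.status + " " + url); return r.text(); })
//     .then(t => (t ? JSON.parse(t) : null));
// findEmptyHomeFolders(getJson).then(list => console.log(list.map(n => n.path)));
```

The ids of the folders it returns could then be passed, one by one, to the same deletepermanent call used in the original script, leaving every non-empty home folder in place for a manual backup.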

 

 

 
