Talend Open Studio vs Pentaho Data Integration

29 03 2008

Not sure which open source ETL tool to choose? Have a look at this white paper (in French).

It contains some interesting benchmarks that can help you choose the best ETL tool for your needs.




30 03 2008
Matt Casters

To me the benchmark seems rather unfair to both Talend and Kettle 🙂

30 03 2008

To be honest, I have not read the entire paper yet. But there are not many comparisons of open source ETL tools on the web, and this one seemed fair a priori. I will read it in more detail…

I think it would be interesting to have a public data set and a set of ETL test cases available to everybody. Different people could then use them to test ETL tools (open source or not).

31 03 2008
Matt Casters

Hi scorreia,

it’s my opinion of course, but the differences between Talend and Kettle seem to be so big that overly simplistic benchmarks and check-box comparisons fail. I must admit it’s a good starting point though.

The emphasis on performance is interesting. Just by looking at the transformations superficially I can see plenty of room for improvement on the Kettle side, for example when I see a JavaScript step (slow by nature) run in a single thread on a dual-core machine, or when this step is not replaced by a faster Calculator step. I’m sure the Talend folks have the same feeling about some of these tests. The thing is, from the first tests you can deduce that the Kettle engine is certainly as fast as Talend. When you then find huge differences in performance, I would expect the author to do a little optimization or at least ask around.

But we’ll get to that later.

On his blog I read that Sylvain is going to put the tests online somewhere at the end of the week so that we can all take a look at them. In the meantime, I wish nothing but pleasant thoughts for him in Alsace. Having been there myself I must admit it’s among the most beautiful regions on earth.

All the best,


1 04 2008

Hi Matt,

Thanks for your comment.
I agree with you: many other parameters have to be taken into account for a detailed comparison. And obviously, if the transformation jobs had been built by the PDI and TOS teams themselves, a lot of optimization would probably be possible.
But we could also see this lack of optimization as an indicator of how simple each tool is to use in complex cases.

I am pleased to know that Sylvain will make his tests available. It will probably be useful for users to play around with these tests. And optimizations will probably be found where they are needed.

And having lived in Alsace a few years, I also admit that it is a nice place.

Best regards.

1 04 2008
Matt Casters

“But we could also see this lack of optimization as an indicator of the simplicity of use of the tool in complex cases.”

Normally I would agree, but in this case, the transformations where Kettle is not doing very well are radically different from the Talend implementation of the same test. For example, Talend reads from the PostgreSQL database, whereas Kettle does not.


30 01 2009

Hi, I am really interested in checking this paper, but I don’t know French. Does anybody have an English version of this paper?

30 01 2009


I don’t know of an English version of this paper. But you may find this benchmark interesting. It is discussed on Marc Russel’s blog and on Goban Saor’s blog.

2 02 2009

Thanks a lot for this nice benchmarking doc. Do you have any benchmarking doc related to “Cognos vs Jasper or Pentaho”?

1 03 2009


Sorry for the late reply, I missed your last comment. I don’t know of a document related to Cognos.

