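Implementing a Saturn-to-Kettle proxy channel in Java
There are two ways for Saturn to invoke Kettle:
1 shell invocation
2 a Java proxy
Besides monitoring the status Kettle reports back, the second approach also enables hot deployment (Saturn does not currently support hot deployment, only gray releases).
This article implements the second approach in three steps:
1 write a Java class that follows the Saturn conventions
2 invoke a Kettle file from Java
3 adapt the Java class from step 2 so that it follows the Saturn conventions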
1
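import com.vip.saturn.job.AbstractSaturnJavaJob;
import com.vip.saturn.job.SaturnJobExecutionContext;
import com.vip.saturn.job.SaturnJobReturn;

public class DemoJob extends AbstractSaturnJavaJob {
    @Override
    public SaturnJobReturn handleJavaJob(final String jobName, final Integer shardItem, final String shardParam, final SaturnJobExecutionContext context) {
        // do what you want here ...
        try {
            Thread.sleep(10000);
        } catch (InterruptedException e) {
            // restore the interrupt flag instead of silently swallowing the interruption
            Thread.currentThread().interrupt();
        }
        // Return a SaturnJobReturn object; the default return code 200 means a normal return
        return new SaturnJobReturn("Result of demo job shard " + shardItem);
    }
}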
<dependencies>
<!-- https://mvnrepository.com/artifact/com.vip.saturn/saturn-job-api -->
<dependency>
<groupId>com.vip.saturn</groupId>
<artifactId>saturn-job-api</artifactId>
<version>3.0.1</version>
</dependency>
</dependencies>
<build>
<finalName>MyTest</finalName>
<pluginManagement><!-- lock down plugins versions to avoid using Maven defaults (may be moved to parent pom) -->
<plugins>
<plugin>
<groupId>com.vip.saturn</groupId>
<artifactId>saturn-plugin</artifactId>
<!-- keep the version the same as saturn-job-api -->
<version>3.0.1</version>
</plugin>
</plugins>
</pluginManagement>
</build>
2
public static void main(String[] args) {
    runJob(new HashMap<String, String>(), "e:\\test.kjb");
}

public static void runJob(Map<String, String> maps, String jobPath) {
    try {
        KettleEnvironment.init();
        // jobPath is the path and file name of the job script
        JobMeta jobMeta = new JobMeta(jobPath, null);
        Job job = new Job(null, jobMeta);
        // Pass variables to the job script; inside the script they are read as ${variableName}
        for (Map.Entry<String, String> ent : maps.entrySet()) {
            job.setVariable(ent.getKey(), ent.getValue());
        }
        job.start();
        job.waitUntilFinished();
        if (job.getErrors() > 0) {
            throw new Exception("Kettle job finished with errors: " + jobPath);
        } else {
            System.out.println("success");
        }
    } catch (Exception e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
}
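Reference: https://www.cnblogs.com/superJF/p/5937611.html
It covers two kinds of invocation:
invoking a Kettle transformation file
invoking a Kettle job
This article uses the second kind.
The Kettle job file is placed under e:/
Along the way, the Kettle jars have to be added without Maven. They live in the kettle/lib folder, and there are a lot of them, so I referred to: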
https://blog.csdn.net/cakecc2008/article/details/79291620
Minimal set of jars needed to invoke a Kettle job
After repeated testing, the minimal set of third-party jars needed to call Kettle from Java is:
kettle-engine-7.0.0.0-25.jar
kettle-core-7.0.0.0-25.jar
metastore-7.0.0.0-25.jar
slf4j-log4j12-1.7.7.jar
javassist-3.20.0-GA.jar
guava-17.0.jar
commons-vfs2-2.1-20150824.jar
pentaho-vfs-browser-7.0.0.0-25.jar
commons-logging-1.2.jar
jsch-0.1.46.jar
js-1.7R3.jar
commons-codec-1.9.jar
If a database is used, add the corresponding database jars yourself, for example ojdbc14.jar.
3
Wrap it in the form Saturn expects:
import com.vip.saturn.job.AbstractSaturnJavaJob;
import com.vip.saturn.job.SaturnJobExecutionContext;
import com.vip.saturn.job.SaturnJobReturn;
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.job.Job;
import org.pentaho.di.job.JobMeta;
import java.util.HashMap;
import java.util.Map;

public class KettleJava extends AbstractSaturnJavaJob {
    public static void main(String[] args) {
        runJob(new HashMap<String, String>(), "e:\\test.kjb");
    }

    @Override
    public SaturnJobReturn handleJavaJob(final String jobName, final Integer shardItem, final String shardParam, final SaturnJobExecutionContext context) {
        // Saturn passes the Kettle job file path as the shard parameter
        System.out.println("Kettle job file path passed in by Saturn: " + shardParam);
        runJob(new HashMap<String, String>(), shardParam);
        return new SaturnJobReturn("Result of Kettle job shard " + shardItem + " for Kettle job file (" + shardParam + ")");
    }

    public static void runJob(Map<String, String> maps, String jobPath) {
        try {
            KettleEnvironment.init();
            // jobPath is the path and file name of the job script
            JobMeta jobMeta = new JobMeta(jobPath, null);
            Job job = new Job(null, jobMeta);
            // Pass variables to the job script; inside the script they are read as ${variableName}
            for (Map.Entry<String, String> ent : maps.entrySet()) {
                job.setVariable(ent.getKey(), ent.getValue());
            }
            job.start();
            job.waitUntilFinished();
            if (job.getErrors() > 0) {
                throw new Exception("Kettle job finished with errors: " + jobPath);
            } else {
                System.out.println("success");
            }
        } catch (Exception e) {
            e.printStackTrace();
            // rethrow unchecked so that Saturn sees the job as failed
            throw new RuntimeException(e);
        }
    }
}
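Packaging ran into a problem: Saturn jobs are packaged with Maven, but the Kettle Java class above pulled in its jars without Maven, so it would not compile under Maven. After testing, the dependencies fall into two groups:
those available in the central repository, which are referenced directly in the pom
those that are not, which are installed locally or deployed to a private repository. In practice this gave the following pom: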
<dependencies>
<!-- https://mvnrepository.com/artifact/commons-codec/commons-codec -->
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.9</version>
</dependency>
<!-- https://mvnrepository.com/artifact/commons-logging/commons-logging -->
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
<version>1.1.3</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-vfs2 -->
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-vfs2</artifactId>
<version>2.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>17.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/javassist/javassist -->
<dependency>
<groupId>javassist</groupId>
<artifactId>javassist</artifactId>
<version>3.12.1.GA</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.slf4j/slf4j-log4j12 -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.7.7</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/com.jcraft/jsch -->
<dependency>
<groupId>com.jcraft</groupId>
<artifactId>jsch</artifactId>
<version>0.1.46</version>
</dependency>
<dependency>
<groupId>com.jds.fast</groupId>
<artifactId>kettle-core</artifactId>
<version>6.1.0.1</version>
</dependency>
<dependency>
<groupId>com.jds.fast</groupId>
<artifactId>kettle-engine</artifactId>
<version>6.1.0.1</version>
</dependency>
<dependency>
<groupId>com.jds.fast</groupId>
<artifactId>mysql-connector-java-bin</artifactId>
<version>5.1.15</version>
</dependency>
<dependency>
<groupId>com.jds.fast</groupId>
<artifactId>pentaho-vfs-browser</artifactId>
<version>6.1.0.1</version>
</dependency>
<dependency>
<groupId>com.jds.fast</groupId>
<artifactId>metastore</artifactId>
<version>6.1.0.1</version>
</dependency>
<dependency>
<groupId>com.jds.fast</groupId>
<artifactId>js</artifactId>
<version>1.7R3</version>
</dependency>
Six of these jars are not in the central repository and have to be installed separately:
mvn install:install-file -Dfile=kettle-core-6.1.0.1-196.jar -DgroupId=com.jds.fast -DartifactId=kettle-core -Dversion=6.1.0.1 -Dpackaging=jar
mvn install:install-file -Dfile=kettle-engine-6.1.0.1-196.jar -DgroupId=com.jds.fast -DartifactId=kettle-engine -Dversion=6.1.0.1 -Dpackaging=jar
mvn install:install-file -Dfile=mysql-connector-java-5.1.15-bin.jar -DgroupId=com.jds.fast -DartifactId=mysql-connector-java-bin -Dversion=5.1.15 -Dpackaging=jar
mvn install:install-file -Dfile=pentaho-vfs-browser-6.1.0.1-196.jar -DgroupId=com.jds.fast -DartifactId=pentaho-vfs-browser -Dversion=6.1.0.1 -Dpackaging=jar
mvn install:install-file -Dfile=metastore-6.1.0.1-196.jar -DgroupId=com.jds.fast -DartifactId=metastore -Dversion=6.1.0.1 -Dpackaging=jar
mvn install:install-file -Dfile=js-1.7R3.jar -DgroupId=com.jds.fast -DartifactId=js -Dversion=1.7R3 -Dpackaging=jar
The build succeeds, so let's try a run:
A Kettle file that runs successfully:

Now change the SQL in the Kettle job so that it fails on purpose:

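When the Java proxy receives the error reported by Kettle, it throws a runtime exception to Saturn, so that Saturn can detect the failed job.
Finally, why does this approach amount to hot deployment?
When a jar is replaced, the Saturn executor has to be restarted for the change to take effect. This proxy jar, however, takes the Kettle file path as its input parameter, so it never needs to be redeployed. For each new task, put the Kettle file in place and create a new job that still uses this proxy jar as its implementation class, changing only the parameter passed in (the Kettle path string). The new task then runs without restarting the executor.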
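Because the Kettle path arrives as a plain string, a typo in the job configuration would only surface once Kettle tries to load the file. Below is a minimal sketch of an up-front check, assuming the executor reads the .kjb file from its local file system. The class name KettleJobPathCheck and its resolve method are hypothetical helpers for illustration and are not part of Saturn or Kettle.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

/** Hypothetical helper: validates the shard parameter before it is handed to Kettle. */
final class KettleJobPathCheck {
    private KettleJobPathCheck() {
    }

    /**
     * Returns the path of the Kettle job file named by the shard parameter,
     * or throws IllegalArgumentException with a readable message.
     */
    static Path resolve(String shardParam) {
        if (shardParam == null || shardParam.trim().isEmpty()) {
            throw new IllegalArgumentException("Shard parameter is empty; expected the path of a .kjb file");
        }
        Path path = Paths.get(shardParam.trim());
        Path fileName = path.getFileName(); // null for a bare root such as "/"
        if (fileName == null || !fileName.toString().toLowerCase(Locale.ROOT).endsWith(".kjb")) {
            throw new IllegalArgumentException("Not a Kettle job file (.kjb): " + path);
        }
        if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
            throw new IllegalArgumentException("Kettle job file not found or not readable: " + path);
        }
        return path;
    }
}
```

In handleJavaJob, such a check could run on shardParam before runJob is called, so that a wrong path fails the job with a clear message in Saturn rather than a Kettle stack trace.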
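Update, 11.21
Background: Kettle's logs were missing. Even when a job returned an error, we could not print those errors from the Java side, so Kettle's logging has to be integrated: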
import org.pentaho.di.core.KettleClientEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.core.logging.KettleLogStore;
import org.pentaho.di.core.logging.LogLevel;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
// plus the Saturn, Kettle and java.util imports already used in step 3

public class KettleJava extends AbstractSaturnJavaJob {
    private static final Logger LOGGER = LoggerFactory.getLogger(KettleJava.class);

    static {
        try {
            KettleEnvironment.init();
            KettleClientEnvironment.getInstance().setClient(KettleClientEnvironment.ClientType.CARTE);
            // Forward every Kettle log event to the SLF4J logger
            KettleLogStore.getAppender().addLoggingEventListener(kettleLoggingEvent -> {
                String msg = kettleLoggingEvent.getMessage().toString();
                LogLevel logLevel = kettleLoggingEvent.getLevel();
                switch (logLevel) {
                    case ERROR:
                        LOGGER.error(msg);
                        break;
                    default:
                        LOGGER.info(msg);
                }
            });
        } catch (KettleException e) {
            LOGGER.error("kettle error", e);
        }
    }

    public static void main(String[] args) {
        runJob(new HashMap<String, String>(), "E:\\xuej\\get_user_statistics.kjb");
    }

    @Override
    public SaturnJobReturn handleJavaJob(final String jobName, final Integer shardItem, final String shardParam, final SaturnJobExecutionContext context) {
        LOGGER.info("Kettle job file path passed in by Saturn: {}", shardParam);
        runJob(new HashMap<String, String>(), shardParam);
        return new SaturnJobReturn("Result of Kettle job shard " + shardItem + " for Kettle job file (" + shardParam + ")");
    }

    public static void runJob(Map<String, String> maps, String jobPath) {
        try {
            KettleEnvironment.init();
            // jobPath is the path and file name of the job script
            JobMeta jobMeta = new JobMeta(jobPath, null);
            LOGGER.debug("Kettle job log tables: {}", jobMeta.getLogTables());
            Job job = new Job(null, jobMeta);
            // Pass variables to the job script; inside the script they are read as ${variableName}
            for (Map.Entry<String, String> ent : maps.entrySet()) {
                job.setVariable(ent.getKey(), ent.getValue());
            }
            job.start();
            job.waitUntilFinished();
            if (job.getErrors() > 0) {
                throw new Exception("Kettle job finished with errors: " + jobPath);
            } else {
                LOGGER.info("success");
            }
        } catch (Exception e) {
            LOGGER.error("runJob error,", e);
            // rethrow unchecked so that Saturn sees the job as failed
            throw new RuntimeException(e);
        }
    }
}
Add a logback.xml and a few more dependencies to the pom, all of which are available in the central repository:
<!-- https://mvnrepository.com/artifact/org.slf4j/slf4j-api -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.25</version>
</dependency>
<!-- https://mvnrepository.com/artifact/ch.qos.logback/logback-core -->
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-core</artifactId>
<version>1.2.3</version>
</dependency>
<!-- https://mvnrepository.com/artifact/ch.qos.logback/logback-classic -->
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>1.2.3</version>
<!--<scope>test</scope>-->
</dependency>
<dependency>
<groupId>commons-lang</groupId>
<artifactId>commons-lang</artifactId>
<version>2.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/commons-httpclient/commons-httpclient -->
<dependency>
<groupId>commons-httpclient</groupId>
<artifactId>commons-httpclient</artifactId>
<version>3.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.sun.mail/javax.mail -->
<dependency>
<groupId>com.sun.mail</groupId>
<artifactId>javax.mail</artifactId>
<version>1.4.7</version>
</dependency>
Update, 11.28
+ <!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
<dependency>
- <groupId>com.jds.fast</groupId>
- <artifactId>mysql-connector-java-bin</artifactId>
- <version>5.1.15</version>
+ <groupId>mysql</groupId>
+ <artifactId>mysql-connector-java</artifactId>
+ <version>5.1.47</version>
</dependency>
<dependency>
@@ -151,7 +158,12 @@
<version>1.4.7</version>
</dependency>
-
+ <dependency>
+ <groupId>org.safehaus.jug</groupId>
+ <artifactId>jug</artifactId>
+ <version>2.0.0</version>
+ <classifier>lgpl</classifier>
+ </dependency>
</dependencies>
done