jrae is a Java implementation of the algorithm proposed in the paper "Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions": a semi-supervised recursive autoencoder used to predict sentiment distributions. See the paper for details; the source code is available at https://github.com/sancha/jrae.



// Train the neural-network weights from the parameters and data.
FineTunableTheta tunedTheta = rae.train(params);

System.out.println("RAE trained. The model file is saved in "
    + params.ModelFile);

// Feature extractor built from the tuned weights.
RAEFeatureExtractor fe = new RAEFeatureExtractor(params.EmbeddingSize,
    tunedTheta, params.AlphaCat, params.Beta, params.CatSize,
    params.Dataset.Vocab.size(), rae.f);

// Extract features from the training data.
List<LabeledDatum<Double, Integer>> classifierTrainingData =
    fe.extractFeaturesIntoArray(params.Dataset, params.Dataset.Data);

// Train a softmax classifier and report training accuracy.
SoftmaxClassifier<Double, Integer> classifier =
    new SoftmaxClassifier<Double, Integer>();
Accuracy TrainAccuracy = classifier.train(classifierTrainingData);
System.out.println("Train Accuracy :" + TrainAccuracy.toString());



1. Minimizer<T extends DifferentiableFunction>

public interface Minimizer<T extends DifferentiableFunction> {
  /**
   * Attempts to find an unconstrained minimum of the objective
   * <code>function</code> starting at <code>initial</code>, within
   * <code>functionTolerance</code>.
   *
   * @param function          the objective function
   * @param functionTolerance a <code>double</code> value
   * @param initial           an initial feasible point
   * @return the unconstrained minimum of the function
   */
  double[] minimize(T function, double functionTolerance, double[] initial);
  double[] minimize(T function, double functionTolerance, double[] initial, int maxIterations);
}



public interface DifferentiableFunction extends Function {
  double[] derivativeAt(double[] x);
}

public interface Function {
  int dimension();
  double valueAt(double[] x);
}
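To make these contracts concrete, here is a toy fixed-step gradient-descent minimizer applied to a small quadratic objective. This is an illustrative sketch, not code from jrae or Stanford NLP: the interfaces are repeated (with the maxIterations overload omitted) so the example is self-contained, and GradientDescentMinimizer and Quadratic are hypothetical names.

```java
interface Function {
  int dimension();
  double valueAt(double[] x);
}

interface DifferentiableFunction extends Function {
  double[] derivativeAt(double[] x);
}

interface Minimizer<T extends DifferentiableFunction> {
  double[] minimize(T function, double functionTolerance, double[] initial);
}

// A deliberately simple minimizer: step along the negative gradient
// until the per-step improvement falls below the tolerance.
class GradientDescentMinimizer implements Minimizer<DifferentiableFunction> {
  private final double stepSize;

  GradientDescentMinimizer(double stepSize) { this.stepSize = stepSize; }

  public double[] minimize(DifferentiableFunction f, double tol, double[] initial) {
    double[] x = initial.clone();
    double prev = f.valueAt(x);
    for (int iter = 0; iter < 10000; iter++) {
      double[] g = f.derivativeAt(x);
      for (int i = 0; i < x.length; i++) x[i] -= stepSize * g[i];
      double cur = f.valueAt(x);
      if (Math.abs(prev - cur) < tol) break; // improvement too small: stop
      prev = cur;
    }
    return x;
  }
}

// f(x) = sum_i (x_i - 1)^2, minimized at x = (1, ..., 1).
class Quadratic implements DifferentiableFunction {
  private final int n;

  Quadratic(int n) { this.n = n; }

  public int dimension() { return n; }

  public double valueAt(double[] x) {
    double v = 0.0;
    for (double xi : x) v += (xi - 1) * (xi - 1);
    return v;
  }

  public double[] derivativeAt(double[] x) {
    double[] g = new double[n];
    for (int i = 0; i < n; i++) g[i] = 2 * (x[i] - 1);
    return g;
  }
}
```

The real implementation behind this interface in jrae is the quasi-Newton QNMinimizer described next, which converges far faster than plain gradient descent.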


 * This code is part of the Stanford NLP Toolkit.
 * An implementation of L-BFGS for Quasi Newton unconstrained minimization.
 * The general outline of the algorithm is taken from: <blockquote> <i>Numerical
 * Optimization</i> (second edition) 2006 Jorge Nocedal and Stephen J. Wright
 * </blockquote> A variety of different options are available.
 * <h3>LINESEARCHES</h3>
 * BACKTRACKING: This routine simply starts with a guess for step size of 1. If
 * the step size doesn't supply a sufficient decrease in the function value the
 * step is updated through step = 0.1*step. This method is certainly simpler,
 * but doesn't allow for an increase in step size, and isn't well suited for
 * Quasi Newton methods.
 * MINPACK: This routine is based off of the implementation used in MINPACK.
 * This routine finds a point satisfying the Wolfe conditions, which state that
 * a point must have a sufficiently smaller function value, and a gradient of
 * smaller magnitude. This provides enough to prove theoretically quadratic
 * convergence. In order to find such a point the linesearch first finds an
 * interval which must contain a satisfying point, and then progressively
 * reduces that interval using cubic or quadratic interpolation.
 * SCALING: L-BFGS allows the initial guess at the hessian to be updated at each
 * step. Standard BFGS does this by approximating the hessian as a scaled
 * identity matrix. To use this method set the scaleOpt to SCALAR. A better way
 * of approximating the hessian is to use a scaling diagonal matrix. The
 * diagonal can then be updated as more information comes in. This method can be
 * used by setting scaleOpt to DIAGONAL.
 * CONVERGENCE: Previously, convergence was gauged by looking at the average
 * decrease per step, dividing that by the current value, and terminating when
 * that ratio became smaller than TOL. This method fails when the function
 * value approaches zero, so two other convergence criteria are used. The first
 * stores the initial gradient norm |g0|, then terminates when the new gradient
 * norm |g| is sufficiently smaller: i.e., |g| < eps*|g0|. The second checks
 * whether |g| < eps*max( 1 , |x| ), which is essentially checking to see if the
 * gradient is numerically zero.
 * Each of these convergence criteria can be turned on or off by setting the
 * flags: <blockquote><code>
 * private boolean useAveImprovement = true;
 * private boolean useRelativeNorm = true;
 * private boolean useNumericalZero = true;
 * </code></blockquote>
 * To use the QNMinimizer first construct it using <blockquote><code>
 * QNMinimizer qn = new QNMinimizer(mem, true)
 * </code>
 * </blockquote> mem - the number of previous estimate vector pairs to store,
 * generally 15 is plenty. true - this tells the QN to use the MINPACK
 * linesearch with DIAGONAL scaling. false would lead to the use of the criteria
 * used in the old QNMinimizer class.
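The BACKTRACKING routine described above (start with a step of 1 and shrink it by a factor of 10 until the function value decreases sufficiently) can be sketched as follows. This is an illustrative reconstruction using the standard Armijo sufficient-decrease test, not the actual QNMinimizer source; the class name and the constant C are assumptions.

```java
import java.util.function.Function;

class BacktrackingLineSearch {
  // Armijo sufficient-decrease constant; 1e-4 is a conventional choice.
  static final double C = 1e-4;

  /**
   * Starts with step = 1 and shrinks it via step = 0.1 * step until
   * f(x + step * dir) <= f(x) + C * step * (grad . dir).
   */
  static double search(Function<double[], Double> f,
                       double[] x, double[] grad, double[] dir) {
    double fx = f.apply(x);
    double gd = 0.0; // directional derivative grad . dir
    for (int i = 0; i < x.length; i++) gd += grad[i] * dir[i];

    double step = 1.0;
    while (step > 1e-12) {
      double[] trial = new double[x.length];
      for (int i = 0; i < x.length; i++) trial[i] = x[i] + step * dir[i];
      if (f.apply(trial) <= fx + C * step * gd) return step;
      step *= 0.1; // the simple update the comment describes
    }
    return step;
  }
}
```

As the javadoc notes, this scheme never grows the step, which is why the MINPACK linesearch (bracketing plus cubic/quadratic interpolation against the Wolfe conditions) is preferred for quasi-Newton methods.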




import java.util.Arrays;

public abstract class MemoizedDifferentiableFunction implements DifferentiableFunction {
	protected double[] prevQuery, gradient;
	protected double value;
	protected int evalCount;

	protected void initPrevQuery() {
		prevQuery = new double[dimension()];
	}

	// Returns false (skipping re-evaluation) when x matches the previous
	// query; otherwise caches x, counts the evaluation, and returns true.
	protected boolean requiresEvaluation(double[] x) {
		if (Arrays.equals(x, prevQuery))
			return false;
		System.arraycopy(x, 0, prevQuery, 0, x.length);
		evalCount++;
		return true;
	}

	public double[] derivativeAt(double[] x) {
		if (!requiresEvaluation(x))
			return gradient;
		valueAt(x); // subclasses compute value and gradient together
		return gradient;
	}
}
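To see the memoization in action, here is a minimal concrete subclass. This is an illustrative sketch, not jrae code: MemoizedQuadratic is a hypothetical name, the supporting interfaces are repeated so the example compiles on its own, and the abstract class is a reconstruction of the fragment above.

```java
import java.util.Arrays;

interface Function {
  int dimension();
  double valueAt(double[] x);
}

interface DifferentiableFunction extends Function {
  double[] derivativeAt(double[] x);
}

abstract class MemoizedDifferentiableFunction implements DifferentiableFunction {
  protected double[] prevQuery, gradient;
  protected double value;
  protected int evalCount;

  protected void initPrevQuery() {
    prevQuery = new double[dimension()];
  }

  protected boolean requiresEvaluation(double[] x) {
    if (Arrays.equals(x, prevQuery))
      return false; // same point as last time: reuse cached results
    System.arraycopy(x, 0, prevQuery, 0, x.length);
    evalCount++;
    return true;
  }

  public double[] derivativeAt(double[] x) {
    if (!requiresEvaluation(x))
      return gradient;
    valueAt(x); // subclass fills in both value and gradient
    return gradient;
  }
}

// f(x) = sum_i x_i^2; computes value and gradient in a single pass,
// which is what makes the shared memoization worthwhile.
class MemoizedQuadratic extends MemoizedDifferentiableFunction {
  private final int n;

  MemoizedQuadratic(int n) { this.n = n; initPrevQuery(); }

  public int dimension() { return n; }

  public double valueAt(double[] x) {
    if (!requiresEvaluation(x))
      return value;
    value = 0.0;
    gradient = new double[n];
    for (int i = 0; i < n; i++) {
      value += x[i] * x[i];
      gradient[i] = 2 * x[i];
    }
    return value;
  }
}
```

Calling valueAt and then derivativeAt on the same point triggers only one underlying evaluation; optimizers such as QNMinimizer query value and gradient at the same point repeatedly, which is exactly the pattern this caching exploits.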














posted @ 2014-11-19 17:01 五色光