From fd07bfc58ab5ce8d5ee41d5e563d3c8d42e992db Mon Sep 17 00:00:00 2001
From: Darlene Stewart <Darlene.Stewart@nrc-cnrc.gc.ca>
Date: Tue, 28 May 2019 02:42:06 -0400
Subject: [PATCH] Added support for subword unit view and corresponding subword
 unit embeddings.

Squashed merge commit of the following:

commit e9367f8f94ffd09fe76b0c12d9ace54a79533a4f
Author: Darlene Stewart <Darlene.Stewart@nrc-cnrc.gc.ca>
Date:   Mon May 27 18:18:33 2019 -0400

    Fixed YiSi SRL test reference files yet again.

commit a7145d50093c360642b8da68c2501c5333973ab5
Merge: ea751e4 d1fe9d8
Author: Darlene Stewart <Darlene.Stewart@nrc-cnrc.gc.ca>
Date:   Mon May 27 17:54:35 2019 -0400

    Merge branch 'dev.merge.NRC-private'

    Merge NRC-private commits db8a070 through f35fd82:
    git cherry-pick -e -x db8a070
    git cherry-pick -e -x db8a070..96a8a7d
    git cherry-pick -e -x f35fd82

    Also, fixed code formatting and YiSi SRL test reference files.

commit d1fe9d816c06940cd91a85beb0ca63680bae9114
Author: Darlene Stewart <Darlene.Stewart@nrc-cnrc.gc.ca>
Date:   Mon May 27 17:39:21 2019 -0400

    Fixed the YiSi SRL test reference files again.

commit 7f3931c178255a4a1ecac013691cbee915f3a8e6
Author: Darlene Stewart <Darlene.Stewart@nrc-cnrc.gc.ca>
Date:   Mon May 27 17:43:36 2019 -0400

    Code formatting fixes

commit d0022bcd5b53b5787f2ca4e8c628d345d4e2c8f7
Author: Darlene Stewart <Darlene.Stewart@nrc-cnrc.gc.ca>
Date:   Mon May 27 14:27:02 2019 -0400

    Fixed copyright year.

commit 748df7c0dcef368720dc7b535d1838d6019bb9ad
Author: Darlene Stewart <Darlene.Stewart@nrc-cnrc.gc.ca>
Date:   Thu May 23 17:14:16 2019 -0400

    Update YiSi test reference files to match output on NRC-private branch.

    (cherry picked from NRC-private branch commit f35fd82e646f1d60143b4976ff8d58c93760544b)

commit 986b61974cfc8ed08e7e06d434b8195916f457e0
Author: Darlene Stewart <Darlene.Stewart@nrc-cnrc.gc.ca>
Date:   Wed May 22 16:59:44 2019 -0400

    Fixed an int-size_t comparison causing a g++ warning in yisi::read_sent().

    (cherry picked from NRC-private branch commit 96a8a7d652888f05bc5fa95c60c243591cafc543)

commit 08b31bfa966af2cb4e965ecb97dfc4cf8047c7ba
Author: Darlene Stewart <Darlene.Stewart@nrc-cnrc.gc.ca>
Date:   Wed May 22 16:20:00 2019 -0400

    Update YiSi test reference files to match the current state of the NRC-private branch.

    (cherry picked from NRC-private branch commit 5b5f4a25e69625cd1b8a0f7a8d8974e0ce2b0e60)

commit 5f60ca583f0b127431c04b87dcab6d606ca6214b
Author: Darlene Stewart <Darlene.Stewart@nrc-cnrc.gc.ca>
Date:   Wed May 22 14:39:23 2019 -0400

    Fixed read_conll09 to set the sentence tokens for 'word' type.

    (cherry picked from NRC-private branch commit 2bd0c3f9e1c9029494619d2175c187a9605ca635)

commit 745004781aeabbdd1cbdfbed198de473f1706ab2
Author: Jackie Lo <loc982@usr-svc-sci-ctn070.science.gc.ca>
Date:   Tue May 14 18:27:46 2019 -0400

    Bug fix for reading srl parse in conll09 format.

    (cherry picked from NRC-private branch commit 4b8740d2ef1c23390bcef3d0fe4a6acdcc446dfe)

commit 875171ae255dbe5a5ae7ce275e2aed5d13819373
Author: Jackie Lo <loc982@usr-svc-sci-ctn070.science.gc.ca>
Date:   Wed May 8 07:24:43 2019 -0400

    Rewriting confusing progress message in main function.

    (cherry picked from NRC-private branch commit f29066dceba8ec5ac92b33e5142d7075f78aeafc)

commit 783ee3611d90a46503528c21b6b1b3e1efd96f69
Author: Jackie Lo <loc982@usr-svc-sci-ctn070.science.gc.ca>
Date:   Wed May 8 07:15:35 2019 -0400

    Another bug fix in reading conll09 formatted srl.

    (cherry picked from NRC-private branch commit 675e73d35dd231d39cc415a0e7311f56052af981)

commit 7d038c0d2d49b6d96182326d3174899bfa2fef03
Author: Jackie Lo <loc982@usr-svc-sci-ctn070.science.gc.ca>
Date:   Wed May 8 00:01:41 2019 -0400

    Bug fix in reading conll09 srl format.

    (cherry picked from NRC-private branch commit c3119343e2cceda62c37f3aed559c5f3014f7573)

commit fb6e0b69589294f9bda98efd25ee143ba6341a98
Author: Jackie Lo <loc982@usr-svc-sci-ctn070.science.gc.ca>
Date:   Tue May 7 16:00:15 2019 -0400

    Added data structure for sentence.

    (cherry picked from NRC-private branch commit d5cdb883271ef8a30aedb5727613b57fb799821c)

commit efa1ea1014ed88910e42572720cf626423bc1713
Author: Jackie Lo <loc982@usr-svc-sci-ctn070.science.gc.ca>
Date:   Tue May 7 15:59:09 2019 -0400

    Added some handy tools for general w2v embeddings analysis.

    (cherry picked from NRC-private branch commit 7bd1fca5447cfc176da26a48fdebb90f9d602da7)

commit 2d81344c61d7e1dbeac55ab5943715c293c0ddd1
Author: Jackie Lo <loc982@usr-svc-sci-ctn070.science.gc.ca>
Date:   Tue May 7 12:38:12 2019 -0400

    Redesign of data structure for sentences to support additional subword unit view and corresponding subword unit embeddings.

    (cherry picked from NRC-private branch commit 1df49fec9be3196adfbd5ab990dd7a323ec9c7f6)
---
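Reviewer note (placed after the `---` marker, so it is not part of the commit message):
the embedding-based `ngram` overload added to src/phrasesim.h scores two aligned
n-grams as a lexical-weighted average of per-position embedding similarities, giving
one precision-side and one recall-side value. The sketch below restates that
arithmetic in isolation. It is an illustration under assumptions, not the YiSi code:
`cosine` and `ngram_pr` are hypothetical names, the weights are passed in as plain
vectors instead of coming from the lexweight models, and the zero-vector handling in
`cosine` is a guess at reasonable behaviour rather than what `yisi::get_sim` does.

```cpp
// Sketch only: mirrors the accumulation in the embedding-based ngram() overload.
// cosine() and ngram_pr() are hypothetical helpers, not part of YiSi.
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Cosine similarity; returns 0.0 for empty, mismatched, or zero-norm vectors
// (an assumption made for this sketch).
double cosine(const std::vector<double>& a, const std::vector<double>& b) {
   if (a.size() != b.size() || a.empty()) return 0.0;
   double dot = 0.0, na = 0.0, nb = 0.0;
   for (std::size_t i = 0; i < a.size(); i++) {
      dot += a[i] * b[i];
      na += a[i] * a[i];
      nb += b[i] * b[i];
   }
   if (na == 0.0 || nb == 0.0) return 0.0;
   return dot / (std::sqrt(na) * std::sqrt(nb));
}

// Returns (precision, recall) for two n-grams of equal length.
// Precondition: all four vectors have the same, non-zero number of entries
// and each weight vector has a positive sum.
std::pair<double, double> ngram_pr(const std::vector<std::vector<double> >& s1embs,
                                   const std::vector<std::vector<double> >& hypembs,
                                   const std::vector<double>& s1weights,
                                   const std::vector<double>& hypweights) {
   double presult = 0.0, rresult = 0.0, plen = 0.0, rlen = 0.0;
   for (std::size_t i = 0; i < s1embs.size(); i++) {
      double ls = cosine(s1embs[i], hypembs[i]);  // similarity at position i
      rresult += s1weights[i] * ls;               // recall side: s1 (ref/inp) weights
      presult += hypweights[i] * ls;              // precision side: hyp weights
      rlen += s1weights[i];
      plen += hypweights[i];
   }
   return std::make_pair(presult / plen, rresult / rlen);
}
```

Calling `ngram_pr` with one matching and one orthogonal position shows how the two
weight vectors pull precision and recall apart even though the per-position
similarities are shared.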
 src/Makefile                       |    3 +-
 src/emap_test.cpp                  |   74 ++
 src/lexsim.cpp                     |   93 +-
 src/lexsim.h                       |   26 +-
 src/ngram_test.cpp                 |   43 +
 src/oov_test.cpp                   |   39 +
 src/overlapvocab_test.cpp          |   49 ++
 src/phrasesim.h                    |  286 ++++--
 src/sent.cpp                       |  248 ++++++
 src/sent.h                         |   65 ++
 src/srl.cpp                        |    4 +-
 src/srl.h                          |    4 +-
 src/srl_test.cpp                   |   33 +-
 src/srlgraph.cpp                   |   80 +-
 src/srlgraph.h                     |   23 +-
 src/srlgraph_test.cpp              |   19 +-
 src/srlmate.cpp                    |   45 +-
 src/srlmate.h                      |    6 +-
 src/srlmate_test.cpp               |   18 +-
 src/srlutil.cpp                    |  212 +++--
 src/srlutil.h                      |   19 +-
 src/srlutil_test.cpp               |    6 +-
 src/util.cpp                       |   28 +-
 src/util.h                         |   12 +-
 src/yisi.cpp                       |  178 +++-
 src/yisigraph.cpp                  |   33 +-
 src/yisigraph.h                    |  161 ++--
 src/yisiscorer.h                   | 1298 ++++++++++++++--------------
 src/yisiscorer_test.cpp            |   12 +-
 test/ref/srlgraph_test.out         |    2 +-
 test/ref/srlutil_test.out          |   20 +-
 test/ref/test_hyp.docyisi0         |    2 +-
 test/ref/test_hyp.docyisi1_srl     |    2 +-
 test/ref/test_hyp.docyisi1_srl.alt |    2 +-
 test/ref/test_hyp.docyisi2_srl     |    2 +-
 test/ref/test_hyp.docyisi2_srl.alt |    2 +-
 test/ref/test_hyp.sntyisi0         |   20 +-
 test/ref/test_hyp.sntyisi1_srl     |   20 +-
 test/ref/test_hyp.sntyisi1_srl.alt |   20 +-
 test/ref/test_hyp.sntyisi2_srl     |   16 +-
 test/ref/test_hyp.sntyisi2_srl.alt |   16 +-
 test/ref/test_ref.en.srl           |   20 +-
 test/ref/test_ref.en.srl.alt       |   22 +-
 test/ref/test_yisi_0.out           |    6 +-
 test/ref/test_yisi_1.out           |    6 +-
 test/ref/test_yisi_1_srl.out       |    6 +-
 test/ref/test_yisi_2.out           |    6 +-
 test/ref/test_yisi_2_srl.out       |    6 +-
 48 files changed, 2112 insertions(+), 1201 deletions(-)
 create mode 100644 src/emap_test.cpp
 create mode 100644 src/ngram_test.cpp
 create mode 100644 src/oov_test.cpp
 create mode 100644 src/overlapvocab_test.cpp
 create mode 100644 src/sent.cpp
 create mode 100644 src/sent.h

diff --git a/src/Makefile b/src/Makefile
index 4c2aff4..030ef51 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -12,7 +12,7 @@
 # Override the value of MATEPLUS_HOME with a command line definition, or
 # consider defining MATEPLUS_HOME in your .profile, for example:
 #   export MATEPLUS_HOME=~/u/sandboxes/mateplus
-MATEPLUS_HOME ?= ~/u/tools/MATE/mateplus-master/src
+MATEPLUS_HOME ?= /home/loc982/u/tools/mateplus
 
 MATETOOLS_HOME ?= $(MATEPLUS_HOME)
 
@@ -47,6 +47,7 @@ LIBRARIES += -Wl,-Bstatic -lcmdlp -Wl,-Bdynamic
 PROG_NAMES := yisi
 TEST_NAMES := srlgraph_test maxmatching_test lexsim_test w2v_test biw2v_test \
 	      lexweight_test phrasesim_test srl_test srlutil_test util_test \
+	      emap_test oov_test ngram_test overlapvocab_test \
 	      yisiscorer_test testbin
 CMDLP_TEST_NAMES := cmdlp_test
 
diff --git a/src/emap_test.cpp b/src/emap_test.cpp
new file mode 100644
index 0000000..300b833
--- /dev/null
+++ b/src/emap_test.cpp
@@ -0,0 +1,74 @@
+/**
+ * @file emap_test.cpp
+ * @brief Tool that maps each input-document token to the hypothesis-side word with the highest embedding dot product.
+ *
+ * @author Jackie Lo
+ *
+ * Multilingual Text Processing / Traitement multilingue de textes
+ * Digital Technologies Research Centre / Centre de recherche en technologies numériques
+ * National Research Council Canada / Conseil national de recherches Canada
+ * Copyright 2019, Her Majesty in Right of Canada /
+ * Copyright 2019, Sa Majeste la Reine du Chef du Canada
+ */
+
+#include "lexsim.h"
+
+#include <iostream>
+#include <set>
+
+using namespace std;
+using namespace yisi;
+
+int main(int argc, char* argv[])
+{
+   string inpembpath = argv[1];
+   string hypembpath = argv[2];
+   string inpmappath = argv[3];
+   string inpdocpath = argv[4];
+   ofstream INPMAP;
+   ofstream INPDOC;
+   open_ofstream(INPMAP, inpmappath);
+
+   map<string, vector<double> > inpemb;
+   map<string, vector<double> > hypemb;
+   map<string, vector<double> > inpfilemb;
+   int dim;
+   read_binw2v(inpembpath, inpemb, dim);
+   read_binw2v(hypembpath, hypemb, dim);
+   vector<string> inpsents = read_file(inpdocpath);
+   //filter the inp emb according to the inp doc
+   set<string> tokens;
+   for (auto it = inpsents.begin(); it != inpsents.end(); it++) {
+      auto sent = tokenize(*it);
+      tokens.insert(sent.begin(), sent.end());
+   }
+   for (auto it = tokens.begin(); it != tokens.end(); it++) {
+      auto jt = inpemb.find(*it);
+      if (jt != inpemb.end()) {
+         inpfilemb[*it] = jt->second;
+      }
+   }
+
+   string maxsim_str;
+   double maxsim_scr=0.0;
+   for (auto it = inpfilemb.begin(); it != inpfilemb.end(); it++) {
+      auto inp_s = it->first;
+      auto inp_v = it->second;
+      for (auto jt = hypemb.begin(); jt != hypemb.end(); jt++) {
+         auto hyp_s = jt->first;
+         auto hyp_v = jt->second;
+         double sim = 0.0;
+         for (int i = 0; i < dim; i++) {
+            sim += inp_v[i] * hyp_v[i];
+         }
+         if (sim > maxsim_scr) {
+            maxsim_str = hyp_s;
+            maxsim_scr = sim;
+         }
+      }
+      INPMAP << inp_s << "\t" << maxsim_str << endl;
+      maxsim_scr = 0.0;
+   }
+   return 0;
+}
+
diff --git a/src/lexsim.cpp b/src/lexsim.cpp
index b8f69b0..f04e79e 100644
--- a/src/lexsim.cpp
+++ b/src/lexsim.cpp
@@ -45,15 +45,15 @@ double lexsimexact_t::get_sim(string ref, string hyp, int mode) {
 double lexsimlcs_t::get_sim(string ref, string hyp, int mode) {
    if (mode == yisi::INP_MODE) {
       cerr << "ERROR: longest common subsequence lex sim model is not defined "
-           << "in crosslingual settings. Exiting..." << endl;
+         << "in crosslingual settings. Exiting..." << endl;
       exit(1);
    }
    double lcs_n = 0.0;
    size_t ref_n = ref.length();
    size_t hyp_n = hyp.length();
    // find the length of the longest common character subsequence
-   for (size_t i = 0; i < ref_n - lcs_n - 1; i++) {
-      //cerr << "Current ref pos: " << i << endl;
+   for (size_t i = 0; i < ref_n - lcs_n; i++) {
+      // cerr << "Current ref pos: " << i << endl;
       size_t j;
       for (j = lcs_n + 1; j <= ref_n - i; j++) {
          //cerr << "Previous common length: " << lcs_n << endl;
@@ -214,51 +214,51 @@ void lexsimw2v_t::write_txtw2v(std::string path) {
    cerr << "Done." << endl;
 }
 
-lexsimemapw2v_t::lexsimemapw2v_t(string emap_path, string outw2v_path)
-: lexsimw2v_t(outw2v_path) {
-  cerr << "Reading emap model from " << emap_path << endl;
-  ifstream EMAP(emap_path.c_str());
-  if (!EMAP) {
-    cerr << "ERROR: fail to open ibm model. Exiting..." << endl;
-    exit(1);
-  }
-  while (!EMAP.eof()) {
-    string inp;
-    string hyp;
-    EMAP >> inp >> hyp;
-    emap_m[inp]=hyp;
-  }
-  EMAP.close();
-  cerr << "Finished reading." << endl;
+lexsimemapw2v_t::lexsimemapw2v_t(string emap_path, string outw2v_path) :
+   lexsimw2v_t(outw2v_path) {
+   cerr << "Reading emap model from " << emap_path << endl;
+   ifstream EMAP(emap_path.c_str());
+   if (!EMAP) {
+      cerr << "ERROR: fail to open emap model. Exiting..." << endl;
+      exit(1);
+   }
+   string inp;
+   string hyp;
+   // Loop on the read itself so a failed read at end of file adds no empty entry.
+   while (EMAP >> inp >> hyp) {
+      emap_m[inp] = hyp;
+   }
+   EMAP.close();
+   cerr << "Finished reading." << endl;
 }
 
 vector<double>& lexsimemapw2v_t::get_wv(string word, int mode) {
-  if (mode == yisi::INP_MODE){
-    if (emap_m.find(word) != emap_m.end()) {
-      word = emap_m[word];
-    } else if (emap_m.find(lowercase(word)) != emap_m.end()){
-      word = emap_m[lowercase(word)];
-    }
-  }
-  return yisi::get_wv(outembeddings_m, word);
+   if (mode == yisi::INP_MODE) {
+      if (emap_m.find(word) != emap_m.end()) {
+         word = emap_m[word];
+      } else if (emap_m.find(lowercase(word)) != emap_m.end()) {
+         word = emap_m[lowercase(word)];
+      }
+   }
+   return yisi::get_wv(outembeddings_m, word);
 }
 
 double lexsimemapw2v_t::get_sim(string s1, string hyp, int mode) {
-  if (lowercase(s1) == lowercase(hyp)){
-    return 1.0;
-  } else {
-    double result = this->get_sim(this->get_wv(s1, mode), this->get_wv(hyp, yisi::HYP_MODE));
-    //cerr << "(" << s1 << "," << hyp << "," << mode << "," << result << ")" << endl;
-    return result;
-  }
+   if (lowercase(s1) == lowercase(hyp)) {
+      return 1.0;
+   } else {
+      double result = this->get_sim(this->get_wv(s1, mode), this->get_wv(hyp, yisi::HYP_MODE));
+      //cerr << "(" << s1 << "," << hyp << "," << mode << "," << result << ")" << endl;
+      return result;
+   }
 }
 
 double lexsimemapw2v_t::get_sim(vector<double>& s1, vector<double>& hyp) {
-  if ((int)s1.size() == dimension_m && (int)hyp.size() == dimension_m) {
-    return yisi::get_sim(s1, hyp, func_m);
-  } else {
-    return 0.0;
-  }
+   if ((int)s1.size() == dimension_m && (int)hyp.size() == dimension_m) {
+      return yisi::get_sim(s1, hyp, func_m);
+   } else {
+      return 0.0;
+   }
 }
 
 lexsimbiw2v_t::lexsimbiw2v_t(string inpw2v_path, string outw2v_path)
@@ -307,6 +307,15 @@ double lexsimbiw2v_t::get_sim(vector<double>& s1, vector<double>& hyp) {
    }
 }
 
+double lexsimemb_t::get_sim(string s1, string hyp, int mode){
+  cerr <<"ERROR: lexsim model is a contextual embedding model, cannot compute lexsim without providing the embedding. Exiting..." << endl;
+  exit(1);
+}
+
+double lexsimemb_t::get_sim(vector<double>& s1, vector<double>& hyp){
+  return yisi::get_sim(s1, hyp, func_m);
+}
+
 lexsim_t::lexsim_t() {
    lexsim_p = new lexsimexact_t();
 }
@@ -324,6 +333,8 @@ lexsim_t::lexsim_t(string name, string out_path, string inp_path) {
      lexsim_p = new lexsimbiw2v_t(inp_path, out_path);
    } else if (name == "lcs") {
      lexsim_p = new lexsimlcs_t();
+   } else if (name == "emb"){
+     lexsim_p = new lexsimemb_t();
    } else {
      cerr << "ERROR: Unknown lexsim model type " << name << endl;
    }
@@ -345,6 +356,8 @@ lexsim_t::lexsim_t(lexsim_t& rhs) {
       lexsim_p = new lexsimbiw2v_t(rhs.inplexsim_path_m, rhs.outlexsim_path_m);
    } else if (rhs.lexsim_name_m == "lcs") {
       lexsim_p = new lexsimlcs_t();
+   } else if (rhs.lexsim_name_m == "emb") {
+     lexsim_p = new lexsimemb_t();
    }
    lexsim_name_m = rhs.lexsim_name_m;
    outlexsim_path_m = rhs.outlexsim_path_m;
@@ -404,7 +417,7 @@ void yisi::read_binw2v(string path, map<string, vector<double> >& model, int& di
    long long d = 0;
    char tmp;
 
-   cerr << "Reading w2v model from " << path << endl;
+   cerr << "Reading w2v binary model from " << path << endl;
    ifstream W2V(path.c_str(), ios::in | ios::binary);
    if (!W2V) {
       cerr << "ERROR: Failed to open w2v model. Exiting..." << endl;
diff --git a/src/lexsim.h b/src/lexsim.h
index f4ddb98..6362474 100644
--- a/src/lexsim.h
+++ b/src/lexsim.h
@@ -97,16 +97,28 @@ namespace yisi {
       int dimension_m;
    }; // class lexsimw2v_t
 
+   class lexsimemb_t:public lexsimmodel_t {
+   public:
+      lexsimemb_t() {
+         func_m = "cosine";
+      }
+      virtual ~lexsimemb_t() {}
+      virtual double get_sim(std::string ref, std::string hyp, int mode);
+      virtual double get_sim(std::vector<double>& ref, std::vector<double>& hyp);
+   protected:
+      std::string func_m;
+   }; // class lexsimemb_t
+
    class lexsimemapw2v_t:public lexsimw2v_t {
    public:
-     lexsimemapw2v_t() {}
-     lexsimemapw2v_t(std::string emap_path, std::string outw2v_func);
-     virtual ~lexsimemapw2v_t() {}
-     std::vector<double>& get_wv(std::string word, int mode);
-     virtual double get_sim(std::string s1, std::string hyp, int mode);
-     virtual double get_sim(std::vector<double>& s1, std::vector<double>& hyp);
+      lexsimemapw2v_t() {}
+      lexsimemapw2v_t(std::string emap_path, std::string outw2v_func);
+      virtual ~lexsimemapw2v_t() {}
+      std::vector<double>& get_wv(std::string word, int mode);
+      virtual double get_sim(std::string s1, std::string hyp, int mode);
+      virtual double get_sim(std::vector<double>& s1, std::vector<double>& hyp);
    private:
-     std::map<std::string, std::string> emap_m;
+      std::map<std::string, std::string> emap_m;
    }; // class lexsimemapw2v_t
 
 //   class lexsimibm_t:public lexsimmodel_t {
diff --git a/src/ngram_test.cpp b/src/ngram_test.cpp
new file mode 100644
index 0000000..7a4b901
--- /dev/null
+++ b/src/ngram_test.cpp
@@ -0,0 +1,43 @@
+/**
+ * @file ngram_test.cpp
+ * @brief Tool that prints the unique n-grams read from stdin (n given as argv[1]).
+ *
+ * @author Jackie Lo
+ *
+ * Multilingual Text Processing / Traitement multilingue de textes
+ * Digital Technologies Research Centre / Centre de recherche en technologies numériques
+ * National Research Council Canada / Conseil national de recherches Canada
+ * Copyright 2019, Her Majesty in Right of Canada /
+ * Copyright 2019, Sa Majeste la Reine du Chef du Canada
+ */
+
+#include "util.h"
+
+#include <iostream>
+#include <vector>
+#include <set>
+#include <string>
+
+using namespace std;
+using namespace yisi;
+
+int main(int argc, char* argv[])
+{
+   set<string> result;
+   string line;
+   // Read one sentence per line; `cin >> line` would split on every space.
+   while (getline(cin, line)) {
+      auto tokens = tokenize(line);
+      auto ngrams = collect_ngram(atoi(argv[1]), tokens);
+      for (auto it = ngrams.begin(); it != ngrams.end(); it++) {
+         auto ngram = join(*it);
+         result.insert(ngram);
+      }
+   }
+   for (auto it = result.begin(); it != result.end(); it++) {
+      cout << *it << endl;
+   }
+
+   return 0;
+}
+
diff --git a/src/oov_test.cpp b/src/oov_test.cpp
new file mode 100644
index 0000000..0c5c32c
--- /dev/null
+++ b/src/oov_test.cpp
@@ -0,0 +1,39 @@
+/**
+ * @file oov_test.cpp
+ * @brief Tool that prints each stdin token with no vector in the w2v model given as argv[1].
+ *
+ * @author Jackie Lo
+ *
+ * Multilingual Text Processing / Traitement multilingue de textes
+ * Digital Technologies Research Centre / Centre de recherche en technologies numériques
+ * National Research Council Canada / Conseil national de recherches Canada
+ * Copyright 2019, Her Majesty in Right of Canada /
+ * Copyright 2019, Sa Majeste la Reine du Chef du Canada
+ */
+
+#include "lexsim.h"
+
+#include <iostream>
+
+using namespace std;
+using namespace yisi;
+
+int main(int argc, char* argv[])
+{
+   lexsim_t w2vtxt("w2v", argv[1], "cosine");
+   string sent;
+
+   // Loop on the read itself; testing eof() first can run one extra pass at end of input.
+   while (getline(cin, sent)) {
+      //cout << sent << endl;
+      auto tokens = tokenize(sent);
+      for (auto it = tokens.begin(); it != tokens.end(); it++){
+         if ((w2vtxt.get_wv(*it,HYP_MODE)).size() == 0){
+            cout << *it << endl;
+         }
+      }
+   }
+
+   return 0;
+}
+
diff --git a/src/overlapvocab_test.cpp b/src/overlapvocab_test.cpp
new file mode 100644
index 0000000..b675406
--- /dev/null
+++ b/src/overlapvocab_test.cpp
@@ -0,0 +1,49 @@
+/**
+ * @file overlapvocab_test.cpp
+ * @brief Tool that prints the words shared by two binary w2v vocabularies.
+ *
+ * @author Jackie Lo
+ *
+ * Multilingual Text Processing / Traitement multilingue de textes
+ * Digital Technologies Research Centre / Centre de recherche en technologies numériques
+ * National Research Council Canada / Conseil national de recherches Canada
+ * Copyright 2019, Her Majesty in Right of Canada /
+ * Copyright 2019, Sa Majeste la Reine du Chef du Canada
+ */
+
+#include "lexsim.h"
+
+#include <iostream>
+
+using namespace std;
+using namespace yisi;
+
+int main(int argc, char* argv[])
+{
+   string inpembpath = argv[1];
+   string hypembpath = argv[2];
+   map<string, vector<double> > inpemb;
+   map<string, vector<double> > hypemb;
+   int dim;
+   read_binw2v(inpembpath, inpemb, dim);
+   read_binw2v(hypembpath, hypemb, dim);
+
+   auto it = inpemb.begin();
+   auto jt = hypemb.begin();
+   while (it != inpemb.end() && jt != hypemb.end()) {
+      string inp = it->first;
+      string hyp = jt->first;
+      if (inp == hyp) {
+         cout << inp << " " << hyp << endl;
+         it++;
+         jt++;
+      } else if (inp.compare(hyp) < 0) {
+         it++;
+      } else {
+         jt++;
+      }
+   }
+
+   return 0;
+}
+
diff --git a/src/phrasesim.h b/src/phrasesim.h
index a092c14..635a8ca 100644
--- a/src/phrasesim.h
+++ b/src/phrasesim.h
@@ -51,68 +51,68 @@ namespace yisi {
          using namespace com::masaers::cmdlp;
 
          p.add(make_knob(lexsim_name_m))
-            .fallback("exact")
-            .desc("Type of lex sim model: [exact(default)|ibm1|w2v|ibmw2v]")
-            .name("lexsim-type")
-            ;
+               .fallback("exact")
+               .desc("Type of lex sim model: [exact(default)|ibm1|w2v|ibmw2v]")
+               .name("lexsim-type")
+               ;
          p.add(make_knob(outlexsim_path_m))
-            .fallback("")
-            .desc("Path to lex sim model file in output language")
-            .name("outlexsim-path")
-            ;
+               .fallback("")
+               .desc("Path to lex sim model file in output language")
+               .name("outlexsim-path")
+               ;
          p.add(make_knob(inplexsim_path_m))
-            .fallback("")
-            .desc("Path to lex sim model file in input language")
-            .name("inplexsim-path")
-            ;
+               .fallback("")
+               .desc("Path to lex sim model file in input language")
+               .name("inplexsim-path")
+               ;
          p.add(make_knob(inplexweight_name_m))
-            .fallback("uniform")
-            .desc("Type of input lex weight model: [uniform(default)|file|learn]")
-            .name("inplexweight-type")
-            ;
+               .fallback("uniform")
+               .desc("Type of input lex weight model: [uniform(default)|file|learn]")
+               .name("inplexweight-type")
+               ;
          p.add(make_knob(inplexweight_path_m))
-            .fallback("")
-            .desc("[file: path to input lex weight model file "
-                  "| learn: monolingual corpus in input language to learn]")
-            .name("inplexweight-path")
-            ;
+               .fallback("")
+               .desc("[file: path to input lex weight model file "
+                     "| learn: monolingual corpus in input language to learn]")
+               .name("inplexweight-path")
+               ;
          p.add(make_knob(reflexweight_name_m))
-            .fallback("uniform")
-            .desc("Type of reference lex weight model: [uniform(default)|file|learn]")
-            .name("lexweight-type")
-            .name("reflexweight-type")
-            ;
+               .fallback("uniform")
+               .desc("Type of reference lex weight model: [uniform(default)|file|learn]")
+               .name("lexweight-type")
+               .name("reflexweight-type")
+               ;
          p.add(make_knob(reflexweight_path_m))
-            .fallback("")
-            .desc("[file: path to reference lex weight model file "
-                  "| learn: monolingual corpus in reference language to learn]")
-            .name("lexweight-path")
-            .name("reflexweight-path")
-            ;
+               .fallback("")
+               .desc("[file: path to reference lex weight model file "
+                     "| learn: monolingual corpus in reference language to learn]")
+               .name("lexweight-path")
+               .name("reflexweight-path")
+               ;
          p.add(make_knob(hyplexweight_name_m))
-            .fallback("")
-            .desc("Type of hypotheses lex weight model: [uniform|file|learn] "
-                  "(default: same as reflexweight-type")
-            .name("hyplexweight-type")
-            ;
+               .fallback("")
+               .desc("Type of hypotheses lex weight model: [uniform|file|learn] "
+                     "(default: same as reflexweight-type)")
+               .name("hyplexweight-type")
+               ;
          p.add(make_knob(hyplexweight_path_m))
-            .fallback("")
-            .desc("[file: path to hypotheses lex weight model file "
-                  "| learn: monolingual corpus in hypothesis language to learn]")
-            .name("hyplexweight-path")
-            ;
+               .fallback("")
+               .desc("[file: path to hypotheses lex weight model file "
+                     "| learn: monolingual corpus in hypothesis language to learn]")
+               .name("hyplexweight-path")
+               ;
          p.add(make_knob(phrasesim_name_m))
-            .fallback("nwpr")
-            .desc("Type of phrase sim model: [nwpf: n-gram idf-weighted precision/recall]")
-            .name("psname")
-            .name("phrasesim-type")
-            ;
+               .fallback("nwpr")
+               .desc("Type of phrase sim model: [nwpr: n-gram idf-weighted precision/recall]")
+               .name("psname")
+               .name("phrasesim-type")
+               ;
          p.add(make_knob(n_m))
-            .fallback(0)
-            .desc("N-gram size")
-            .name("ngram-size")
-            .name("n")
-            ;
+               .fallback(0)
+               .desc("N-gram size")
+               .name("ngram-size")
+               .name("n")
+               ;
       }
    }; // struct phrasesim_options
 
@@ -245,7 +245,56 @@ namespace yisi {
          } else {
             mpscache_m[s1txt][hyptxt] = s;
          }
-         //std::cerr << "(" << s1txt << " ||| " << hyptxt << " ||| " << s.first << "," << s.second << ")" << std::endl;
+         return s;
+      };
+
+      std::pair<double, double> operator()(std::vector<std::string> s1tokens,
+                                           std::vector<std::string>& hyptokens,
+                                           std::vector<std::vector<double> > s1embs,
+                                           std::vector<std::vector<double> > hypembs, int mode) {
+         std::pair<double, double> result;
+         if (s1tokens.size() == 0 || hyptokens.size() == 0) {
+            result = std::make_pair(0.0, 0.0);
+            return result;
+         }
+         std::string s1txt;
+         size_t i;
+         for (i = 0; i < s1tokens.size() - 1; i++) {
+            s1txt = s1txt + s1tokens[i] + " ";
+         }
+         s1txt = s1txt + s1tokens[i];
+         std::string hyptxt;
+         size_t j;
+         for (j = 0; j < hyptokens.size() - 1; j++) {
+            hyptxt = hyptxt + hyptokens[j] + " ";
+         }
+         hyptxt = hyptxt + hyptokens[j];
+
+         if (mode == yisi::INP_MODE) {
+            if (xpscache_m.find(s1txt) != xpscache_m.end()) {
+               if (xpscache_m[s1txt].find(hyptxt) != xpscache_m[s1txt].end()) {
+                  return xpscache_m[s1txt][hyptxt];
+               }
+            } else {
+               std::map<std::string, std::pair<double, double> > c;
+               xpscache_m[s1txt] = c;
+            }
+         } else {
+            if (mpscache_m.find(s1txt) != mpscache_m.end()) {
+               if (mpscache_m[s1txt].find(hyptxt) != mpscache_m[s1txt].end()) {
+                  return mpscache_m[s1txt][hyptxt];
+               }
+            } else {
+               std::map<std::string, std::pair<double, double> > c;
+               mpscache_m[s1txt] = c;
+            }
+         }
+         auto s = nwpr(s1tokens, hyptokens, s1embs, hypembs, mode);
+         if (mode == yisi::INP_MODE) {
+            xpscache_m[s1txt][hyptxt] = s;
+         } else {
+            mpscache_m[s1txt][hyptxt] = s;
+         }
          return s;
       };
 
@@ -255,7 +304,7 @@ namespace yisi {
          //std::cerr<<"ng: " << hyptokens.size()<<std::endl;
          if (s1tokens.size() != hyptokens.size()) {
             std::cerr << "ERROR: Failed to compute n-gram similarity - "
-                      << "s1 n-gram size != hyp n-gram size. Exiting..." << std::endl;
+               << "s1 n-gram size != hyp n-gram size. Exiting..." << std::endl;
             exit(1);
          }
          double presult = 0.0;
@@ -272,7 +321,7 @@ namespace yisi {
                rw = (*reflexweight_p)(s1tokens[i]);
             }
             pw = (*hyplexweight_p)(hyptokens[i]);
-            //std::cerr << s1tokens[i] << " ||| " << hyptokens[i];
+            //std::cerr << s1tokens[i] << " ||| " << hyptokens[i] <<" ||| "<<rw <<" ||| " <<pw << " ||| ";
             ls = lexsim_p->get_sim(s1tokens[i], hyptokens[i], mode);
             //std::cerr << ls << std::endl;
             rresult += rw * ls;
@@ -280,6 +329,45 @@ namespace yisi {
             rlen += rw;
             plen += pw;
          }
+         //std::cerr<<"(" << presult / plen<<","<<rresult / rlen<< ")"<<std::endl;
+         std::pair<double, double> result = std::make_pair(presult / plen, rresult / rlen);
+         return result;
+      }
+
+      std::pair<double, double> ngram(std::vector<std::string>& s1tokens,
+                                      std::vector<std::string>& hyptokens,
+                                      std::vector<std::vector<double> > s1embs,
+                                      std::vector<std::vector<double> > hypembs,
+                                      int mode) {
+         //std::cerr<<"ng: " << s1tokens.size()<<std::endl;
+         //std::cerr<<"ng: " << hyptokens.size()<<std::endl;
+         if (s1tokens.size() != hyptokens.size()) {
+            std::cerr << "ERROR: Failed to compute n-gram similarity - "
+               << "s1 n-gram size != hyp n-gram size. Exiting..." << std::endl;
+            exit(1);
+         }
+         double presult = 0.0;
+         double rresult = 0.0;
+         double plen = 0.0;
+         double rlen = 0.0;
+         for (size_t i = 0; i < s1tokens.size(); i++) {
+            double rw = 0.0;
+            double pw = 0.0;
+            double ls = 0.0;
+            if (mode == yisi::INP_MODE) {
+               rw = (*inplexweight_p)(s1tokens[i]);
+            } else if (mode == yisi::REF_MODE) {
+               rw = (*reflexweight_p)(s1tokens[i]);
+            }
+            pw = (*hyplexweight_p)(hyptokens[i]);
+            //std::cerr << s1tokens[i] << " ||| " << hyptokens[i];
+            ls = lexsim_p->get_sim(s1embs[i], hypembs[i]);
+            //std::cerr << ls << std::endl;
+            rresult += rw * ls;
+            presult += pw * ls;
+            rlen += rw;
+            plen += pw;
+         }
          std::pair<double, double> result = std::make_pair(presult / plen, rresult / rlen);
          return result;
       }
@@ -300,71 +388,93 @@ namespace yisi {
 
       std::pair<double, double> nwpr(std::vector<std::string>& s1tokens,
                                      std::vector<std::string>& hyptokens, int mode) {
-         std::string s1txt = yisi::join(s1tokens);
-         std::string hyptxt = yisi::join(hyptokens);
-         //std::cerr << s1txt << std::endl;
-         //std::cerr<<hyptxt<<std::endl;
-
          std::vector<std::vector<std::string> > s1ngrams;
          std::vector<std::vector<std::string> > hypngrams;
 
          if ((int)s1tokens.size() < n_m || (int)hyptokens.size() < n_m) {
-	   s1ngrams = yisi::collect_ngram(std::min(s1tokens.size(), hyptokens.size()), s1tokens);
-           hypngrams = yisi::collect_ngram(std::min(s1tokens.size(), hyptokens.size()), hyptokens);
+            s1ngrams = yisi::collect_ngram(std::min(s1tokens.size(), hyptokens.size()), s1tokens);
+            hypngrams = yisi::collect_ngram(std::min(s1tokens.size(), hyptokens.size()), hyptokens);
          } else {
-           s1ngrams = yisi::collect_ngram(n_m, s1tokens);
-           hypngrams = yisi::collect_ngram(n_m, hyptokens);
+            s1ngrams = yisi::collect_ngram(n_m, s1tokens);
+            hypngrams = yisi::collect_ngram(n_m, hyptokens);
          }
-         //std::cerr << s1ngrams.size() << std::endl;
-         //std::cerr << hypngrams.size()<<std::endl;
          double nom = 0.0;
          double denom = 0.0;
 
          for (size_t ii = 0; ii < s1ngrams.size(); ii++) {
-            std::string s1ngtxt = yisi::join(s1ngrams[ii]);
-
             double sim = 0.0;
             double rw = ngramlw(s1ngrams[ii], mode);
 
             for (size_t jj = 0; jj < hypngrams.size(); jj++) {
-               std::string hypngtxt = yisi::join(hypngrams[jj]);
-               //std::cerr << "ng sim of " << s1ngtxt << "," << hypngtxt << std::endl;
                sim = std::fmax(sim, ngram(s1ngrams[ii], hypngrams[jj], mode).second);
             }
             nom += rw * sim;
             denom += rw;
          }
          double recall = nom / denom;
-         //std::cerr << "nnwr: " << recall << std::endl;
-         //if (a >= 1) {
-         //   return recall;
-         //}
-
          nom = 0.0;
          denom = 0.0;
-         //std::cerr<<hypngrams.size()<<std::endl;
-         //std::cerr<<refngrams.size()<<std::endl;
          for (size_t iii = 0; iii < hypngrams.size(); iii++) {
             double hs = 0.0;
             double hw = ngramlw(hypngrams[iii], yisi::HYP_MODE);
-            //std::cerr <<"here1"<<std::endl;
             for (size_t jjj = 0; jjj < s1ngrams.size(); jjj++) {
-               //std::cerr<<"here2"<<std::endl;
                hs = std::fmax(hs, ngram(s1ngrams[jjj], hypngrams[iii], mode).first);
             }
             nom += hw * hs;
             denom += hw;
          }
          double precision = nom / denom;
-         //std::cerr << "nnwp: " << precision << std::endl;
-         //if (a<=0){
-         //   return precision;
-         //}
-         //if ((a*precision+(1-a)*recall) > 0.0){
-         //   return (precision*recall)/(a*precision+(1-a)*recall);
-         //} else {
-         //   return 0.0;
-         //}
+         std::pair<double, double> result = std::make_pair(precision, recall);
+         return result;
+      }
+
+      std::pair<double, double> nwpr(std::vector<std::string>& s1tokens,
+                                     std::vector<std::string>& hyptokens,
+                                     std::vector<std::vector<double> > s1embs,
+                                     std::vector<std::vector<double> > hypembs,
+                                     int mode) {
+         std::vector<std::vector<std::string> > s1ngrams;
+         std::vector<std::vector<std::string> > hypngrams;
+         std::vector<std::vector<std::vector<double> > > s1embngrams;
+         std::vector<std::vector<std::vector<double> > > hypembngrams;
+
+         if ((int)s1tokens.size() < n_m || (int)hyptokens.size() < n_m) {
+            s1ngrams = yisi::collect_ngram(std::min(s1tokens.size(), hyptokens.size()), s1tokens);
+            hypngrams = yisi::collect_ngram(std::min(s1tokens.size(), hyptokens.size()), hyptokens);
+            s1embngrams = yisi::collect_ngram(std::min(s1tokens.size(), hyptokens.size()), s1embs);
+            hypembngrams = yisi::collect_ngram(std::min(s1tokens.size(), hyptokens.size()), hypembs);
+         } else {
+            s1ngrams = yisi::collect_ngram(n_m, s1tokens);
+            hypngrams = yisi::collect_ngram(n_m, hyptokens);
+            s1embngrams = yisi::collect_ngram(n_m, s1embs);
+            hypembngrams = yisi::collect_ngram(n_m, hypembs);
+         }
+         double nom = 0.0;
+         double denom = 0.0;
+
+         for (size_t ii = 0; ii < s1ngrams.size(); ii++) {
+            double sim = 0.0;
+            double rw = ngramlw(s1ngrams[ii], mode);
+
+            for (size_t jj = 0; jj < hypngrams.size(); jj++) {
+               sim = std::fmax(sim, ngram(s1ngrams[ii], hypngrams[jj], s1embngrams[ii], hypembngrams[jj], mode).second);
+            }
+            nom += rw * sim;
+            denom += rw;
+         }
+         double recall = nom / denom;
+         nom = 0.0;
+         denom = 0.0;
+         for (size_t iii = 0; iii < hypngrams.size(); iii++) {
+            double hs = 0.0;
+            double hw = ngramlw(hypngrams[iii], yisi::HYP_MODE);
+            for (size_t jjj = 0; jjj < s1ngrams.size(); jjj++) {
+               hs = std::fmax(hs, ngram(s1ngrams[jjj], hypngrams[iii], s1embngrams[jjj], hypembngrams[iii], mode).first);
+            }
+            nom += hw * hs;
+            denom += hw;
+         }
+         double precision = nom / denom;
          std::pair<double, double> result = std::make_pair(precision, recall);
          return result;
       }
diff --git a/src/sent.cpp b/src/sent.cpp
new file mode 100644
index 0000000..93ddda3
--- /dev/null
+++ b/src/sent.cpp
@@ -0,0 +1,248 @@
+/**
+ * @file sent.cpp
+ * @brief Sentence
+ *
+ * @author Jackie Lo
+ *
+ * Class implementation for the classes:
+ *    - sent_t
+ * and the definitions of some utility functions working on it.
+ *
+ * Multilingual Text Processing / Traitement multilingue de textes
+ * Digital Technologies Research Centre / Centre de recherche en technologies numériques
+ * National Research Council Canada / Conseil national de recherches Canada
+ * Copyright 2019, Her Majesty in Right of Canada /
+ * Copyright 2019, Sa Majeste la Reine du Chef du Canada
+ */
+
+#include "sent.h"
+
+#include <fstream>
+#include <sstream>
+#include <math.h>
+
+using namespace yisi;
+using namespace std;
+
+sent_t::sent_t() {
+   sent_type_m = "word";
+}
+
+sent_t::sent_t(string sent_type) {
+   sent_type_m = sent_type;
+}
+
+sent_t::sent_t(const sent_t& rhs) {
+   sent_type_m = rhs.sent_type_m;
+   token_m = rhs.token_m;
+   unit_m = rhs.unit_m;
+   emb_m = rhs.emb_m;
+   tid2uspan_m = rhs.tid2uspan_m;
+   uid2tid_m = rhs.uid2tid_m;
+}
+
+void sent_t::operator=(const sent_t& rhs) {
+   sent_type_m = rhs.sent_type_m;
+   token_m = rhs.token_m;
+   unit_m = rhs.unit_m;
+   emb_m = rhs.emb_m;
+   tid2uspan_m = rhs.tid2uspan_m;
+   uid2tid_m = rhs.uid2tid_m;
+}
+
+string sent_t::get_type() {
+   return sent_type_m;
+}
+
+vector<string> sent_t::get_tokens(span_type tspan) {
+   vector<string> result;
+   for (size_t i = tspan.first; i < tspan.second; i++) {
+      result.push_back(token_m[i]);
+   }
+   /*
+   cerr << "In get_tokens(" << tspan.first << "," << tspan.second << "): ";
+   for (auto it = result.begin(); it != result.end(); it++) {
+      cerr << *it << " ";
+   }
+   cerr << endl;
+   */
+   return result;
+}
+
+vector<string> sent_t::get_tokens() {
+   // cerr << "In get_tokens(): " << endl;
+   return token_m;
+}
+
+vector<string> sent_t::get_units(span_type uspan) {
+   vector<string> result;
+   if (sent_type_m == "word") {
+      for (size_t i = uspan.first; i < uspan.second; i++) {
+         result.push_back(token_m[i]);
+      }
+   } else {
+      for (size_t i = uspan.first; i < uspan.second; i++) {
+         if (i < unit_m.size()) {
+            result.push_back(unit_m[i]);
+         }
+      }
+   }
+   return result;
+}
+
+vector<vector<double> > sent_t::get_embs(span_type uspan) {
+   if (sent_type_m == "uemb") {
+      vector<vector<double> > result;
+      for (size_t i = uspan.first; i < uspan.second; i++) {
+         result.push_back(emb_m[i]);
+      }
+      return result;
+   } else {
+      cerr << "ERROR: sentence type (" << sent_type_m << ") "
+           << "does not provide contextual embeddings. Exiting..." << endl;
+      exit(1);
+   }
+}
+
+sent_t::span_type sent_t::tspan2uspan(span_type tspan) {
+   if (sent_type_m == "word") {
+      return tspan;
+   } else {
+      //cerr << tid2uspan_m.size();
+      if (tspan.first < tid2uspan_m.size() && (tspan.second-1) < tid2uspan_m.size()) {
+         return span_type(tid2uspan_m[tspan.first].first, tid2uspan_m[tspan.second-1].second);
+      } else {
+         return tspan;
+      }
+   }
+}
+
+sent_t::span_type sent_t::uspan2tspan(span_type uspan) {
+   if (sent_type_m == "word") {
+      return uspan;
+   } else {
+      return span_type(uid2tid_m[uspan.first], uid2tid_m[uspan.second-1] + 1);
+   }
+}
+
+void sent_t::set_tokens(vector<string> t) {
+   token_m = t;
+   /*
+   cerr << "In set_tokens(t): ";
+   for (auto it = token_m.begin(); it != token_m.end(); it++) {
+      cerr << *it << " ";
+   }
+   cerr << endl;
+   */
+}
+
+void sent_t::set_units(vector<string> u ) {
+   unit_m = u;
+}
+
+void sent_t::set_embs(vector<vector<double> > e) {
+   emb_m = e;
+}
+
+void sent_t::set_tid2uspan(vector<span_type> t2u) {
+   tid2uspan_m = t2u;
+}
+
+void sent_t::set_uid2tid(vector<size_t> u2t) {
+   uid2tid_m = u2t;
+}
+
+size_t sent_t::get_token_size() {
+   return token_m.size();
+}
+
+vector<sent_t*> yisi::read_sent(string sent_type, string token_path, string unit_path, string idemb_path) {
+   vector<sent_t*> result;
+   vector<vector<double> > emb;
+   vector<sent_t::span_type> t2u;
+   vector<size_t> u2t;
+   size_t currtid = (size_t)-1;
+
+   //cerr << token_path << " ";
+   auto token_strs = read_file(token_path);
+   if (unit_path == "") {
+      for (auto tt = token_strs.begin(); tt != token_strs.end(); tt++) {
+         sent_t* sent_p = new sent_t(sent_type);
+         sent_p->set_tokens(tokenize(*tt));
+         //cerr << "sentlength=" << sent.get_token_size();
+         result.push_back(sent_p);
+         //cerr << " Done." << endl;
+      }
+
+   } else {
+      // cerr << unit_path << " ";
+      auto unit_strs = read_file(unit_path);
+      auto tt = token_strs.begin();
+      auto ut = unit_strs.begin();
+      //cerr << idemb_path << " ";
+      ifstream fin(idemb_path.c_str());
+      if (!fin) {
+         cerr << "ERROR: Failed to open idemb file (" << idemb_path << "). Exiting..." << endl;
+         exit(1);
+      }
+      while (!fin.eof()) {
+         string line;
+         getline(fin, line);
+         if (line.empty()) {
+            sent_t* s = new sent_t(sent_type);
+            auto tokens = tokenize(*tt);
+            s->set_tokens(tokens);
+            //cerr << "#token=" << tokens.size();
+            auto units = tokenize(*ut);
+            s->set_units(units);
+            //cerr << " #unit=" << units.size();
+            if (sent_type == "uemb") {
+               s->set_embs(emb);
+            }
+            //cerr << " #emb=" << emb.size() << " #dim=" << emb[0].size();
+            s->set_tid2uspan(t2u);
+            //cerr << " #tid=" << t2u.size();
+            s->set_uid2tid(u2t);
+            //cerr << " #uid=" << u2t.size() << " ";
+
+            result.push_back(s);
+            tt++;
+            ut++;
+            emb.clear();
+            t2u.clear();
+            u2t.clear();
+            currtid = (size_t)-1;
+         } else {
+            istringstream iss(line);
+            size_t uid;
+            size_t tid;
+            iss >> uid >> tid;
+            u2t.push_back(tid);
+            if (tid != currtid) {
+               t2u.push_back(sent_t::span_type(uid,uid+1));
+               currtid=tid;
+            } else {
+               t2u.back().second=uid+1;
+            }
+            if (sent_type == "uemb") {
+               vector<double> e;
+               double len = 0.0;
+               double v;
+               // Stop as soon as extraction fails, so trailing whitespace adds no bogus value.
+               while (iss >> v) {
+                  e.push_back(v);
+                  len += v*v;
+               }
+               len = sqrt(len);
+               for (size_t i = 0; i < e.size(); i++) {
+                  e[i] /= len;
+               }
+               emb.push_back(e);
+            }
+         }
+         fin.peek();
+      }
+      fin.close();
+   }
+   return result;
+}
diff --git a/src/sent.h b/src/sent.h
new file mode 100644
index 0000000..c688410
--- /dev/null
+++ b/src/sent.h
@@ -0,0 +1,65 @@
+/**
+ * @file sent.h
+ * @brief Sentence
+ *
+ * @author Jackie Lo
+ *
+ * Class definition of sentence classes:
+ *    - sent_t
+ * and the declaration of some utility functions working on it.
+ *
+ * Multilingual Text Processing / Traitement multilingue de textes
+ * Digital Technologies Research Centre / Centre de recherche en technologies numériques
+ * National Research Council Canada / Conseil national de recherches Canada
+ * Copyright 2019, Her Majesty in Right of Canada /
+ * Copyright 2019, Sa Majeste la Reine du Chef du Canada
+ */
+
+#ifndef SENT_H
+#define SENT_H
+
+#include "util.h"
+
+#include <utility>
+#include <string>
+#include <vector> 
+#include <map>
+#include <iostream>
+
+namespace yisi {
+
+   class sent_t {
+   public:
+      typedef std::pair<size_t, size_t> span_type;
+      sent_t();
+      sent_t(std::string sent_type);
+      sent_t(const sent_t& rhs);
+      void operator=(const sent_t& rhs);
+      ~sent_t() {};
+      std::string get_type();
+      std::vector<std::string> get_tokens(span_type tspan);
+      std::vector<std::string> get_tokens();
+      std::vector<std::string> get_units(span_type uspan);
+      std::vector<std::vector<double> > get_embs(span_type uspan);
+      void set_tokens(std::vector<std::string> t);
+      void set_units(std::vector<std::string> u);
+      void set_embs(std::vector<std::vector<double> > e);
+      void set_tid2uspan(std::vector<span_type> t2u);
+      void set_uid2tid(std::vector<size_t> u2t);
+      span_type tspan2uspan(span_type tspan);
+      span_type uspan2tspan(span_type uspan);
+      size_t get_token_size();
+   private:
+      std::string sent_type_m;
+      std::vector<std::string> token_m;
+      std::vector<std::string> unit_m;
+      std::vector<std::vector<double> > emb_m;
+      std::vector<span_type> tid2uspan_m;
+      std::vector<size_t> uid2tid_m;
+   }; // class sent_t
+
+   std::vector<sent_t*> read_sent(std::string sent_type, std::string token_path, std::string unit_path="", std::string idemb_path="");
+
+} // yisi
+
+#endif
diff --git a/src/srl.cpp b/src/srl.cpp
index a456687..59c54ef 100644
--- a/src/srl.cpp
+++ b/src/srl.cpp
@@ -51,10 +51,10 @@ srl_t::~srl_t() {
    }
 }
 
-srlgraph_t srl_t::parse(string sent) {
+srlgraph_t srl_t::parse(sent_t* sent) {
    return srl_p->parse(sent);
 }
 
-vector<srlgraph_t> srl_t::parse(vector<string> sents) {
+vector<srlgraph_t> srl_t::parse(vector<sent_t*> sents) {
    return srl_p->parse(sents);
 }
diff --git a/src/srl.h b/src/srl.h
index 079b0c0..fff746a 100644
--- a/src/srl.h
+++ b/src/srl.h
@@ -30,8 +30,8 @@ namespace yisi {
       srl_t();
       srl_t(const std::string name, const std::string path="");
       ~srl_t();
-      srlgraph_t parse(std::string sent);
-      std::vector<srlgraph_t> parse(std::vector<std::string> sents);
+      srlgraph_t parse(sent_t* sent);
+      std::vector<srlgraph_t> parse(std::vector<sent_t*> sents);
    private:
       srlmodel_t* srl_p;
    }; // class srl_t
diff --git a/src/srl_test.cpp b/src/srl_test.cpp
index 432ec4d..997e571 100644
--- a/src/srl_test.cpp
+++ b/src/srl_test.cpp
@@ -27,21 +27,7 @@ int main(const int argc, const char* argv[])
    if (argc == 1) {
       srl_t mate("mate", "parse_full_es.sh");
 
-      vector<string> sents;
-
-      ifstream IN("test_es.txt");
-      if (IN.fail() or IN.bad()) {
-         cerr << "ERROR: Failed to open: test_es.txt. Exiting..." << endl;
-         exit(1);
-      }
-
-      while (!IN.eof()) {
-         string line;
-         getline(IN, line);
-         if (line != "") {
-            sents.push_back(line);
-         }
-      }
+      vector<sent_t*> sents = read_sent("word", "test_es.txt");
 
       auto r = mate.parse(sents);
       cout << "Done parsing " << r.size() << " srlgraphs." << endl;
@@ -52,22 +38,7 @@ int main(const int argc, const char* argv[])
 
    } else {
       srl_t parser(argv[1], argv[2]);
-      vector<string> sents;
-
-      ifstream IN(argv[3]);
-      if (IN.fail() or IN.bad()) {
-         cerr << "ERROR: Failed to open:" << argv[3] << ". Exiting..." << endl;
-         exit(1);
-      }
-
-      while (!IN.eof()) {
-         string line;
-         getline(IN, line);
-         if (line != "") {
-            sents.push_back(line);
-         }
-      }
-      IN.close();
+      vector<sent_t*> sents = read_sent("word", string(argv[3]));
 
       auto r = parser.parse(sents);
       cout << "Done parsing " << r.size() << " srlgraphs." << endl;
diff --git a/src/srlgraph.cpp b/src/srlgraph.cpp
index 2ff72fb..b805d86 100644
--- a/src/srlgraph.cpp
+++ b/src/srlgraph.cpp
@@ -27,22 +27,22 @@ using namespace std;
 srlgraph_t::srlgraph_t() {
 }
 
-srlgraph_t::srlgraph_t(vector<string>& tokens) {
-   span_type r(0, tokens.size());
+srlgraph_t::srlgraph_t(sent_t* sent) {
+   span_type r(0, sent->get_token_size());
    root_m = srl_m.new_node(r);
-   tokens_m = tokens;
+   sent_p = sent;
 }
 
 srlgraph_t::srlgraph_t(const srlgraph_t& rhs) {
    srl_m = rhs.srl_m;
-   tokens_m = rhs.tokens_m;
+   sent_p = rhs.sent_p;
    root_m = rhs.root_m;
    predof_m = predof_m;
 }
 
 void srlgraph_t::operator=(const srlgraph_t& rhs) {
    srl_m = rhs.srl_m;
-   tokens_m = rhs.tokens_m;
+   sent_p = rhs.sent_p;
    root_m = rhs.root_m;
    predof_m = predof_m;
 }
@@ -53,10 +53,10 @@ srlgraph_t::srlnid_type srlgraph_t::new_root() {
    return root_m;
 }
 
-srlgraph_t::srlnid_type srlgraph_t::new_root(vector<string>& tokens) {
-   span_type span(0, tokens.size());
+srlgraph_t::srlnid_type srlgraph_t::new_root(sent_t* sent) {
+   span_type span(0, sent->get_token_size());
    root_m = srl_m.new_node(span);
-   tokens_m = tokens;
+   sent_p = sent;
    return root_m;
 }
 
@@ -112,22 +112,30 @@ srlgraph_t::srlnid_type srlgraph_t::get_pred(srlnid_type argid) {
    return predof_m[argid];
 }
 
-vector<string>& srlgraph_t::get_sentence() {
-   return tokens_m;
+
+vector<string> srlgraph_t::get_sentence() {
+   return sent_p->get_tokens();
 }
 
-vector<string> srlgraph_t::get_role_fillers(srlnid_type roleid) {
-   vector<string> fillers;
+vector<string> srlgraph_t::get_role_filler_units(srlnid_type roleid) {
+   //vector<string> fillers;
    span_type span = srl_m.get_node_data(roleid);
+   //cerr<<span.first<<" "<<span.second;
+   /*
+  size_t span_begin = span.first;
+  size_t span_end = span.second;
 
-   size_t span_begin = span.first;
-   size_t span_end = span.second;
-
-   for (size_t i = span_begin; i < span_end; i++) {
-      fillers.push_back(tokens_m.at(i));
-   }
+  for (size_t i = span_begin; i < span_end; i++) {
+    fillers.push_back(tokens_m.at(i));
+  }
+    */
+   return sent_p->get_units(sent_p->tspan2uspan(span));
+   //return fillers;
+}
 
-   return fillers;
+vector<vector<double> > srlgraph_t::get_role_filler_embs(srlnid_type roleid) {
+   span_type span = srl_m.get_node_data(roleid);
+   return sent_p->get_embs(sent_p->tspan2uspan(span));
 }
 
 srlgraph_t::label_type srlgraph_t::get_role_label(srlnid_type roleid) {
@@ -138,10 +146,23 @@ srlgraph_t::span_type srlgraph_t::get_role_span(srlnid_type roleid) {
    return srl_m.get_node_data(roleid);
 }
 
+size_t srlgraph_t::get_sent_length() {
+   return sent_p->get_token_size();
+}
+
+
 void srlgraph_t::set_tokens(vector<string>& tokens) {
+   //cerr<<"Setting new tokens...";
    span_type r(0, tokens.size());
    srl_m.set_node_data(root_m, r);
-   tokens_m = tokens;
+   sent_p->set_tokens(tokens);
+   //cerr << "Done"<<endl;
+}
+
+void srlgraph_t::set_sent(sent_t* sent) {
+   span_type r(0, sent->get_token_size());
+   srl_m.set_node_data(root_m, r);
+   sent_p = sent;
 }
 
 void srlgraph_t::set_role_span(srlnid_type roleid, span_type& span) {
@@ -152,14 +173,17 @@ void srlgraph_t::set_role_label(srlnid_type roleid, label_type& label) {
    srl_m.set_edge_label(srl_m.get_outgoing_edges(roleid).at(0), label);
 } 
 
-
+void srlgraph_t::delete_sent() {
+   // WARNING: call this only when the list of srlgraphs was created by read_conll09batch(parsefile).
+   delete sent_p;
+   sent_p = NULL;
+}
 
 ostream& srlgraph_t::operator<<(ostream& os) {
    vector<srlnid_type> preds = get_preds();
    if (preds.size() > 0) {
-      for (vector<srlnid_type>::iterator it = preds.begin(); it != preds.end();
-         it++) {
-         vector<string> frame_tokens = tokens_m;
+      for (vector<srlnid_type>::iterator it = preds.begin(); it != preds.end(); it++) {
+         vector<string> frame_tokens = sent_p->get_tokens(get_role_span(root_m));
          span_type pred_span = get_role_span(*it);
          if (pred_span.first != pred_span.second) {
             frame_tokens[pred_span.first] = "[" + get_role_label(*it) + " "
@@ -179,7 +203,8 @@ ostream& srlgraph_t::operator<<(ostream& os) {
          }
       }
    } else {
-      for (vector<string>::iterator it = tokens_m.begin(); it != tokens_m.end(); it++) {
+      auto t = sent_p->get_tokens();
+      for (auto it = t.begin(); it != t.end(); it++) {
          os << *it << " ";
       }
       os << endl;
@@ -190,9 +215,8 @@ ostream& srlgraph_t::operator<<(ostream& os) {
 void srlgraph_t::print(ostream& os, int i) {
    vector<srlnid_type> preds = get_preds();
    if (preds.size() > 0) {
-      for (vector<srlnid_type>::iterator it = preds.begin(); it != preds.end();
-         it++) {
-         vector<string> frame_tokens = tokens_m;
+      for (vector<srlnid_type>::iterator it = preds.begin(); it != preds.end(); it++) {
+         vector<string> frame_tokens = sent_p->get_tokens(get_role_span(root_m));
          span_type pred_span = get_role_span(*it);
          if (pred_span.first != pred_span.second) {
             frame_tokens[pred_span.first] = "[" + get_role_label(*it) + " "
diff --git a/src/srlgraph.h b/src/srlgraph.h
index dc7ca46..dffc71c 100644
--- a/src/srlgraph.h
+++ b/src/srlgraph.h
@@ -19,7 +19,7 @@
 #define SRLGRAPH_H
 
 #include "graph.h"
-
+#include "sent.h"
 #include <utility>
 #include <string>
 #include <vector> 
@@ -30,7 +30,7 @@ namespace yisi {
 
    class srlgraph_t {
    public:
-      typedef std::pair<size_t, size_t> span_type;
+      typedef sent_t::span_type span_type;
       typedef std::string label_type;
       typedef graph_t<span_type, label_type>::node_type srlnode_type;
       typedef graph_t<span_type, label_type>::edge_type srledge_type;
@@ -39,12 +39,12 @@ namespace yisi {
 
 
       srlgraph_t();
-      srlgraph_t(std::vector<std::string>& tokens);
+      srlgraph_t(sent_t* sent);
       srlgraph_t(const srlgraph_t& rhs);
       void operator=(const srlgraph_t& rhs);
 
       srlnid_type new_root();
-      srlnid_type new_root(std::vector<std::string>& tokens);
+      srlnid_type new_root(sent_t* sent);
       srlnid_type new_pred();
       srlnid_type new_pred(span_type& span, label_type& label);
       srlnid_type new_arg(srlnid_type predid);
@@ -56,26 +56,33 @@ namespace yisi {
 
       srlnid_type get_pred(srlnid_type argid);
 
-      std::vector<std::string>& get_sentence();
-      std::vector<std::string> get_role_fillers(srlnid_type roleid);
+      std::vector<std::string> get_sentence();
+      std::vector<std::string> get_role_filler_units(srlnid_type roleid);
+      std::vector<std::vector<double> > get_role_filler_embs(srlnid_type roleid);
 
       label_type get_role_label(srlnid_type roleid);
       span_type get_role_span(srlnid_type roleid);
+      std::string get_sent_type(){return sent_p->get_type();};
+      size_t get_sent_length();
 
       void set_tokens(std::vector<std::string>& tokens);
+      void set_sent(sent_t* sent);
       void set_role_span(srlnid_type predid, span_type& span);
       void set_role_label(srlnid_type predid, label_type& label);
 
       std::ostream& operator<<(std::ostream& os);
       void print(std::ostream& os, int i);
 
+      void delete_sent();
+
    private:
       graph_t<span_type, label_type> srl_m;
-      std::vector<std::string> tokens_m;
+      sent_t* sent_p;
+      // std::vector<std::string> tokens_m;
       srlnid_type root_m;
       std::map<srlnid_type, srlnid_type> predof_m;
    }; // class srlgraph_t
-
+   
    std::ostream& operator<<(std::ostream& os, srlgraph_t& srl);
 
 } // yisi
diff --git a/src/srlgraph_test.cpp b/src/srlgraph_test.cpp
index 0cfbd3c..612df54 100644
--- a/src/srlgraph_test.cpp
+++ b/src/srlgraph_test.cpp
@@ -22,26 +22,21 @@
 using namespace std;
 using namespace yisi;
 
-int main(int argc, char* argv[])
-{
-   ifstream txtstr(argv[1], ifstream::in);
+int main(int argc, char* argv[]) {
 
-   vector<string> sents;
-   string line;
-
-   while (getline(txtstr, line)) {
-      sents.push_back(line);
-   }
+   vector<sent_t*> sents = read_sent("word", string(argv[1]));
 
    cout << "Reading ASSERT format parse file." << endl;
    vector<srlgraph_t> srls = read_srl(sents, string(argv[2]));
 
    cout << "Printing srl parses:" << endl;
-   for (vector<srlgraph_t>::iterator it = srls.begin(); it != srls.end();
-      it++) {
+   for (auto it = srls.begin(); it != srls.end(); it++) {
       cout << (*it);
    }
+   for (auto it = sents.begin(); it != sents.end(); it++) {
+      delete *it;
+      *it = NULL;
+   }
 
    return 0;
 }
-
diff --git a/src/srlmate.cpp b/src/srlmate.cpp
index 8bc9bf7..aa5dc1e 100644
--- a/src/srlmate.cpp
+++ b/src/srlmate.cpp
@@ -73,43 +73,33 @@ srlmate_t::srlmate_t(string path) {
       getline(iss, cfgv);
       if (cfgn == "yisi_home") {
          yisi_home = cfgv;
-      }
-      else if (cfgn == "mate_jars") {
+      } else if (cfgn == "mate_jars") {
          mate_jars = cfgv;
-      }
-      else if (cfgn == "lang") {
+      } else if (cfgn == "lang") {
          lang = cfgv;
-      }
-      else if (cfgn == "rerank") {
+      } else if (cfgn == "rerank") {
          if ((cfgv.compare("0") == 0) || (cfgv.compare("false") == 0)) {
             rerank = false;
          } else {
             rerank = true;
          }
-      }
-      else if (cfgn == "hybrid") {
+      } else if (cfgn == "hybrid") {
          if ((cfgv.compare("0") == 0) || (cfgv.compare("false") == 0)) {
             hybrid = false;
          } else {
             hybrid = true;
          }
-      }
-      else if (cfgn == "token") {
+      } else if (cfgn == "token") {
          token = cfgv;
-      }
-      else if (cfgn == "morph") {
+      } else if (cfgn == "morph") {
          morph = cfgv;
-      }
-      else if (cfgn == "lemma") {
+      } else if (cfgn == "lemma") {
          lemma = cfgv;
-      }
-      else if (cfgn == "tagger") {
+      } else if (cfgn == "tagger") {
          tagger = cfgv;
-      }
-      else if (cfgn == "parser") {
+      } else if (cfgn == "parser") {
          parser = cfgv;
-      }
-      else if (cfgn == "srl") {
+      } else if (cfgn == "srl") {
          srl = cfgv;
       }
    }
@@ -187,18 +177,19 @@ string srlmate_t::noparse(vector<string> tokens) {
    return yisi::strip(result);
 }
 
-string srlmate_t::jrun(string sent) {
+string srlmate_t::jrun(sent_t* sent) {
    string result = "";
-   vector<string> tokens = yisi::tokenize(sent);
+   vector<string> tokens = sent->get_tokens();
+   string sent_str = join(tokens);
 
-   if (!sent.empty() && tokens.size() <= 100) {
+   if (0 < tokens.size() && tokens.size() <= 100) {
       JNI_SAFE_CALL(methid, jen_m,
                     GetMethodID(mate_class_m, "parse",
                                 "(Ljava/lang/String;)Ljava/lang/String;"));
       try {
          JNI_SAFE_CALL(jparse, jen_m,
                        CallObjectMethod(mate_object_m, methid,
-                                        jen_m->NewStringUTF(sent.c_str())));
+                                        jen_m->NewStringUTF(sent_str.c_str())));
          result = jen_m->GetStringUTFChars((jstring)jparse, NULL);
       } catch (...) {
          result += noparse(tokens);
@@ -209,13 +200,13 @@ string srlmate_t::jrun(string sent) {
    return result;
 }
 
-srlgraph_t srlmate_t::parse(string sent) {
+srlgraph_t srlmate_t::parse(sent_t* sent) {
    string srl_str = jrun(sent);
-   srlgraph_t result = read_conll09(srl_str);
+   srlgraph_t result = read_conll09(srl_str, sent);
    return result;
 }
 
-vector<srlgraph_t> srlmate_t::parse(vector<string> sents) {
+vector<srlgraph_t> srlmate_t::parse(vector<sent_t*> sents) {
    //batch srl-ing
    vector<srlgraph_t> result;
    for (auto it = sents.begin(); it != sents.end(); it++) {
diff --git a/src/srlmate.h b/src/srlmate.h
index 0201287..a45bcf2 100644
--- a/src/srlmate.h
+++ b/src/srlmate.h
@@ -35,9 +35,9 @@ namespace yisi {
       srlmate_t() {}
       srlmate_t(std::string path);
       ~srlmate_t();
-      std::string jrun(std::string sent);
-      srlgraph_t parse(std::string sent);
-      virtual std::vector<srlgraph_t> parse(std::vector<std::string> sents);
+      std::string jrun(sent_t* sent);
+      srlgraph_t parse(sent_t* sent);
+      virtual std::vector<srlgraph_t> parse(std::vector<sent_t*> sents);
    private:
       std::string noparse(std::vector<std::string> tokens);
       static JavaVM* jvm_m;
diff --git a/src/srlmate_test.cpp b/src/srlmate_test.cpp
index cc4c2a6..51b7752 100644
--- a/src/srlmate_test.cpp
+++ b/src/srlmate_test.cpp
@@ -25,10 +25,22 @@ int main(const int argc, const char* argv[])
    string sent;
 
    while (getline(cin, sent)) {
-      string mateout = mate.jrun(sent);
+      sent_t* s = new sent_t("word");
+      auto tokens = tokenize(sent);
+      s->set_tokens(tokens);
+      /*
+      auto t = s->get_tokens();
+      for (auto it = t.begin(); it != t.end(); it++) {
+      cerr <<*it <<" ";
+      }
+      cerr<<endl;
+      */
+      string mateout = mate.jrun(s);
       cout << mateout << endl << endl;
-      srlgraph_t result = read_conll09(mateout);
-      cerr << result << endl;
+      srlgraph_t g = read_conll09(mateout, s);
+      cerr << g << endl;
+      delete s;
+      s = NULL;
    }
 
    return 0;
diff --git a/src/srlutil.cpp b/src/srlutil.cpp
index 865decd..91d03cf 100644
--- a/src/srlutil.cpp
+++ b/src/srlutil.cpp
@@ -17,21 +17,22 @@
 #include "util.h"
 
 #include <map>
+#include <set>
 #include <fstream>
 #include <sstream>
 
 using namespace yisi;
 using namespace std;
 
-vector<srlgraph_t> yisi::read_srl(vector<string> sents, string parsefile) {
+vector<srlgraph_t> yisi::read_srl(vector<sent_t*> sents, string parsefile) {
    // read srl in ASSERT format
    vector<srlgraph_t> result;
    typedef srlgraph_t::span_type span_type;
    typedef srlgraph_t::srlnid_type srlnid_type;
 
-   for (vector<string>::iterator it = sents.begin(); it != sents.end(); it++) {
-      vector<string> tokens = tokenize(*it);
-      srlgraph_t s(tokens);
+   for (auto it = sents.begin(); it != sents.end(); it++) {
+      //vector<string> tokens = tokenize(*it);
+      srlgraph_t s(*it);
       result.push_back(s);
    }
 
@@ -102,8 +103,9 @@ vector<srlgraph_t> yisi::read_srl(vector<string> sents, string parsefile) {
             }
          } // while (!iss.eof())
 
-         if ((int)tmptok.size() > 0) {
-            result.at(id).set_tokens(tmptok);
+         if (tmptok.size() > result.at(id).get_sent_length()) {
+            //result.at(id).set_tokens(tmptok);
+            cerr << "ERROR: Tokenization of words changed by srl. Potential index failure!" << endl;
          }
       } // while (!ifs.eof())
       ifs.close();
@@ -112,91 +114,126 @@ vector<srlgraph_t> yisi::read_srl(vector<string> sents, string parsefile) {
    return result;
 }  // read_srl
 
-srlgraph_t yisi::read_conll09(string parse) {
+srlgraph_t yisi::read_conll09(string parse, sent_t* sent) {
+   srlgraph_t result(sent);
    if (parse.empty()) {
-      auto tokens = tokenize(parse);
-      srlgraph_t re(tokens);
-      return re;
+      return result;
    }
-
-   srlgraph_t result;
-   result.new_root();
+   // cerr << result << endl;
+   //result.new_root();
    srlgraph_t::label_type plabel = "V";
+
    vector<string> tokens;
    vector<int> preds;
-   map<int, srlgraph_t::srlnid_type> pids;
-   vector<vector<pair<int, string> > > args;
-   map<int, vector<int> > child;
+   vector<srlgraph_t::srlnid_type> p_nids;
+   vector<vector<srlgraph_t::label_type> > labels;
+   map<int, set<int> > child;
    istringstream iss(parse);
 
+   int n_space = 0;
    while (!iss.eof()) {
       string t;
       getline(iss, t);
       vector<string> field = tokenize(t, '\t', true);
       //ID FORM LEMMA PLEMMA POS PPOS FEAT PFEAT HEAD PHEAD DEPREL PDEPREL FILLPRED PRED APREDs
-      int id = stoi(field[0]) - 1;
-      tokens.push_back(field[1]);
-      int parent = stoi(field[8]);
-      if (parent > 0) {
-         if (child.find(parent - 1) != child.end()) {
-            child[parent - 1].push_back(id);
-         } else {
-            child[parent - 1] = vector<int>(1, id);
+      int id = stoi(field[0]) - 1 -n_space;
+      //cerr << "Reading " << id;
+      if (field[1] != ""){
+         tokens.push_back(field[1]);
+         int p = stoi(field[8]) - n_space;
+         if (p > 0) {
+            child[p - 1].insert(id);
          }
-      }
-      if (field[13] != "_") {
-         preds.push_back(id);
-         srlgraph_t::span_type s(id, id + 1);
-         srlgraph_t::srlnid_type pid = result.new_pred(s, plabel);
-         pids[id] = pid;
-      }
-      for (int i = 14; i < (int)field.size(); i++) {
-         if ((int)args.size() < i - 13) {
-            vector<pair<int, string> > a;
-            args.push_back(a);
+
+         for (int i = 14; i < (int)field.size(); i++) {
+            if (tokens.size() == 1){
+               vector<srlgraph_t::label_type> l;
+               l.push_back(field[i]);
+               labels.push_back(l);
+            } else {
+               labels[i-14].push_back(field[i]);
+            }
+         }
+         if (field[13] != "_") {
+            preds.push_back(id);
+            srlgraph_t::span_type s(id, id + 1);
+            srlgraph_t::srlnid_type pid = result.new_pred(s, plabel);
+            p_nids.push_back(pid);
+            labels[preds.size()-1][id]="V";
          }
-         if (field[i] != "_") {
-            args[i - 14].push_back(make_pair(id, field[i]));
+         //cerr << " Done." << endl;
+      } else {
+         n_space++;
+         if (field[13] != "_") {
+            preds.push_back(-1);
+            p_nids.push_back(10000);
          }
       }
    } // while (!iss.eof())
 
-   result.set_tokens(tokens);
-   for (int i = 0; i < (int)args.size(); i++) {
-      srlgraph_t::srlnid_type pid = pids[preds[i]];
-      for (int j = 0; j < (int)args[i].size(); j++) {
-         int head = args[i][j].first;
-         srlgraph_t::label_type label = args[i][j].second;
-         size_t b = head;
-         size_t e = head;
-         resolve_arg_span(child, head, preds[i], b, e);
-         srlgraph_t::span_type s(b, e + 1);
-         result.new_arg(pid, s, label);
+   if (result.get_sent_type() == "word") {
+      if (tokens.size() > result.get_sent_length()) {
+         if (result.get_sent_length() > 0)
+            cerr << "Set tokens rule fired (" << tokens.size() << ","
+                 << result.get_sent_length() << ")" << endl;
+         result.set_tokens(tokens);
+      }
+   } else {
+      if (result.get_sent_length() > 0 && tokens.size() > result.get_sent_length()) {
+         cerr << "ERROR: Tokenization of words changed by srl. Potential index failure!" << endl;
+         cerr << "Tokens were: " << join(result.get_sentence(), " ") << endl;
+         cerr << "Tokens are: " << join(tokens, " ") << endl;
       }
    }
 
-   return result;
-} // read_conll09
-
-void yisi::resolve_arg_span(map<int, vector<int> > child, int curid,
-                            srlgraph_t::srlnid_type pid, size_t& b, size_t&e) {
-   //cerr << curid << "," << pid << "," << b << "," << e << endl;
-   auto curchild = child[curid];
-   bool find = false;
-   for (auto it = curchild.begin(); it != curchild.end() && !find; it++) {
-      if (*it == (int)pid) {
-         find = true;
+   for (int i = 0; i < (int)labels.size(); i++) {
+      for (int j = 0; j < (int) labels[i].size(); j++){
+         populate_label(labels[i], child, j);
       }
    }
-   if (!find) {
-      for (auto it = curchild.begin(); it != curchild.end(); it++) {
-         if (*it < (int)b) {
-            b = *it;
+   for (int i = 0; i < (int)labels.size(); i++) {
+      auto pid = p_nids[i];
+      if (pid != 10000) {
+         srlgraph_t::span_type curspan;
+         srlgraph_t::label_type curlabel = "_";
+         for (size_t j = 0; j < labels[i].size(); j++) {
+            //cerr << labels[i][j] << " ";
+            if (labels[i][j] != curlabel) {
+               if (curlabel != "_" && curlabel != "V") {
+                  curspan.second = j;
+                  result.new_arg(pid, curspan, curlabel);
+               }
+               curspan.first = j;
+               curlabel = labels[i][j];
+            }
          }
-         if (*it > (int)e) {
-            e = *it;
+         if (curlabel != "_" && curlabel != "V") {
+            curspan.second = labels[i].size();
+            result.new_arg(pid, curspan, curlabel);
+         }
+         //cerr << endl;
+      }
+   }
+   return result;
+} // read_conll09
+
+srlgraph_t yisi::read_conll09(string parse) {
+   sent_t* sent = new sent_t("word");
+
+   auto result = read_conll09(parse, sent);
+
+   return result;
+} // read_conll09
+
+void yisi::populate_label(vector<string>& labels, map<int, set<int> > child, int i) {
+   if (labels[i] != "_" && labels[i] != "V") {
+      auto curchildren = child[i];
+      for (auto ct = curchildren.begin(); ct != curchildren.end(); ct++) {
+         //cerr << "Label " << *ct << " " << labels[*ct] << endl;
+         if (labels[*ct] == "_") {
+            labels[*ct] = labels[i];
+            populate_label(labels, child, *ct);
          }
-         resolve_arg_span(child, *it, pid, b, e);
       }
    }
 }
@@ -211,13 +248,40 @@ vector<srlgraph_t> yisi::read_conll09batch(string filename) {
    }
 
    string parse;
-
+   int i=0;
    while (!fin.eof()) {
       string line;
       getline(fin, line);
       if (line.empty()) {
          result.push_back(read_conll09(yisi::strip(parse)));
          parse = "";
+         i++;
+      } else {
+         parse += line + "\n";
+      }
+      fin.peek();
+   }
+   return result;
+}
+
+vector<srlgraph_t> yisi::read_conll09batch(string filename, vector<sent_t*> sents) {
+   vector<srlgraph_t> result;
+
+   ifstream fin(filename.c_str());
+   if (!fin) {
+      cerr << "ERROR: Failed to open conll09 parse file (" << filename << "). Exiting..." << endl;
+      exit(1);
+   }
+
+   string parse;
+   int i=0;
+   while (!fin.eof()) {
+      string line;
+      getline(fin, line);
+      if (line.empty()) {
+         result.push_back(read_conll09(yisi::strip(parse), sents[i]));
+         parse = "";
+         i++;
       } else {
          parse += line + "\n";
       }
@@ -228,21 +292,19 @@ vector<srlgraph_t> yisi::read_conll09batch(string filename) {
 
 srlread_t::srlread_t(string parsefile):parsefile_m(parsefile) {}
 
-vector<srlgraph_t> srlread_t::parse(vector<string> sents) {
-   return yisi::read_conll09batch(parsefile_m);
+vector<srlgraph_t> srlread_t::parse(vector<sent_t*> sents) {
+   return yisi::read_conll09batch(parsefile_m, sents);
 }
 
-vector<srlgraph_t> srltok_t::parse(vector<string> sents) {
+vector<srlgraph_t> srltok_t::parse(vector<sent_t*> sents) {
    vector<srlgraph_t> result;
    for (auto it = sents.begin(); it != sents.end(); it++) {
-      auto tokens = yisi::tokenize(*it);
-      result.push_back(srlgraph_t(tokens));
+      result.push_back(srlgraph_t(*it));
    }
    return result;
 }
 
-srlgraph_t srltok_t::parse(string sent) {
-   auto tokens = yisi::tokenize(sent);
-   auto result = srlgraph_t(tokens);
+srlgraph_t srltok_t::parse(sent_t* sent) {
+   auto result = srlgraph_t(sent);
    return result;
 }
diff --git a/src/srlutil.h b/src/srlutil.h
index ab40638..499f1b5 100644
--- a/src/srlutil.h
+++ b/src/srlutil.h
@@ -22,35 +22,36 @@
 
 #include "srlgraph.h"
 
+#include <set>
 #include <string>
 #include <vector> 
 #include <map>
 
 namespace yisi {
 
-   std::vector<srlgraph_t> read_srl(std::vector<std::string> sents, std::string parsefile);
+   std::vector<srlgraph_t> read_srl(std::vector<sent_t*> sents, std::string parsefile);
+   srlgraph_t read_conll09(std::string parse, sent_t* sent);
    srlgraph_t read_conll09(std::string parse);
-   void resolve_arg_span(std::map<int, std::vector<int> > child, int curid,
-      srlgraph_t::srlnid_type pid, size_t& b, size_t&e);
+   void populate_label(std::vector<std::string>& labels, std::map<int, std::set<int> > child, int i);
    std::vector<srlgraph_t> read_conll09batch(std::string filename);
-
+   std::vector<srlgraph_t> read_conll09batch(std::string filename, std::vector<sent_t*> sents);
    class srlmodel_t {
    public:
       srlmodel_t() {}
       virtual ~srlmodel_t() {}
-      virtual srlgraph_t parse(std::string) {
+      virtual srlgraph_t parse(sent_t* sent) {
          std::cerr << "ERROR: Semantic role labeler type does not support "
                    << "individual sentence parsing. Exiting..." << std::endl;
          exit(1);
       }
-      virtual std::vector<srlgraph_t> parse(std::vector<std::string>)=0;
+      virtual std::vector<srlgraph_t> parse(std::vector<sent_t*> sent)=0;
    }; // srlmodel_t
 
    class srlread_t:public srlmodel_t {
    public:
       srlread_t() {}
       srlread_t(std::string parsefile);
-      virtual std::vector<srlgraph_t> parse(std::vector<std::string> sents);
+      virtual std::vector<srlgraph_t> parse(std::vector<sent_t*> sents);
    private:
       std::string parsefile_m;
    }; // class srlread_t
@@ -58,8 +59,8 @@ namespace yisi {
    class srltok_t:public srlmodel_t {
    public:
       srltok_t() {}
-      virtual srlgraph_t parse(std::string sent);
-      virtual std::vector<srlgraph_t> parse(std::vector<std::string> sents);
+      virtual srlgraph_t parse(sent_t* sent);
+      virtual std::vector<srlgraph_t> parse(std::vector<sent_t*> sents);
    private:
    }; //class srltok_t
 
diff --git a/src/srlutil_test.cpp b/src/srlutil_test.cpp
index df45cf9..d7ba9d9 100644
--- a/src/srlutil_test.cpp
+++ b/src/srlutil_test.cpp
@@ -23,10 +23,14 @@ int main(const int argc, const char* argv[])
 {
    auto s = read_conll09batch(argv[1]);
 
-   for (auto it=s.begin(); it!=s.end(); it++){
+   for (auto it = s.begin(); it != s.end(); it++) {
       cout << *it;
    }
 
+   for (auto it = s.begin(); it != s.end(); it++) {
+      it->delete_sent();
+   }
+
    return 0;
 }
 
diff --git a/src/util.cpp b/src/util.cpp
index 892750e..59cf30f 100644
--- a/src/util.cpp
+++ b/src/util.cpp
@@ -29,13 +29,15 @@ using namespace std;
 
 vector<string> yisi::tokenize(string sent, char d, bool keep_empty) {
    //cerr << "Tokenizing " << sent << " by " << d << endl;
-   vector<string> result;
-   istringstream iss(sent);
-   while (!iss.eof()) {
-      string token;
-      getline(iss, token, d);
-      if (token != "" || keep_empty) {
-         result.push_back(token);
+   vector <string> result;
+   if (sent != "") {
+      istringstream iss(sent);
+      while (!iss.eof()) {
+         string token;
+         getline(iss, token, d);
+         if (token != "" || keep_empty) {
+            result.push_back(token);
+         }
       }
    }
    //cerr << endl;
@@ -52,18 +54,6 @@ string yisi::join(const vector<string> tokens, const string d) {
    return result;
 }
 
-vector<vector<string> > yisi::collect_ngram(int n, vector<string>& tokens) {
-  vector <vector<string> > result;
-  for (int i = 0; i <= (int)tokens.size() - n; i++) {
-    vector <string > ngram;
-    for (int j = i; j < i + n; j++) {
-      ngram.push_back(tokens[j]);
-    }
-    result.push_back(ngram);
-  }
-  return result;
-}
-
 vector<string> yisi::read_file(string filename) {
    vector<string> result;
    ifstream fin(filename.c_str());
diff --git a/src/util.h b/src/util.h
index b3aba7f..cae1439 100644
--- a/src/util.h
+++ b/src/util.h
@@ -25,7 +25,17 @@
 namespace yisi {
    std::vector<std::string> tokenize(std::string sent, char d = ' ', bool keep_empty = false);
    std::string join(const std::vector<std::string> tokens, const std::string d = " ");
-   std::vector<std::vector<std::string> > collect_ngram(int n, std::vector<std::string>& tokens);
+   template<class T> std::vector<std::vector<T> > collect_ngram(int n, std::vector<T>& tokens){
+      std::vector<std::vector<T> > result;
+      for (int i = 0; i <= (int)tokens.size() - n; i++) {
+         std::vector<T> ngram;
+         for (int j = i; j < i + n; j++) {
+            ngram.push_back(tokens[j]);
+         }
+         result.push_back(ngram);
+      }
+      return result;
+   }
    std::vector<std::string> read_file(std::string filename);
    void open_ofstream(std::ofstream& fout, std::string filename);
    std::string lowercase(std::string token);
diff --git a/src/yisi.cpp b/src/yisi.cpp
index 37cb1f4..a3523f4 100644
--- a/src/yisi.cpp
+++ b/src/yisi.cpp
@@ -24,28 +24,54 @@ using namespace std;
 using namespace yisi;
 
 struct eval_options {
+   std::string ref_type_m;
+   std::string hyp_type_m;
+   std::string inp_type_m;
+
    std::string ref_file_m;
    std::string hyp_file_m;
    std::string inp_file_m;
+   std::string inpunit_file_m;
+   std::string refunit_file_m;
+   std::string hypunit_file_m;
+   std::string inpidemb_file_m;
+   std::string refidemb_file_m;
+   std::string hypidemb_file_m;
+
    std::string sntscore_file_m;
    std::string docscore_file_m;
+
    std::string mode_m;
 
    void init(com::masaers::cmdlp::parser& p) {
       using namespace com::masaers::cmdlp;
-
+      p.add(make_knob(ref_type_m))
+         .fallback("word")
+         .desc("Type of reference sentences. [word(default)|unit|uemb]")
+         .name("ref-type")
+         ;
+      p.add(make_knob(hyp_type_m))
+         .fallback("word")
+         .desc("Type of hypothesis sentences. [word(default)|unit|uemb]")
+         .name("hyp-type")
+         ;
+      p.add(make_knob(inp_type_m))
+         .fallback("word")
+         .desc("Type of input sentences. [word(default)|unit|uemb]")
+         .name("inp-type")
+         ;
       p.add(make_knob(ref_file_m))
          .fallback("")
-         .desc("Filenames of references separated by ':'")
+         .desc("Filenames of references separated by ':' (in surface word form for SRL).")
          .name("ref-file")
          ;
       p.add(make_knob(hyp_file_m))
-         .desc("Filename of hypotheses")
+         .desc("Filename of hypotheses (in surface word form for SRL).")
          .name("hyp-file")
          ;
       p.add(make_knob(inp_file_m))
          .fallback("")
-         .desc("Filename of input")
+         .desc("Filename of input (in surface word form for SRL).")
          .name("inp-file")
          ;
       p.add(make_knob(sntscore_file_m))
@@ -58,6 +84,39 @@ struct eval_options {
          .desc("Filename of document score output (default: <sntscore-file>.doc")
          .name("docscore-file")
          ;
+      p.add(make_knob(inpunit_file_m))
+         .fallback("")
+         .desc("Filename of input segmented into subword units.")
+         .name("inpunit-file")
+         ;
+      p.add(make_knob(hypunit_file_m))
+         .fallback("")
+         .desc("Filename of hypotheses segmented into subword units.")
+         .name("hypunit-file")
+         ;
+      p.add(make_knob(refunit_file_m))
+         .fallback("")
+         .desc("Filenames of references segmented into subword units, separated by ':'.")
+         .name("refunit-file")
+         ;
+      p.add(make_knob(inpidemb_file_m))
+         .fallback("")
+         .desc("Filename of input subword units with contextual embeddings: one unit per line, "
+               "empty line separates sentences [unitid<TAB>tokenid<TAB>space_sep_emb].")
+         .name("inpidemb-file")
+         ;
+      p.add(make_knob(hypidemb_file_m))
+         .fallback("")
+         .desc("Filename of hypothesis subword units with contextual embeddings: one unit per line, "
+               "empty line separates sentences [unitid<TAB>tokenid<TAB>space_sep_emb].")
+         .name("hypidemb-file")
+         ;
+      p.add(make_knob(refidemb_file_m))
+         .fallback("")
+         .desc("Filenames of reference subword units with contextual embeddings, separated by ':': one "
+               "unit per line, empty line separates sentences [unitid<TAB>tokenid<TAB>space_sep_emb].")
+         .name("refidemb-file")
+         ;
       p.add(make_knob(mode_m))
          .fallback("yisi")
          .desc("Output mode of YiSi [yisi(default): print score only "
@@ -71,19 +130,31 @@ int main(const int argc, const char* argv[])
 {
    typedef com::masaers::cmdlp::options<eval_options, yisi_options, phrasesim_options> options_type;
 
-   options_type opt(argc,argv);
-   if (! opt) {
+   options_type opt(argc, argv);
+   if (!opt) {
       return opt.exit_code();
    }
 
    if (opt.reflexweight_name_m == "learn" && opt.reflexweight_path_m == "") {
-      opt.reflexweight_path_m = opt.ref_file_m;
+      if (opt.ref_type_m == "word") {
+         opt.reflexweight_path_m = opt.ref_file_m;
+      } else {
+         opt.reflexweight_path_m = opt.refunit_file_m;
+      }
    }
    if (opt.hyplexweight_name_m == "learn" && opt.hyplexweight_path_m == "") {
-      opt.hyplexweight_path_m = opt.hyp_file_m;
+      if (opt.hyp_type_m == "word") {
+         opt.hyplexweight_path_m = opt.hyp_file_m;
+      } else {
+         opt.hyplexweight_path_m = opt.hypunit_file_m;
+      }
    }
    if (opt.inplexweight_name_m == "learn" && opt.inplexweight_path_m == "") {
-      opt.inplexweight_path_m = opt.inp_file_m;
+      if (opt.inp_type_m == "word") {
+         opt.inplexweight_path_m = opt.inp_file_m;
+      } else {
+         opt.inplexweight_path_m = opt.inpunit_file_m;
+      }
    }
 
    yisiscorer_t<options_type> yisi(opt);
@@ -99,65 +170,77 @@ int main(const int argc, const char* argv[])
    ofstream SNTOUT;
    open_ofstream(SNTOUT, opt.sntscore_file_m);
 
-   vector<string> hypsents = read_file(opt.hyp_file_m);
+   cerr << "Reading hyp sents... ";
+   vector<sent_t*> hypsents = read_sent(opt.hyp_type_m, opt.hyp_file_m, opt.hypunit_file_m, opt.hypidemb_file_m);
+   cerr << "Done." << endl;
 
-   vector<vector<string> > refsents;
+   vector < vector<sent_t*> > refsents;
    if (opt.ref_file_m != "") {
+      cerr << "Reading ref sents... ";
       auto reffiles = tokenize(opt.ref_file_m, ':');
-      auto it = reffiles.begin();
-      vector < string > rs = read_file(*it);
+      auto refunits = tokenize(opt.refunit_file_m, ':');
+      auto refidemb = tokenize(opt.refidemb_file_m, ':');
+      size_t i = 0;
+      vector<sent_t*> rs;
+      if (reffiles.size() == refunits.size()) {
+         rs = read_sent(opt.ref_type_m, reffiles[i], refunits[i], i < refidemb.size() ? refidemb[i] : string());  // guard: refidemb is empty when no --refidemb-file is given
+      } else {
+         rs = read_sent(opt.ref_type_m, reffiles[i]);
+      }
       if (rs.size() == hypsents.size()) {
          for (auto jt = rs.begin(); jt != rs.end(); jt++) {
-            vector < string > ref;
+            vector<sent_t*> ref;
             ref.push_back(*jt);
             refsents.push_back(ref);
          }
-         it++;
-         for (; it != reffiles.end(); it++) {
-            rs = read_file(*it);
+         i++;
+         for (; i < reffiles.size(); i++) {
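+            rs = read_sent(opt.ref_type_m, reffiles[i], i < refunits.size() ? refunits[i] : string(), i < refidemb.size() ? refidemb[i] : string());  // guard: unit/idemb lists may be shorter than reffiles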
             if (rs.size() == hypsents.size()) {
                for (size_t j = 0; j < rs.size(); j++) {
                   refsents[j].push_back(rs[j]);
                }
             } else {
                cerr << "ERROR: No. of sentences in ref-file (" << rs.size()
-                    << ") does not match with no. of sentences in hyp-file ("
-                    << hypsents.size() << "). Check your input! Exiting ..."
-                    << endl;
+                  << ") does not match with no. of sentences in hyp-file ("
+                  << hypsents.size() << "). Check your input! Exiting ..."
+                  << endl;
                exit(1);
             }
          }
       } else {
          cerr << "ERROR: No. of sentences in ref-file (" << rs.size()
-              << ") does not match with no. of sentences in hyp-file ("
-              << hypsents.size() << "). Check your input! Exiting ..." << endl;
+            << ") does not match with no. of sentences in hyp-file ("
+            << hypsents.size() << "). Check your input! Exiting ..." << endl;
          exit(1);
       }
+      cerr << "Done." << endl;
    }
 
-   vector<string> inpsents;
+   vector<sent_t*> inpsents;
    if (opt.inp_file_m != "") {
-      inpsents = read_file(opt.inp_file_m);
-   }
-
-   if (inpsents.size() > 0 && inpsents.size() != hypsents.size()) {
-      cerr << "ERROR: No. of sentences in inp-file (" << inpsents.size()
-              << ") does not match with no. of sentences in hyp-file ("
-              << hypsents.size() << "). Check your input! Exiting..." << endl;
-      exit(1);
+      cerr << "Reading inp sents... ";
+      inpsents = read_sent(opt.inp_type_m, opt.inp_file_m, opt.inpunit_file_m, opt.inpidemb_file_m);
+      if (inpsents.size() != hypsents.size()) {
+         cerr << "ERROR: No. of sentences in inp-file (" << inpsents.size()
+            << ") does not match with no. of sentences in hyp-file ("
+            << hypsents.size() << "). Check your input! Exiting..." << endl;
+         exit(1);
+      }
+      cerr << "Done." << endl;
    }
 
-   cerr << "Tokenizing/SRL-ing hyp ... ";
+   cerr << "Creating hyp srlgraphs... ";
    vector<srlgraph_t> hypsrlgraphs = yisi.hypsrlparse(hypsents);
    cerr << "Done." << endl;
-   vector<vector<srlgraph_t> > refsrlgraphs;
+   vector < vector<srlgraph_t> > refsrlgraphs;
 
    for (size_t i = 0; i < hypsrlgraphs.size(); i++) {
       refsrlgraphs.push_back(vector<srlgraph_t>());
    }
 
    if (refsents.size() > 0) {
-      cerr << "Tokenizing/SRL-ing ref ... ";
+      cerr << "Creating ref srlgraphs... ";
       for (size_t i = 0; i < hypsrlgraphs.size(); i++) {
          refsrlgraphs[i] = yisi.refsrlparse(refsents[i]);
       }
@@ -166,7 +249,7 @@ int main(const int argc, const char* argv[])
 
    vector<srlgraph_t> inpsrlgraphs;
    if (inpsents.size() > 0) {
-      cerr << "Tokenizing/SRL-ing inp ... ";
+      cerr << "Creating inp srlgraphs... ";
       inpsrlgraphs = yisi.inpsrlparse(inpsents);
       cerr << "Done." << endl;
    }
@@ -177,9 +260,19 @@ int main(const int argc, const char* argv[])
       cout << "Evaluating line " << i + 1 << endl;
       yisigraph_t m;
       if (opt.inp_file_m != "") {
+         /*
+          cerr<<"inpsrlgraph:"<<endl;
+          inpsrlgraphs[i].print(cout, i);
+          cerr<<"hypsrlgraph:"<<endl;
+          hypsrlgraphs[i].print(cout, i);
+          cerr<<"yisigraph:"<<endl;
+          */
          m = yisi.align(refsrlgraphs[i], hypsrlgraphs[i], inpsrlgraphs[i]);
+         // m.print(cout);
       } else {
+         // hypsrlgraphs[i].print(cout, i);
          m = yisi.align(refsrlgraphs[i], hypsrlgraphs[i]);
+         // m.print(cout);
       }
       if (opt.mode_m != "features") {
          double s = yisi.score(m);
@@ -203,5 +296,20 @@ int main(const int argc, const char* argv[])
       DOCOUT.close();
    }
 
+   for (auto it = hypsents.begin(); it != hypsents.end(); it++) {
+      delete *it;
+      *it = NULL;
+   }
+   for (auto it = refsents.begin(); it != refsents.end(); it++) {
+      for (auto jt = it->begin(); jt != it->end(); jt++) {
+         delete *jt;
+         *jt = NULL;
+      }
+   }
+   for (auto it = inpsents.begin(); it != inpsents.end(); it++) {
+      delete *it;
+      *it = NULL;
+   }
+
    return 0;
 }
diff --git a/src/yisigraph.cpp b/src/yisigraph.cpp
index cc05785..fac9768 100644
--- a/src/yisigraph.cpp
+++ b/src/yisigraph.cpp
@@ -22,7 +22,8 @@
 using namespace yisi;
 using namespace std;
 
-yisigraph_t::yisigraph_t(const vector<srlgraph_t> refsrlgraph, const srlgraph_t hypsrlgraph) {
+yisigraph_t::yisigraph_t(const vector<srlgraph_t> refsrlgraph, 
+			 const srlgraph_t hypsrlgraph) {
    refsrlgraph_m = refsrlgraph;
    hypsrlgraph_m = hypsrlgraph;
    inp_b = false;
@@ -32,7 +33,7 @@ yisigraph_t::yisigraph_t(const vector<srlgraph_t> refsrlgraph, const srlgraph_t
    //cout << refsrlgraph_m;
    //cout << "hypsrlgraph:" << endl;
    //cout << hypsrlgraph_m;
-   //cout<<"Done."<<endl;
+   //cout<<"Done."<<endl;   
 }
 
 yisigraph_t::yisigraph_t(const vector<srlgraph_t> refsrlgraph,
@@ -82,6 +83,7 @@ size_t yisigraph_t::get_refsize() {
    return refsrlgraph_m.size();
 }
 
+/*
 double yisigraph_t::get_sentlength(int mode, int refid) {
    switch (mode) {
       case yisi::INP_MODE:
@@ -104,10 +106,11 @@ double yisigraph_t::get_sentlength(int mode, int refid) {
          }
          break;
       default:
-         cerr << "ERROR: Unknown mode in sent length. Contact Jackie. Exiting..." << endl;
+	cerr << "ERROR: Unknown mode in sent length. Contact Jackie. Exiting..." << endl;
          exit(1);
    }
 }
+*/
 
 double yisigraph_t::get_sentsim(int mode, int refid) {
    double result = 0.0;
@@ -203,6 +206,7 @@ vector<yisigraph_t::srlnid_type> yisigraph_t::get_args(srlnid_type roleid, int m
       }
 }
 
+/*
 vector<string>& yisigraph_t::get_sentence(int mode, int refid) {
    switch (mode) {
       case yisi::INP_MODE:
@@ -231,12 +235,13 @@ vector<string>& yisigraph_t::get_sentence(int mode, int refid) {
          exit(1);
    }
 }
+*/
 
-vector<string> yisigraph_t::get_role_fillers(srlnid_type roleid, int mode, int refid) {
+vector<string> yisigraph_t::get_role_filler_units(srlnid_type roleid, int mode, int refid) {
    switch (mode) {
       case yisi::INP_MODE:
          if (inp_b) {
-            return inpsrlgraph_m.get_role_fillers(roleid);
+            return inpsrlgraph_m.get_role_filler_units(roleid);
          } else {
             cerr << "ERROR: YiSi graph with no input sentence. "
                  << "Failed to get input role fillers. Exiting..." << endl;
@@ -244,11 +249,11 @@ vector<string> yisigraph_t::get_role_fillers(srlnid_type roleid, int mode, int r
          }
          break;
       case yisi::HYP_MODE:
-         return hypsrlgraph_m.get_role_fillers(roleid);
+         return hypsrlgraph_m.get_role_filler_units(roleid);
          break;
       case yisi::REF_MODE:
          if (-1 < refid && refid < (int)refsrlgraph_m.size()) {
-            return refsrlgraph_m[refid].get_role_fillers(roleid);
+            return refsrlgraph_m[refid].get_role_filler_units(roleid);
          } else {
             cerr << "ERROR: refid (" << refid << ") out of range [0," << refsrlgraph_m.size()
                  << "]. Failed to get reference role fillers. Exiting..." << endl;
@@ -437,29 +442,29 @@ double yisigraph_t::spanlength(span_type span) {
 }
 
 void yisigraph_t::print(ostream& os) {
-   string h = yisi::join(hypsrlgraph_m.get_sentence(), " ");
+   string h = yisi::join(hypsrlgraph_m.get_role_filler_units(hypsrlgraph_m.get_root()), " ");
    //os << h <<endl;
    for (size_t i = 0; i < refalignment_m.size(); i++) {
-      string r = yisi::join(refsrlgraph_m[i].get_sentence(), " ");
+      string r = yisi::join(refsrlgraph_m[i].get_role_filler_units(refsrlgraph_m[i].get_root()), " ");
       //os << r <<endl;
       for (auto jt = refalignment_m[i].begin(); jt != refalignment_m[i].end(); jt++) {
          auto refnid = jt->first;
          auto hypnid = (jt->second).first;
          double sim = (jt->second).second;
-         r = yisi::join(refsrlgraph_m[i].get_role_fillers(refnid), " ");
-         h = yisi::join(hypsrlgraph_m.get_role_fillers(hypnid), " ");
+         r = yisi::join(refsrlgraph_m[i].get_role_filler_units(refnid), " ");
+         h = yisi::join(hypsrlgraph_m.get_role_filler_units(hypnid), " ");
          os << r << "\t" << h << "\t" << sim << endl;
       }
    }
    if (inp_b) {
-      string inp = yisi::join(inpsrlgraph_m.get_sentence(), " ");
+      string inp = yisi::join(inpsrlgraph_m.get_role_filler_units(inpsrlgraph_m.get_root()), " ");
       os << inp << endl;
       for (auto kt = inpalignment_m.begin(); kt != inpalignment_m.end(); kt++) {
          auto inpnid = kt->first;
          auto hypnid = (kt->second).first;
          double sim = (kt->second).second;
-         inp = yisi::join(inpsrlgraph_m.get_role_fillers(inpnid), " ");
-         h = yisi::join(hypsrlgraph_m.get_role_fillers(hypnid), " ");
+         inp = yisi::join(inpsrlgraph_m.get_role_filler_units(inpnid), " ");
+         h = yisi::join(hypsrlgraph_m.get_role_filler_units(hypnid), " ");
          os << inp << "\t" << h << "\t" << sim << endl;
       }
    }
diff --git a/src/yisigraph.h b/src/yisigraph.h
index 07f37a3..69a8211 100644
--- a/src/yisigraph.h
+++ b/src/yisigraph.h
@@ -29,7 +29,7 @@
 
 namespace yisi {
 
-   class yisigraph_t{
+   class yisigraph_t {
    public:
       typedef srlgraph_t::span_type span_type;
       typedef srlgraph_t::label_type label_type;
@@ -49,12 +49,12 @@ namespace yisi {
 
       bool withinp();
       size_t get_refsize();
-      double get_sentlength(int mode, int refid=-1);
+      // double get_sentlength(int mode, int refid=-1);
       double get_sentsim(int mode, int refid=-1);
       std::vector<srlnid_type> get_preds(int mode, int refid=-1);
       std::vector<srlnid_type> get_args(srlnid_type roleid, int mode, int refid=-1);
-      std::vector<std::string>& get_sentence(int mode, int refid=-1);
-      std::vector<std::string> get_role_fillers(srlnid_type roleid, int mode, int refid=-1);
+      // std::vector<std::string>& get_sentence(int mode, int refid=-1);
+      std::vector<std::string> get_role_filler_units(srlnid_type roleid, int mode, int refid=-1);
       double get_rolespanlength(srlnid_type roleid, int mode, int refid=-1);
       label_type get_rolelabel(srlnid_type roleid, int mode, int refid=-1);
       std::vector<std::pair<int, alignment_type> > get_hypalignment(srlnid_type roleid);
@@ -73,23 +73,33 @@ namespace yisi {
       std::map<srlnid_type, std::vector<std::pair<int, alignment_type> > > hypalignment_m;
       std::map<srlnid_type, alignment_type> inpalignment_m;
       bool inp_b;
-   }; // class yisigraph_t
-
+  }; // class yisigraph_t
+  
    template <typename T>
    void yisigraph_t::align(phrasesim_t<T>* phrasesim) {
       //yisi alignment algorithm goes here
       //loop all references and input
       for (size_t refid = 0; refid < refsrlgraph_m.size(); refid++) {
-         //std::cerr << "first align the sentence node" << std::endl;
-         auto r = refsrlgraph_m[refid].get_sentence();
-         //std::cerr << "Got r " << r.size() << std::endl;
-         auto h = hypsrlgraph_m.get_sentence();
-         //std::cerr << "Got h " << h.size() << std::endl;
-         std::pair<double, double> sentsim = (*phrasesim)(r, h, yisi::REF_MODE);
-         //std::cerr << "sentsim = (" << sentsim.first << "," << sentsim.second << ")";
+         //std::cerr << "first align the sentence node of ref" << refid << std::endl;
          auto refroot = refsrlgraph_m[refid].get_root();
-         //std::cerr << "refroot = " << refroot << std::endl;
          auto hyproot = hypsrlgraph_m.get_root();
+
+         auto ru = refsrlgraph_m[refid].get_role_filler_units(refroot);
+         //std::cerr << "Got r " << ru.size() << std::endl;
+         auto hu = hypsrlgraph_m.get_role_filler_units(hyproot);
+         //std::cerr << "Got h " << hu.size() << std::endl;
+         std::pair<double, double> sentsim;
+         if (refsrlgraph_m[refid].get_sent_type() != "uemb" ||  hypsrlgraph_m.get_sent_type() != "uemb") {
+            //std::cerr<<"computing sentsim on word"<<std::endl;
+            sentsim = (*phrasesim)(ru, hu, yisi::REF_MODE);
+         } else {
+            auto remb = refsrlgraph_m[refid].get_role_filler_embs(refroot);
+            auto hemb = hypsrlgraph_m.get_role_filler_embs(hyproot);
+            sentsim = (*phrasesim)(ru, hu, remb, hemb, yisi::REF_MODE);
+         }
+
+         //std::cerr << "sentsim = (" << sentsim.first << "," << sentsim.second << ")";
+         //std::cerr << "refroot = " << refroot << std::endl;
          //std::cerr << "hyproot = " << hyproot << std::endl;
          refalignment_m.push_back(std::map<srlnid_type, alignment_type>());
          //std::cerr << "Done creating refalignment map" << std::endl;
@@ -109,14 +119,20 @@ namespace yisi {
             auto refpredid = *it;
             auto refpredspan = refsrlgraph_m[refid].get_role_span(refpredid);
             if (refpredspan.first != refpredspan.second) {
-               auto refpredphrase = refsrlgraph_m[refid].get_role_fillers(refpredid);
+               auto refpredphrase = refsrlgraph_m[refid].get_role_filler_units(refpredid);
                for (auto jt = hyppreds.begin(); jt != hyppreds.end(); jt++) {
                   auto hyppredid = *jt;
                   auto hyppredspan = hypsrlgraph_m.get_role_span(hyppredid);
                   if (hyppredspan.first != hyppredspan.second) {
-                     auto hyppredphrase = hypsrlgraph_m.get_role_fillers(hyppredid);
-                     std::pair<double, double> predsim =
-                        (*phrasesim)(refpredphrase, hyppredphrase, yisi::REF_MODE);
+                     auto hyppredphrase = hypsrlgraph_m.get_role_filler_units(hyppredid);
+                     std::pair<double, double> predsim;
+                     if (refsrlgraph_m[refid].get_sent_type() != "uemb" ||  hypsrlgraph_m.get_sent_type() != "uemb") {
+                        predsim = (*phrasesim)(refpredphrase, hyppredphrase, yisi::REF_MODE);
+                     } else {
+                        auto rpredemb = refsrlgraph_m[refid].get_role_filler_embs(refpredid);
+                        auto hpredemb = hypsrlgraph_m.get_role_filler_embs(hyppredid);
+                        predsim = (*phrasesim)(refpredphrase, hyppredphrase, rpredemb, hpredemb, yisi::REF_MODE);
+                     }
                      refpredmatch.add_weight(refpredid, hyppredid, predsim.second);
                      hyppredmatch.add_weight(refpredid, hyppredid, predsim.first);
                   }
@@ -139,12 +155,18 @@ namespace yisi {
             maxmatching_t argmatch;
             for (auto it = refargs.begin(); it != refargs.end(); it++) {
                auto refargid = *it;
-               auto refargphrase = refsrlgraph_m[refid].get_role_fillers(refargid);
+               auto refargphrase = refsrlgraph_m[refid].get_role_filler_units(refargid);
                for (auto jt = hypargs.begin(); jt != hypargs.end(); jt++) {
                   auto hypargid = *jt;
-                  auto hypargphrase = hypsrlgraph_m.get_role_fillers(hypargid);
-                  std::pair<double, double> argsim =
-                     (*phrasesim)(refargphrase, hypargphrase, yisi::REF_MODE);
+                  auto hypargphrase = hypsrlgraph_m.get_role_filler_units(hypargid);
+                  std::pair<double, double> argsim;
+                  if (refsrlgraph_m[refid].get_sent_type() != "uemb" ||  hypsrlgraph_m.get_sent_type() != "uemb") {
+                     argsim = (*phrasesim)(refargphrase, hypargphrase, yisi::REF_MODE);
+                  } else {
+                     auto rargemb = refsrlgraph_m[refid].get_role_filler_embs(refargid);
+                     auto hargemb = hypsrlgraph_m.get_role_filler_embs(hypargid);
+                     argsim = (*phrasesim)(refargphrase, hypargphrase, rargemb, hargemb, yisi::REF_MODE);
+                  }
                   argmatch.add_weight(refargid, hypargid, argsim.second);
                } // for jt
             } // for it
@@ -164,22 +186,27 @@ namespace yisi {
             auto aligned_hyp_pred = hpr[i].first.second;
             auto psim = hpr[i].second;
             if (hypalignment_m.find(aligned_hyp_pred) == hypalignment_m.end()) {
-               hypalignment_m[aligned_hyp_pred] =
-                                 std::vector<std::pair<int, alignment_type> >();
+               hypalignment_m[aligned_hyp_pred] = std::vector<std::pair<int, alignment_type> >();
             }
             hypalignment_m[aligned_hyp_pred].push_back(std::make_pair(refid,
-                                       alignment_type(aligned_ref_pred, psim)));
+               alignment_type(aligned_ref_pred, psim)));
             auto refargs = refsrlgraph_m[refid].get_args(aligned_ref_pred);
             auto hypargs = hypsrlgraph_m.get_args(aligned_hyp_pred);
             maxmatching_t argmatch;
             for (auto it = refargs.begin(); it != refargs.end(); it++) {
                auto refargid = *it;
-               auto refargphrase = refsrlgraph_m[refid].get_role_fillers(refargid);
+               auto refargphrase = refsrlgraph_m[refid].get_role_filler_units(refargid);
                for (auto jt = hypargs.begin(); jt != hypargs.end(); jt++) {
                   auto hypargid = *jt;
-                  auto hypargphrase = hypsrlgraph_m.get_role_fillers(hypargid);
-                  std::pair<double, double> argsim =
-                     (*phrasesim)(refargphrase, hypargphrase, yisi::REF_MODE);
+                  auto hypargphrase = hypsrlgraph_m.get_role_filler_units(hypargid);
+                  std::pair<double, double> argsim;
+                  if (refsrlgraph_m[refid].get_sent_type() != "uemb" ||  hypsrlgraph_m.get_sent_type() != "uemb") {
+                     argsim = (*phrasesim)(refargphrase, hypargphrase, yisi::REF_MODE);
+                  } else {
+                     auto rargemb = refsrlgraph_m[refid].get_role_filler_embs(refargid);
+                     auto hargemb = hypsrlgraph_m.get_role_filler_embs(hypargid);
+                     argsim = (*phrasesim)(refargphrase, hypargphrase, rargemb, hargemb, yisi::REF_MODE);
+                  }
                   argmatch.add_weight(refargid, hypargid, argsim.first);
                } // for jt
             } // for it
@@ -190,26 +217,38 @@ namespace yisi {
                auto asim = ar[j].second;
                if (hypalignment_m.find(aligned_hyp_arg) == hypalignment_m.end()) {
                   hypalignment_m[aligned_hyp_arg] =
-                                 std::vector<std::pair<int, alignment_type> >();
+                     std::vector<std::pair<int, alignment_type> >();
                }
                hypalignment_m[aligned_hyp_arg].push_back(std::make_pair(refid,
-                                          alignment_type(aligned_ref_arg, asim)));
+                  alignment_type(aligned_ref_arg, asim)));
             }  // for j
          } // for i
       } // for refid
       //input
       if (inp_b) {
-         auto r = inpsrlgraph_m.get_sentence();
-         auto h = hypsrlgraph_m.get_sentence();
-         std::pair<double, double> sentsim = (*phrasesim)(r, h, yisi::INP_MODE);
+         //std::cerr << "first align the sentence node of inp: ";
          auto inproot = inpsrlgraph_m.get_root();
          auto hyproot = hypsrlgraph_m.get_root();
+         auto r = inpsrlgraph_m.get_role_filler_units(inproot);
+         //std::cerr<< r.size();
+         auto h = hypsrlgraph_m.get_role_filler_units(hyproot);
+         //std::cerr<< h.size();
+         std::pair<double, double> sentsim;
+         if (inpsrlgraph_m.get_sent_type() != "uemb" ||  hypsrlgraph_m.get_sent_type() != "uemb") {
+            sentsim = (*phrasesim)(r, h, yisi::INP_MODE);
+         } else {
+            auto remb = inpsrlgraph_m.get_role_filler_embs(inproot);
+            auto hemb = hypsrlgraph_m.get_role_filler_embs(hyproot);
+            //std::cerr<< remb.size() <<" " <<hemb.size();
+            sentsim = (*phrasesim)(r, h, remb, hemb, yisi::INP_MODE);
+         }
+         //std::cerr << "sentsim = (" << sentsim.first << "," << sentsim.second << ")";
          inpalignment_m[inproot] = alignment_type(hyproot, sentsim.second);
          if (hypalignment_m.find(hyproot) == hypalignment_m.end()) {
             hypalignment_m[hyproot] = std::vector<std::pair<int, alignment_type> >();
          }
          hypalignment_m[hyproot].push_back(std::make_pair((int)refsrlgraph_m.size(),
-                                          alignment_type(inproot, sentsim.first)));
+            alignment_type(inproot, sentsim.first)));
          auto inppreds = inpsrlgraph_m.get_preds();
          auto hyppreds = hypsrlgraph_m.get_preds();
          maxmatching_t inppredmatch;
@@ -218,14 +257,20 @@ namespace yisi {
             auto inppredid = *it;
             auto inppredspan = inpsrlgraph_m.get_role_span(inppredid);
             if (inppredspan.first != inppredspan.second) {
-               auto inppredphrase = inpsrlgraph_m.get_role_fillers(inppredid);
+               auto inppredphrase = inpsrlgraph_m.get_role_filler_units(inppredid);
                for (auto jt = hyppreds.begin(); jt != hyppreds.end(); jt++) {
                   auto hyppredid = *jt;
                   auto hyppredspan = hypsrlgraph_m.get_role_span(hyppredid);
                   if (hyppredspan.first != hyppredspan.second) {
-                     auto hyppredphrase = hypsrlgraph_m.get_role_fillers(hyppredid);
-                     std::pair<double, double> predsim =
-                        (*phrasesim)(inppredphrase, hyppredphrase, yisi::INP_MODE);
+                     auto hyppredphrase = hypsrlgraph_m.get_role_filler_units(hyppredid);
+                     std::pair<double, double> predsim;
+                     if (inpsrlgraph_m.get_sent_type() != "uemb" ||  hypsrlgraph_m.get_sent_type() != "uemb") {
+                        predsim = (*phrasesim)(inppredphrase, hyppredphrase, yisi::INP_MODE);
+                     } else {
+                        auto ipredemb = inpsrlgraph_m.get_role_filler_embs(inppredid);
+                        auto hpredemb = hypsrlgraph_m.get_role_filler_embs(hyppredid);
+                        predsim = (*phrasesim)(inppredphrase, hyppredphrase, ipredemb, hpredemb, yisi::INP_MODE);
+                     }
                      inppredmatch.add_weight(inppredid, hyppredid, predsim.second);
                      hyppredmatch.add_weight(inppredid, hyppredid, predsim.first);
                   }
@@ -244,12 +289,18 @@ namespace yisi {
             maxmatching_t argmatch;
             for (auto it = inpargs.begin(); it != inpargs.end(); it++) {
                auto inpargid = *it;
-               auto inpargphrase = inpsrlgraph_m.get_role_fillers(inpargid);
+               auto inpargphrase = inpsrlgraph_m.get_role_filler_units(inpargid);
                for (auto jt = hypargs.begin(); jt != hypargs.end(); jt++) {
                   auto hypargid = *jt;
-                  auto hypargphrase = hypsrlgraph_m.get_role_fillers(hypargid);
-                  std::pair<double, double> argsim =
-                     (*phrasesim)(inpargphrase, hypargphrase, yisi::INP_MODE);
+                  auto hypargphrase = hypsrlgraph_m.get_role_filler_units(hypargid);
+                  std::pair<double, double> argsim;
+                  if (inpsrlgraph_m.get_sent_type() != "uemb" ||  hypsrlgraph_m.get_sent_type() != "uemb") {
+                     argsim = (*phrasesim)(inpargphrase, hypargphrase, yisi::INP_MODE);
+                  } else {
+                     auto iargemb = inpsrlgraph_m.get_role_filler_embs(inpargid);
+                     auto hargemb = hypsrlgraph_m.get_role_filler_embs(hypargid);
+                     argsim = (*phrasesim)(inpargphrase, hypargphrase, iargemb, hargemb, yisi::INP_MODE);
+                  }
                   argmatch.add_weight(inpargid, hypargid, argsim.second);
                }
             }
@@ -267,21 +318,27 @@ namespace yisi {
             auto psim = hpr[i].second;
             if (hypalignment_m.find(aligned_hyp_pred) == hypalignment_m.end()) {
                hypalignment_m[aligned_hyp_pred] =
-                                 std::vector<std::pair<int, alignment_type> >();
+                  std::vector<std::pair<int, alignment_type> >();
             }
             hypalignment_m[aligned_hyp_pred].push_back(std::make_pair((int)refsrlgraph_m.size(),
-                                                alignment_type(aligned_inp_pred, psim)));
+               alignment_type(aligned_inp_pred, psim)));
             auto inpargs = inpsrlgraph_m.get_args(aligned_inp_pred);
             auto hypargs = hypsrlgraph_m.get_args(aligned_hyp_pred);
             maxmatching_t argmatch;
             for (auto it = inpargs.begin(); it != inpargs.end(); it++) {
                auto inpargid = *it;
-               auto inpargphrase = inpsrlgraph_m.get_role_fillers(inpargid);
+               auto inpargphrase = inpsrlgraph_m.get_role_filler_units(inpargid);
                for (auto jt = hypargs.begin(); jt != hypargs.end(); jt++) {
                   auto hypargid = *jt;
-                  auto hypargphrase = hypsrlgraph_m.get_role_fillers(hypargid);
-                  std::pair<double, double> argsim =
-                     (*phrasesim)(inpargphrase, hypargphrase, yisi::INP_MODE);
+                  auto hypargphrase = hypsrlgraph_m.get_role_filler_units(hypargid);
+                  std::pair<double, double> argsim;
+                  if (inpsrlgraph_m.get_sent_type() != "uemb" ||  hypsrlgraph_m.get_sent_type() != "uemb") {
+                     argsim = (*phrasesim)(inpargphrase, hypargphrase, yisi::INP_MODE);
+                  } else {
+                     auto iargemb = inpsrlgraph_m.get_role_filler_embs(inpargid);
+                     auto hargemb = hypsrlgraph_m.get_role_filler_embs(hypargid);
+                     argsim = (*phrasesim)(inpargphrase, hypargphrase, iargemb, hargemb, yisi::INP_MODE);
+                  }
                   argmatch.add_weight(inpargid, hypargid, argsim.first);
                }
             }
@@ -292,17 +349,17 @@ namespace yisi {
                auto asim = ar[j].second;
                if (hypalignment_m.find(aligned_hyp_arg) == hypalignment_m.end()) {
                   hypalignment_m[aligned_hyp_arg] =
-                                 std::vector<std::pair<int, alignment_type> >();
+                     std::vector<std::pair<int, alignment_type> >();
                }
                hypalignment_m[aligned_hyp_arg].push_back(std::make_pair((int)refsrlgraph_m.size(),
-                                                alignment_type(aligned_inp_arg, asim)));
+                  alignment_type(aligned_inp_arg, asim)));
             }
          }
       }
    } // align
 
    std::ostream& operator<<(std::ostream& os, const yisi::yisigraph_t& m);
-
+  
 } // yisi
 
 
diff --git a/src/yisiscorer.h b/src/yisiscorer.h
index 94d4d2d..9b97e11 100644
--- a/src/yisiscorer.h
+++ b/src/yisiscorer.h
@@ -30,674 +30,676 @@
 
 namespace yisi {
 
-  struct yisi_options {
-    std::string inpsrl_name_m;
-    std::string inpsrl_path_m;
-    std::string refsrl_name_m;
-    std::string refsrl_path_m;
-    std::string hypsrl_name_m;
-    std::string hypsrl_path_m;
-    std::string labelconfig_path_m;
-    std::string weightconfig_path_m;
-    std::string frameweight_name_m;
-    
-    double alpha_m;
-    double beta_m;
-    
-    void init(com::masaers::cmdlp::parser& p) {
-      using namespace com::masaers::cmdlp;
-      
-      p.add(make_knob(inpsrl_name_m))
-	.fallback("")
-	.desc("Type of input language SRL: [read|mate]")
-	.name("inpsrl-type")
-	;
-      p.add(make_knob(inpsrl_path_m))
-	.fallback("")
-	.desc("[read: path to assert formated parse of input sentences "
-	       "| mate: full path and filename of <srclang>.mplsconfig]")
-	.name("inpsrl-path")
-	;
-      p.add(make_knob(hypsrl_name_m))
-	.fallback("")
-	.desc("Type of output language SRL: [read|mate]")
-	.name("outsrl-type")
-	.name("hypsrl-type")
-	.name("srl-type")
-	;
-      p.add(make_knob(hypsrl_path_m))
-	.fallback("")
-	.desc("[read: path to assert formatted parse output "
-	      "| mate: full path and filename of <tgtlang>.mplsconfig]")
-	.name("outsrl-path")
-	.name("hypsrl-path")
-	.name("srl-path")
-	;
-      p.add(make_knob(refsrl_name_m))
-        .fallback("")
-        .desc("Type of reference SRL (specify only if it is different from the hypothesis SRL): [read|mate]")
-        .name("refsrl-type")
-        ;
-      p.add(make_knob(refsrl_path_m))
-        .fallback("")
-        .desc("[read: path to assert formatted parse reference "
-              "| mate: full path and filename of <tgtlang>.mplsconfig]")
-        .name("refsrl-path")
-        ;
-      p.add(make_knob(labelconfig_path_m))
-	.fallback("")
-	.desc("Path to YiSi SRL role label config file")
-	.name("labelconfig-path")
-	;
-      p.add(make_knob(weightconfig_path_m))
-	.fallback("")
-	.desc("Path to YiSi SRL role label config file (default: "
-	      "<empty string> to use YiSi unsupervised estimation of weight")
-	.name("weightconfig-path")
-	;
-      p.add(make_knob(frameweight_name_m))
-	.fallback("coverage")
-	.desc("Type of frame weight function: [uniform|coverage(default)]")
-	.name("frameweight-type")
-	;
-      p.add(make_knob(beta_m))
-	.fallback(0.0)
-	.desc("Beta value of YiSi [0.0(default)]")
-	.name("beta")
-	;
-      p.add(make_knob(alpha_m))
-	.fallback(0.5)
-	.desc("Ratio of precision & recall in YiSi")
-	.name("alpha")
-	;
-    }
-  }; // struct yisi_options
-  
-  template<class opt_T>
-  class yisiscorer_t {
-  public:
-    typedef opt_T opt_type;
-
-    yisiscorer_t() {}
-
-    yisiscorer_t(opt_T opt) {
-      alpha_m = opt.alpha_m;
-      frameweight_name_m = opt.frameweight_name_m;
-      alpha_m = opt.alpha_m;
-      beta_m = opt.beta_m;
-      
-      int i = 0;
-      if (opt.labelconfig_path_m != "") {
-	std::cerr << "Reading labelconfig from " << opt.labelconfig_path_m << " ... ";
-	std::ifstream LBL(opt.labelconfig_path_m.c_str());
-	if (!LBL) {
-	  std::cerr << "ERROR: Failed to open labelconfig. Exiting..." << std::endl;
-	  exit(1);
-	}
-	while (!LBL.eof()) {
-	  std::string line;
-	  getline(LBL, line);
-	  if (line != "") {
-	    std::istringstream iss(line);
-	    while (!iss.eof()) {
-	      std::string label;
-	      iss >> label;
-	      label_m[label] = i;
-	    }
-	    i++;
-	  }
-	}
-	LBL.close();
-	std::cerr << "Done." << std::endl;
-      }
-      
-      weightconfig_path_m = opt.weightconfig_path_m;
-      if (weightconfig_path_m != ""
-	  && weightconfig_path_m != "lexweight"
-	  && weightconfig_path_m != "uniform") {
-	std::cerr << "Reading weightconfig from " << opt.weightconfig_path_m << " ... ";
-	std::ifstream W(weightconfig_path_m.c_str());
-	if (!W) {
-	  std::cerr << "ERROR: Failed to open weightconfig. Exiting..." << std::endl;
-	  exit(1);
-	}
-	while (!W.eof()) {
-	  double w;
-	  W >> w;
-	  weight_m.push_back(w);
-	}
-	W.close();
-	std::cerr << "Done." << std::endl;
-	if ((int)weight_m.size() != i) {
-	  std::cerr << "ERROR: Number of weights in weightconfig does not match "
-		    << "with number of lines in labelconfig. Exiting..." << std::endl;
-	  exit(1);
-	}
-      } else {
-	for (int j = 0; j < i; j++) {
-	  weight_m.push_back(1.0);
-	}
-      }
-      
-      phrasesim_p = new phrasesim_t<opt_type>(opt);
-      hypsrl_p = new srl_t(opt.hypsrl_name_m, opt.hypsrl_path_m);
-      hypsrl_name_m = opt.hypsrl_name_m;
-      if (opt.refsrl_name_m != ""){
-	refsrl_p = new srl_t(opt.refsrl_name_m, opt.refsrl_path_m);
-      } else {
-	refsrl_p = hypsrl_p;;
+   struct yisi_options {
+      std::string inpsrl_name_m;
+      std::string inpsrl_path_m;
+      std::string refsrl_name_m;
+      std::string refsrl_path_m;
+      std::string hypsrl_name_m;
+      std::string hypsrl_path_m;
+
+      std::string labelconfig_path_m;
+      std::string weightconfig_path_m;
+      std::string frameweight_name_m;
+
+      double alpha_m;
+      double beta_m;
+
+      void init(com::masaers::cmdlp::parser& p) {
+         using namespace com::masaers::cmdlp;
+
+         p.add(make_knob(inpsrl_name_m))
+	   .fallback("")
+	   .desc("Type of input language SRL: [read|mate]")
+	   .name("inpsrl-type")
+	   ;
+         p.add(make_knob(inpsrl_path_m))
+	   .fallback("")
+	   .desc("[read: path to assert formatted parse of input sentences "
+	         "| mate: full path and filename of <srclang>.mplsconfig]")
+	   .name("inpsrl-path")
+	   ;
+         p.add(make_knob(hypsrl_name_m))
+	   .fallback("")
+	   .desc("Type of output language SRL: [read|mate]")
+	   .name("outsrl-type")
+	   .name("hypsrl-type")
+	   .name("srl-type")
+	   ;
+         p.add(make_knob(hypsrl_path_m))
+	   .fallback("")
+	   .desc("[read: path to assert formatted parse output "
+	         "| mate: full path and filename of <tgtlang>.mplsconfig]")
+	   .name("outsrl-path")
+	   .name("hypsrl-path")
+	   .name("srl-path")
+	   ;
+         p.add(make_knob(refsrl_name_m))
+           .fallback("")
+           .desc("Type of reference SRL (specify only if it is different from "
+                 "the hypothesis SRL): [read|mate]")
+           .name("refsrl-type")
+           ;
+         p.add(make_knob(refsrl_path_m))
+           .fallback("")
+           .desc("[read: path to assert formatted parse reference "
+                 "| mate: full path and filename of <tgtlang>.mplsconfig]")
+           .name("refsrl-path")
+           ;
+         p.add(make_knob(labelconfig_path_m))
+	   .fallback("")
+	   .desc("Path to YiSi SRL role label config file")
+	   .name("labelconfig-path")
+	   ;
+         p.add(make_knob(weightconfig_path_m))
+	   .fallback("")
+	   .desc("Path to YiSi SRL role weight config file (default: "
+	         "<empty string> to use YiSi unsupervised estimation of weights)")
+	   .name("weightconfig-path")
+	   ;
+         p.add(make_knob(frameweight_name_m))
+	   .fallback("coverage")
+	   .desc("Type of frame weight function: [uniform|coverage(default)]")
+	   .name("frameweight-type")
+	   ;
+         p.add(make_knob(beta_m))
+	   .fallback(0.0)
+	   .desc("Beta value of YiSi [0.0(default)]")
+	   .name("beta")
+	   ;
+         p.add(make_knob(alpha_m))
+	   .fallback(0.5)
+	   .desc("Ratio of precision & recall in YiSi")
+	   .name("alpha")
+	   ;
       }
-      refsrl_name_m = opt.refsrl_name_m;
-      inpsrl_p = new srl_t(opt.inpsrl_name_m, opt.inpsrl_path_m);
-      inpsrl_name_m = opt.inpsrl_name_m;
-    } // yisiscorer_t
-    
-    ~yisiscorer_t() {
-      if (phrasesim_p != NULL) {
-	delete phrasesim_p;
-	phrasesim_p = NULL;
+   }; // struct yisi_options
+
+   template<class opt_T>
+   class yisiscorer_t {
+   public:
+      typedef opt_T opt_type;
+
+      yisiscorer_t() {}
+
+      yisiscorer_t(opt_T opt) {
+         alpha_m = opt.alpha_m;
+         frameweight_name_m = opt.frameweight_name_m;
+         alpha_m = opt.alpha_m;
+         beta_m = opt.beta_m;
+
+         int i = 0;
+         if (opt.labelconfig_path_m != "") {
+            std::cerr << "Reading labelconfig from " << opt.labelconfig_path_m << " ... ";
+            std::ifstream LBL(opt.labelconfig_path_m.c_str());
+            if (!LBL) {
+               std::cerr << "ERROR: Failed to open labelconfig. Exiting..." << std::endl;
+               exit(1);
+            }
+            while (!LBL.eof()) {
+               std::string line;
+               getline(LBL, line);
+               if (line != "") {
+                  std::istringstream iss(line);
+                  while (!iss.eof()) {
+                     std::string label;
+                     iss >> label;
+                     label_m[label] = i;
+                  }
+                  i++;
+               }
+            }
+            LBL.close();
+            std::cerr << "Done." << std::endl;
+         }
+
+         weightconfig_path_m = opt.weightconfig_path_m;
+         if (weightconfig_path_m != ""
+               && weightconfig_path_m != "lexweight"
+               && weightconfig_path_m != "uniform") {
+            std::cerr << "Reading weightconfig from " << opt.weightconfig_path_m << " ... ";
+            std::ifstream W(weightconfig_path_m.c_str());
+            if (!W) {
+               std::cerr << "ERROR: Failed to open weightconfig. Exiting..." << std::endl;
+               exit(1);
+            }
+            while (!W.eof()) {
+               double w;
+               W >> w;
+               weight_m.push_back(w);
+            }
+            W.close();
+            std::cerr << "Done." << std::endl;
+            if ((int)weight_m.size() != i) {
+               std::cerr << "ERROR: Number of weights in weightconfig does not match "
+                  << "the number of lines in labelconfig. Exiting..." << std::endl;
+               exit(1);
+            }
+         } else {
+            for (int j = 0; j < i; j++) {
+               weight_m.push_back(1.0);
+            }
+         }
+
+         phrasesim_p = new phrasesim_t<opt_type>(opt);
+         hypsrl_p = new srl_t(opt.hypsrl_name_m, opt.hypsrl_path_m);
+         hypsrl_name_m = opt.hypsrl_name_m;
+         if (opt.refsrl_name_m != "") {
+            refsrl_p = new srl_t(opt.refsrl_name_m, opt.refsrl_path_m);
+         } else {
+            refsrl_p = hypsrl_p;
+         }
+         refsrl_name_m = opt.refsrl_name_m;
+         inpsrl_p = new srl_t(opt.inpsrl_name_m, opt.inpsrl_path_m);
+         inpsrl_name_m = opt.inpsrl_name_m;
+      } // yisiscorer_t
+
+      ~yisiscorer_t() {
+         if (phrasesim_p != NULL) {
+            delete phrasesim_p;
+            phrasesim_p = NULL;
+         }
+         if (inpsrl_p != NULL) {
+            delete inpsrl_p;
+            inpsrl_p = NULL;
+         }
+         if (hypsrl_p != NULL) {
+            delete hypsrl_p;
+            hypsrl_p = NULL;
+            if (refsrl_name_m == "") {
+               refsrl_p = NULL;
+            }
+         }
+         if (refsrl_p != NULL) {
+            delete refsrl_p;
+            refsrl_p = NULL;
+         }
       }
-      if (inpsrl_p != NULL) {
-	delete inpsrl_p;
-	inpsrl_p = NULL;
+
+      void writecache() {
+         phrasesim_p->writecache();
       }
-      if (hypsrl_p != NULL) {
-	delete hypsrl_p;
-	hypsrl_p = NULL;
-	if (refsrl_name_m == ""){
-	  refsrl_p = NULL;
-	}
+
+      void readcache() {
+         phrasesim_p->readcache();
       }
-      if (refsrl_p != NULL) {
-	delete refsrl_p;
-	refsrl_p = NULL;
+
+      void estimate_weight(std::vector<srlgraph_t> srls) {
+         for (auto it = srls.begin(); it != srls.end(); it++) {
+            auto preds = it->get_preds();
+            for (auto jt = preds.begin(); jt != preds.end(); jt++) {
+               auto pred_label = it->get_role_label(*jt);
+               if (label_m.find(pred_label) == label_m.end()) {
+                  std::cerr << "ERROR: Unknown predicate label '" << pred_label
+                     << "'. Check your labelconfig. Exiting..." << std::endl;
+                  exit(1);
+               }
+               weight_m[label_m[pred_label]] += 0.25;
+               auto args = it->get_args(*jt);
+               for (auto kt = args.begin(); kt != args.end(); kt++) {
+                  auto arg_label = it->get_role_label(*kt);
+                  if (label_m.find(arg_label) == label_m.end()) {
+                     std::cerr << "ERROR: Unknown argument label '" << arg_label
+                        << "'. Check your labelconfig. Exiting..." << std::endl;
+                     exit(1);
+                  }
+                  weight_m[label_m[arg_label]] += 1.0;
+               }
+            }
+         }
       }
-    }
-    
-    void writecache() {
-      phrasesim_p->writecache();
-    }
-    
-    void readcache() {
-      phrasesim_p->readcache();
-    }
-    
-    void estimate_weight(std::vector<srlgraph_t> srls) {
-      for (auto it = srls.begin(); it != srls.end(); it++) {
-	auto preds = it->get_preds();
-	for (auto jt = preds.begin(); jt != preds.end(); jt++) {
-	  auto pred_label = it->get_role_label(*jt);
-	  if (label_m.find(pred_label) == label_m.end()) {
-	    std::cerr << "ERROR: Unknown predicate label '" << pred_label
-		      << "'. Check your labelconfig. Exiting..." << std::endl;
-	    exit(1);
-	  }
-	  weight_m[label_m[pred_label]] += 0.25;
-	  auto args = it->get_args(*jt);
-	  for (auto kt = args.begin(); kt != args.end(); kt++) {
-	    auto arg_label = it->get_role_label(*kt);
-	    if (label_m.find(arg_label) == label_m.end()) {
-	      std::cerr << "ERROR: Unknown argument label '" << arg_label
-			<< "'. Check your labelconfig. Exiting..." << std::endl;
-	      exit(1);
-	    }
-	    weight_m[label_m[arg_label]] += 1.0;
-	  }
-	}
+
+      void estimate_weight(std::vector<std::vector<srlgraph_t> > msrls) {
+         for (auto it = msrls.begin(); it != msrls.end(); it++) {
+            estimate_weight(*it);
+         }
       }
-    }
-    
-    void estimate_weight(std::vector<std::vector<srlgraph_t> > msrls) {
-      for (auto it = msrls.begin(); it != msrls.end(); it++) {
-	estimate_weight(*it);
+
+      std::vector<srlgraph_t> inpsrlparse(std::vector<sent_t*> inpsents) {
+         //std::cerr << "Tokenizing/SRL-ing the input ...";
+         std::vector<srlgraph_t> result = inpsrl_p->parse(inpsents);
+         //std::cerr << "Done." << std::endl;
+         if (weightconfig_path_m == "") {
+            this->estimate_weight(result);
+         }
+         return result;
       }
-    }
-    
-    std::vector<srlgraph_t> inpsrlparse(std::vector<std::string> inpsents) {
-      //std::cerr << "Tokenizing/SRL-ing the input ...";
-      std::vector<srlgraph_t> result = inpsrl_p->parse(inpsents);
-      //std::cerr << "Done." << std::endl;
-      if (weightconfig_path_m == "") {
-	this->estimate_weight(result);
+
+      std::vector<srlgraph_t> refsrlparse(std::vector<sent_t*> refsents) {
+         //std::cerr << "Tokenizing/SRL-ing the references ... ";
+         std::vector<srlgraph_t> result = refsrl_p->parse(refsents);
+         //std::cerr << "Done." << std::endl;
+         if (weightconfig_path_m == "") {
+            this->estimate_weight(result);
+         }
+         return result;
       }
-      return result;
-    }
-    
-    std::vector<srlgraph_t> refsrlparse(std::vector<std::string> refsents) {
-      //std::cerr << "Tokenizing/SRL-ing the references ... ";
-      std::vector<srlgraph_t> result = refsrl_p->parse(refsents);
-      //std::cerr << "Done." << std::endl;
-      if (weightconfig_path_m == "") {
-	this->estimate_weight(result);
+
+      std::vector<srlgraph_t> hypsrlparse(std::vector<sent_t*> hypsents) {
+         //std::cerr << "Tokenizing/SRL-ing the hypotheses ... ";
+         std::vector<srlgraph_t> result = hypsrl_p->parse(hypsents);
+         //std::cerr << "Done." << std::endl;
+         return result;
       }
-      return result;
-    }
-      
-    std::vector<srlgraph_t> hypsrlparse(std::vector<std::string> hypsents) {
-      //std::cerr << "Tokenizing/SRL-ing the hypotheses ... ";
-      std::vector<srlgraph_t> result = hypsrl_p->parse(hypsents);
-      //std::cerr << "Done." << std::endl;
-      return result;
-    }
-    
-    srlgraph_t hypsrlparse(std::string hypsent) {
-      //std::cerr <<"Tokenizing/SRL-ing the hypothesis ... ";
-      srlgraph_t result = hypsrl_p->parse(hypsent);
-      //std::cerr << "Done." << std::endl;
-      return result;
-    }
-    
-    yisigraph_t align(const std::vector<srlgraph_t> refsrlgraph, const srlgraph_t hypsrlgraph) {
-      //std::cerr << "Creating YiSi graph ... ";
-      yisigraph_t result(refsrlgraph, hypsrlgraph);
-      //std::cerr << "start aligning ... ";
-      result.align(phrasesim_p);
-      //result.print(std::cerr);
-      //std::cerr << "Done." << std::endl;
-      return result;
-    }
-    
-    yisigraph_t align(const std::vector<srlgraph_t> refsrlgraph,
-		      const srlgraph_t hypsrlgraph, const srlgraph_t inpsrlgraph) {
-      //std::cerr << "Creating YiSi graph with input... ";
-      yisigraph_t result(refsrlgraph, hypsrlgraph, inpsrlgraph);
-      //std::cerr << "start aligning ... ";
-      result.align(phrasesim_p);
-      //result.print(std::cerr);
-      //std::cerr << "Done." << std::endl;
-      return result;
-    };
-    
-    double score(yisigraph_t& yg) {
-      double precision = score(yg, yisi::HYP_MODE);
-      double recall = score(yg, yisi::REF_MODE);
-      double yisi = 0.0;
-      if (precision == 0.0 || recall == 0.0) {
-	yisi = 0.0;
-      } else {
-	yisi = (precision * recall) / (alpha_m * precision + (1.0 - alpha_m) * recall);
+
+      srlgraph_t hypsrlparse(sent_t* hypsent) {
+         //std::cerr <<"Tokenizing/SRL-ing the hypothesis ... ";
+         srlgraph_t result = hypsrl_p->parse(hypsent);
+         //std::cerr << "Done." << std::endl;
+         return result;
       }
-      return yisi;
-      //double flat = yg.get_sentsim();
-      //if (mode_m == "flat") {
-      //   return flat;
-      //} else {
-      //   //std::cerr<<"Computing YiSi precision ... ";
-      //   double precision = score(yg, yisi::HYP_MODE);
-      //   //std::cerr<<"Done."<<std::endl;
-      //   //std::cerr<<"Computing YiSi recall ... ";
-      //   double recall = score(yg, yisi::REF_MODE);
-      //   //std::cerr<<"Done."<<std::endl;
-      //   double yisi = 0.0;
-      //   if (precision == 0.0 || recall == 0.0) {
-      //      yisi = 0.0;
-      //   } else {
-      //      yisi = (precision * recall)
-      //             / (alpha_m * precision + (1.0 - alpha_m) * recall);
-      
-      //      if (prfunc_name_m == "f" || prfunc_name_m == "lexexp") {
-      //         yisi = (precision * recall)
-      //                / (alpha_m * precision + (1.0 - alpha_m) * recall);
-      //      } else if (prfunc_name_m == "max") {
-      //         yisi = std::max(precision, recall);
-      //      } else {
-      //         std::cerr
-      //            << "ERROR: unknown precision/recall agg function name. Exiting ..."
-      //            << std::endl;
-      //         exit(1);
-      //      }
-      //   }
-      //   if (mode_m == "yisi" || mode_m == "yisi_flat"
-      //      || mode_m == "features") {
-      //      return yisi;
-      //   } else if (mode_m == "yisi+float") {
-      //      return (yisi + flat) / 2.0;
-      //   } else {
-      //      double w = std::atof(mode_m.c_str());
-      //      return w * yisi + (1 - w) * flat;
-      //   }
-      //}
-    }
-    
-    std::vector<double> features(yisigraph_t& yg) {
-      std::vector<double> result;
-      //double flat =  yg.get_sentsim();
-      //result.push_back(flat);
-      //result.push_back(score(yg));
-      std::vector<double> precision = features(yg, yisi::HYP_MODE);
-      std::vector<double> recall = features(yg, yisi::REF_MODE);
-      for (auto it = precision.begin(); it != precision.end(); it++) {
-	result.push_back(*it);
+
+      yisigraph_t align(const std::vector<srlgraph_t> refsrlgraph, const srlgraph_t hypsrlgraph) {
+         //std::cerr << "Creating YiSi graph ... ";
+         yisigraph_t result(refsrlgraph, hypsrlgraph);
+         //std::cerr << "start aligning ... ";
+         result.align(phrasesim_p);
+         //result.print(std::cerr);
+         //std::cerr << "Done." << std::endl;
+         return result;
       }
-      for (auto it = recall.begin(); it != recall.end(); it++) {
-	result.push_back(*it);
+
+      yisigraph_t align(const std::vector<srlgraph_t> refsrlgraph,
+                        const srlgraph_t hypsrlgraph, const srlgraph_t inpsrlgraph) {
+         //std::cerr << "Creating YiSi graph with input... ";
+         yisigraph_t result(refsrlgraph, hypsrlgraph, inpsrlgraph);
+         //std::cerr << "start aligning ... ";
+         result.align(phrasesim_p);
+         //result.print(std::cerr);
+         //std::cerr << "Done." << std::endl;
+         return result;
+      }
+
+      double score(yisigraph_t& yg) {
+         double precision = score(yg, yisi::HYP_MODE);
+         double recall = score(yg, yisi::REF_MODE);
+         double yisi = 0.0;
+         if (precision == 0.0 || recall == 0.0) {
+            yisi = 0.0;
+         } else {
+            yisi = (precision * recall) / (alpha_m * precision + (1.0 - alpha_m) * recall);
+         }
+         return yisi;
+         //double flat = yg.get_sentsim();
+         //if (mode_m == "flat") {
+         //   return flat;
+         //} else {
+         //   //std::cerr<<"Computing YiSi precision ... ";
+         //   double precision = score(yg, yisi::HYP_MODE);
+         //   //std::cerr<<"Done."<<std::endl;
+         //   //std::cerr<<"Computing YiSi recall ... ";
+         //   double recall = score(yg, yisi::REF_MODE);
+         //   //std::cerr<<"Done."<<std::endl;
+         //   double yisi = 0.0;
+         //   if (precision == 0.0 || recall == 0.0) {
+         //      yisi = 0.0;
+         //   } else {
+         //      yisi = (precision * recall)
+         //             / (alpha_m * precision + (1.0 - alpha_m) * recall);
+
+         //      if (prfunc_name_m == "f" || prfunc_name_m == "lexexp") {
+         //         yisi = (precision * recall)
+         //                / (alpha_m * precision + (1.0 - alpha_m) * recall);
+         //      } else if (prfunc_name_m == "max") {
+         //         yisi = std::max(precision, recall);
+         //      } else {
+         //         std::cerr
+         //            << "ERROR: unknown precision/recall agg function name. Exiting ..."
+         //            << std::endl;
+         //         exit(1);
+         //      }
+         //   }
+         //   if (mode_m == "yisi" || mode_m == "yisi_flat"
+         //      || mode_m == "features") {
+         //      return yisi;
+         //   } else if (mode_m == "yisi+float") {
+         //      return (yisi + flat) / 2.0;
+         //   } else {
+         //      double w = std::atof(mode_m.c_str());
+         //      return w * yisi + (1 - w) * flat;
+         //   }
+         //}
       }
-      return result;
-    }
-    
-  private:
-    double score(yisigraph_t yg, int mode) {
-      //std::cerr <<"Scoring...";
-      auto f = features(yg, mode);
-      double structure = f[weight_m.size()];
-      double flat = f[weight_m.size() + 1];
-      //std::cerr <<"(" << beta_m <<"," <<structure <<"," <<flat <<")";
-      //std::cerr <<"Done."<<std::endl;
-      return beta_m * structure + (1 - beta_m) * flat;
-      
-      //double nom = 0.0;
-      //double denom = 0.0;
-      //if (yg.get_sentlength(mode) == 0.0) {
-      //   return 0.0;
-      //}
-      //if (mode_m == "yisi_flat") {
-      //   if (frameweight_name_m == "coverage") {
-      //      nom += yg.get_sentlength(mode) * yg.get_sentsim();
-      //      denom += yg.get_sentlength(mode);
-      //   } else {
-      //      nom += yg.get_sentsim();
-      //      denom += 1;
-      //   }
-      //}
-      
-      //auto preds = yg.get_preds(mode);
-      
-      //for (auto it = preds.begin(); it != preds.end(); it++) {
-      //   auto predid = *it;
-      //   double sanity_check = yg.get_rolespanlength(predid, mode);
-      //   double predsim = yg.get_alignsim(predid, mode);
-      //   auto predlabel = yg.get_rolelabel(predid, mode);
-      //   double predweight = get_roleweight(yg, predid, mode);
-      
-      //   if (sanity_check > 0) {
-      //      // if (prfunc_name_m=="f" || prfunc_name_m=="max"){
-      //      double fw = yg.get_rolespanlength(predid, mode);
-      //      double fn = 0.0;
-      //      if (predsim >= rolesim_threshold_m) {
-      //         fn = predweight * predsim;
-      //      }
-      //      double fd = predweight;
-      //      auto args = yg.get_args(predid, mode);
-      //      for (auto jt = args.begin(); jt != args.end(); jt++) {
-      //         auto argid = *jt;
-      //         fw += yg.get_rolespanlength(argid, mode);
-      
-      //         auto arglabel = yg.get_rolelabel(argid, mode);
-      //         auto alignlabel = yg.get_alignlabel(argid, mode);
-      //         double argsim = yg.get_alignsim(argid, mode);
-      //         double argweight = get_roleweight(yg, argid, mode);
-      //         if (argsim >= rolesim_threshold_m
-      //            && match(arglabel, alignlabel)) {
-      //            fn += argweight * argsim;
-      //         }
-      //         fd += argweight;
-      //      }
-      //      if (fn > 0 && fd > 0) {
-      //         if (frameweight_name_m == "coverage") {
-      //            nom += fw * (fn / fd);
-      //         } else {
-      //            nom += fn / fd;
-      //         }
-      //      }
-      //      if (frameweight_name_m == "coverage") {
-      //         denom += fw;
-      //      } else {
-      //         denom += 1;
-      //      }
-      //   } else {
-      //      if (predsim >= rolesim_threshold_m) {
-      //         nom = predweight * predsim;
-      //      }
-      //      denom += predweight;
-      //      auto args = yg.get_args(predid, mode);
-      //      for (auto jt = args.begin(); jt != args.end(); jt++) {
-      //         auto argid = *jt;
-      //         auto arglabel = yg.get_rolelabel(argid, mode);
-      //         auto alignlabel = yg.get_alignlabel(argid, mode);
-      //         double argsim = yg.get_alignsim(argid, mode);
-      //         double argweight = get_roleweight(yg, argid, mode);
-      //         if (argsim >= rolesim_threshold_m
-      //            && match(arglabel, alignlabel)) {
-      //            nom += argweight * argsim;
-      //         }
-      //         denom += argweight;
-      //      }
-      
-      //   }
-      
-      //}
-      //}
-      //if (nom > 0 && denom > 0) {
-      //   return nom/denom;
-      //} else {
-      //   return 0.0;
-      //}
-    }
-    
-    std::vector<double> features(yisigraph_t yg, int mode) {
-      if (mode == yisi::REF_MODE) {
-	return rfeatures(yg);
-      } else {
-	return pfeatures(yg);
+
+      std::vector<double> features(yisigraph_t& yg) {
+         std::vector<double> result;
+         //double flat =  yg.get_sentsim();
+         //result.push_back(flat);
+         //result.push_back(score(yg));
+         std::vector<double> precision = features(yg, yisi::HYP_MODE);
+         std::vector<double> recall = features(yg, yisi::REF_MODE);
+         for (auto it = precision.begin(); it != precision.end(); it++) {
+            result.push_back(*it);
+         }
+         for (auto it = recall.begin(); it != recall.end(); it++) {
+            result.push_back(*it);
+         }
+         return result;
       }
-    }
-    
-    void compute_features(yisigraph_t yg, std::vector<double> feats,
-			  double& structure, double& flat, int mode, int refid = -1) {
-      flat = yg.get_sentsim(mode, refid);
-      
-      double tfw = 0.0; // total frame weight
-      //std::vector<double> tsim(weight_m.size(), 0.0); // total similarity by role type
-      //std::vector<double> tcount(weight_m.size(), 0.0); // total count by role type
-      double nom = 0.0;
-      double denom = 0.0;
-      
-      auto preds = yg.get_preds(mode, refid);
-      
-      for (auto it = preds.begin(); it != preds.end(); it++) {
-	std::vector<double> sim(weight_m.size(), 0.0);
-	std::vector<double> count(weight_m.size(), 0.0);
-	auto predid = *it;
-	double sanity_check = yg.get_rolespanlength(predid, mode, refid);
-	double predsim = yg.get_alignsim(predid, mode, refid);
-	auto predlabel = yg.get_rolelabel(predid, mode, refid);
-	double predweight = get_roleweight(yg, predid, mode, refid);
-	
-	if (sanity_check > 0) {
-	  //if (prfunc_name_m=="f" || prfunc_name_m=="max"){
-	  double fw = yg.get_rolespanlength(predid, mode, refid);
-	  double fn = 0.0;
-	  
-	  sim[label_m[predlabel]] += predsim;
-	  fn = predweight * predsim;
-	  
-	  double fd = predweight;
-	  count[label_m[predlabel]] += 1.0;
-	  
-	  auto args = yg.get_args(predid, mode, refid);
-	  for (auto jt = args.begin(); jt != args.end(); jt++) {
-	    auto argid = *jt;
-	    fw += yg.get_rolespanlength(argid, mode, refid);
-	    
-	    auto arglabel = yg.get_rolelabel(argid, mode, refid);
-	    double argsim = 0.0;
-	    yisigraph_t::label_type alignlabel;
-	    if (mode == yisi::HYP_MODE) {
-	      auto alignment = yg.get_hypalignment(argid);
-	      for (auto it = alignment.begin(); it != alignment.end(); it++) {
-		double s = (it->second).second;
-		int id = it->first;
-		yisigraph_t::label_type l;
-		if (id < (int)yg.get_refsize()) {
-		  l = yg.get_rolelabel((it->second).first, yisi::REF_MODE, id);
-		} else {
-		  l = yg.get_rolelabel((it->second).first, yisi::INP_MODE);
-		}
-		if (s > argsim && match(arglabel, l)) {
-		  argsim = s;
-		  alignlabel = l;
-		}
-	      }
-	    } else {
-	      alignlabel = yg.get_alignlabel(argid, mode, refid);
-	      argsim = yg.get_alignsim(argid, mode, refid);
-	    }
-	    
-	    double argweight = get_roleweight(yg, argid, mode, refid);
-	    
-	    sim[label_m[arglabel]] += argsim;
-	    fn += argweight * argsim;
-	    
-	    count[label_m[arglabel]] += 1.0;
-	    fd += argweight;
-	  }
-	  
-	  if (fn > 0 && fd > 0) {
-	    if (frameweight_name_m == "coverage") {
-	      nom += fw * (fn / fd);
-	    } else {
-	      nom += fn / fd;
-	    }
-	  }
-	  if (frameweight_name_m == "coverage") {
-	    denom += fw;
-	  } else {
-	    denom += 1;
-	  }
-	  
-	  for (size_t i = 0; i < feats.size(); i++) {
-	    if (count[i] > 0) {
-	      feats[i] += fw * (sim[i] / count[i]);
-	    }
-	  }
-	  tfw += fw;
-	}
+
+   private:
+      double score(yisigraph_t yg, int mode) {
+         //std::cerr <<"Scoring...";
+         auto f = features(yg, mode);
+         double structure = f[weight_m.size()];
+         double flat = f[weight_m.size() + 1];
+         //std::cerr <<"(" << beta_m <<"," <<structure <<"," <<flat <<")";
+         //std::cerr <<"Done."<<std::endl;
+         return beta_m * structure + (1 - beta_m) * flat;
+
+         //double nom = 0.0;
+         //double denom = 0.0;
+         //if (yg.get_sentlength(mode) == 0.0) {
+         //   return 0.0;
+         //}
+         //if (mode_m == "yisi_flat") {
+         //   if (frameweight_name_m == "coverage") {
+         //      nom += yg.get_sentlength(mode) * yg.get_sentsim();
+         //      denom += yg.get_sentlength(mode);
+         //   } else {
+         //      nom += yg.get_sentsim();
+         //      denom += 1;
+         //   }
+         //}
+
+         //auto preds = yg.get_preds(mode);
+
+         //for (auto it = preds.begin(); it != preds.end(); it++) {
+         //   auto predid = *it;
+         //   double sanity_check = yg.get_rolespanlength(predid, mode);
+         //   double predsim = yg.get_alignsim(predid, mode);
+         //   auto predlabel = yg.get_rolelabel(predid, mode);
+         //   double predweight = get_roleweight(yg, predid, mode);
+
+         //   if (sanity_check > 0) {
+         //      // if (prfunc_name_m=="f" || prfunc_name_m=="max"){
+         //      double fw = yg.get_rolespanlength(predid, mode);
+         //      double fn = 0.0;
+         //      if (predsim >= rolesim_threshold_m) {
+         //         fn = predweight * predsim;
+         //      }
+         //      double fd = predweight;
+         //      auto args = yg.get_args(predid, mode);
+         //      for (auto jt = args.begin(); jt != args.end(); jt++) {
+         //         auto argid = *jt;
+         //         fw += yg.get_rolespanlength(argid, mode);
+
+         //         auto arglabel = yg.get_rolelabel(argid, mode);
+         //         auto alignlabel = yg.get_alignlabel(argid, mode);
+         //         double argsim = yg.get_alignsim(argid, mode);
+         //         double argweight = get_roleweight(yg, argid, mode);
+         //         if (argsim >= rolesim_threshold_m
+         //            && match(arglabel, alignlabel)) {
+         //            fn += argweight * argsim;
+         //         }
+         //         fd += argweight;
+         //      }
+         //      if (fn > 0 && fd > 0) {
+         //         if (frameweight_name_m == "coverage") {
+         //            nom += fw * (fn / fd);
+         //         } else {
+         //            nom += fn / fd;
+         //         }
+         //      }
+         //      if (frameweight_name_m == "coverage") {
+         //         denom += fw;
+         //      } else {
+         //         denom += 1;
+         //      }
+         //   } else {
+         //      if (predsim >= rolesim_threshold_m) {
+         //         nom = predweight * predsim;
+         //      }
+         //      denom += predweight;
+         //      auto args = yg.get_args(predid, mode);
+         //      for (auto jt = args.begin(); jt != args.end(); jt++) {
+         //         auto argid = *jt;
+         //         auto arglabel = yg.get_rolelabel(argid, mode);
+         //         auto alignlabel = yg.get_alignlabel(argid, mode);
+         //         double argsim = yg.get_alignsim(argid, mode);
+         //         double argweight = get_roleweight(yg, argid, mode);
+         //         if (argsim >= rolesim_threshold_m
+         //            && match(arglabel, alignlabel)) {
+         //            nom += argweight * argsim;
+         //         }
+         //         denom += argweight;
+         //      }
+
+         //   }
+
+         //}
+         //}
+         //if (nom > 0 && denom > 0) {
+         //   return nom/denom;
+         //} else {
+         //   return 0.0;
+         //}
       }
-      if (tfw > 0) {
-	for (size_t i = 0; i < feats.size(); i++) {
-	  feats[i] /= tfw;
-	}
+
+      std::vector<double> features(yisigraph_t yg, int mode) {
+         if (mode == yisi::REF_MODE) {
+            return rfeatures(yg);
+         } else {
+            return pfeatures(yg);
+         }
       }
-      
-      //if (prfunc_name_m == "lexexp") {
-      //   for (size_t i = 0; i < tsim.size(); i++) {
-      //      if (tcount[i] > 0) {
-      //         result[i] = tsim[i] / tcount[i];
-      //      }
-      //   }
-      //}
-      if (nom > 0 && denom > 0) {
-	structure = nom / denom;
+
+      void compute_features(yisigraph_t yg, std::vector<double>& feats,
+         double& structure, double& flat, int mode, int refid = -1) {
+         flat = yg.get_sentsim(mode, refid);
+
+         double tfw = 0.0; // total frame weight
+         //std::vector<double> tsim(weight_m.size(), 0.0); // total similarity by role type
+         //std::vector<double> tcount(weight_m.size(), 0.0); // total count by role type
+         double nom = 0.0;
+         double denom = 0.0;
+
+         auto preds = yg.get_preds(mode, refid);
+
+         for (auto it = preds.begin(); it != preds.end(); it++) {
+            std::vector<double> sim(weight_m.size(), 0.0);
+            std::vector<double> count(weight_m.size(), 0.0);
+            auto predid = *it;
+            double sanity_check = yg.get_rolespanlength(predid, mode, refid);
+            double predsim = yg.get_alignsim(predid, mode, refid);
+            auto predlabel = yg.get_rolelabel(predid, mode, refid);
+            double predweight = get_roleweight(yg, predid, mode, refid);
+
+            if (sanity_check > 0) {
+               //if (prfunc_name_m=="f" || prfunc_name_m=="max"){
+               double fw = yg.get_rolespanlength(predid, mode, refid);
+               double fn = 0.0;
+
+               sim[label_m[predlabel]] += predsim;
+               fn = predweight * predsim;
+
+               double fd = predweight;
+               count[label_m[predlabel]] += 1.0;
+
+               auto args = yg.get_args(predid, mode, refid);
+               for (auto jt = args.begin(); jt != args.end(); jt++) {
+                  auto argid = *jt;
+                  fw += yg.get_rolespanlength(argid, mode, refid);
+
+                  auto arglabel = yg.get_rolelabel(argid, mode, refid);
+                  double argsim = 0.0;
+                  yisigraph_t::label_type alignlabel;
+                  if (mode == yisi::HYP_MODE) {
+                     auto alignment = yg.get_hypalignment(argid);
+                     for (auto it = alignment.begin(); it != alignment.end(); it++) {
+                        double s = (it->second).second;
+                        int id = it->first;
+                        yisigraph_t::label_type l;
+                        if (id < (int)yg.get_refsize()) {
+                           l = yg.get_rolelabel((it->second).first, yisi::REF_MODE, id);
+                        } else {
+                           l = yg.get_rolelabel((it->second).first, yisi::INP_MODE);
+                        }
+                        if (s > argsim && match(arglabel, l)) {
+                           argsim = s;
+                           alignlabel = l;
+                        }
+                     }
+                  } else {
+                     alignlabel = yg.get_alignlabel(argid, mode, refid);
+                     argsim = yg.get_alignsim(argid, mode, refid);
+                  }
+
+                  double argweight = get_roleweight(yg, argid, mode, refid);
+
+                  sim[label_m[arglabel]] += argsim;
+                  fn += argweight * argsim;
+
+                  count[label_m[arglabel]] += 1.0;
+                  fd += argweight;
+               }
+
+               if (fn > 0 && fd > 0) {
+                  if (frameweight_name_m == "coverage") {
+                     nom += fw * (fn / fd);
+                  } else {
+                     nom += fn / fd;
+                  }
+               }
+               if (frameweight_name_m == "coverage") {
+                  denom += fw;
+               } else {
+                  denom += 1;
+               }
+
+               for (size_t i = 0; i < feats.size(); i++) {
+                  if (count[i] > 0) {
+                     feats[i] += fw * (sim[i] / count[i]);
+                  }
+               }
+               tfw += fw;
+            }
+         }
+         if (tfw > 0) {
+            for (size_t i = 0; i < feats.size(); i++) {
+               feats[i] /= tfw;
+            }
+         }
+
+         //if (prfunc_name_m == "lexexp") {
+         //   for (size_t i = 0; i < tsim.size(); i++) {
+         //      if (tcount[i] > 0) {
+         //         result[i] = tsim[i] / tcount[i];
+         //      }
+         //   }
+         //}
+         if (nom > 0 && denom > 0) {
+            structure = nom / denom;
+         }
       }
-    }
-    
-    std::vector<double> pfeatures(yisigraph_t yg) {
-      std::vector<double> result(weight_m.size(), 0.0);
-      double structure = 0.0;
-      double flat = 0.0;
-      
-      compute_features(yg, result, structure, flat, yisi::HYP_MODE);
-      
-      result.push_back(structure);
-      result.push_back(flat);
-      return result;
-    }
-    
-    std::vector<double> rfeatures(yisigraph_t yg) {
-      std::vector<double> result(weight_m.size(), 0.0);
-      double mflat = 0.0;
-      double mstructure = 0.0;
-      
-      //for all reference
-      for (size_t i = 0; i < yg.get_refsize(); i++) {
-	std::vector<double> feats(weight_m.size(), 0.0);
-	double structure = 0.0;
-	double flat = 0.0;
-	//std::cerr << "Computing recall features for reference #" << i << " ... ";
-	compute_features(yg, feats, structure, flat, yisi::REF_MODE, i);
-	//std::cerr << "Done." << std::endl;
-	if (structure > mstructure) {
-	  mstructure = structure;
-	  result = feats;
-	}
-	if (flat > mflat) {
-	  mflat = flat;
-	}
+
+      std::vector<double> pfeatures(yisigraph_t yg) {
+         std::vector<double> result(weight_m.size(), 0.0);
+         double structure = 0.0;
+         double flat = 0.0;
+
+         compute_features(yg, result, structure, flat, yisi::HYP_MODE);
+
+         result.push_back(structure);
+         result.push_back(flat);
+         return result;
       }
-      
-      //input
-      if (yg.withinp()) {
-	std::vector<double> feats(weight_m.size(), 0.0);
-	double structure = 0.0;
-	double flat = 0.0;
-	//std::cerr << "Computing recall features for input ... ";
-	compute_features(yg, feats, structure, flat, yisi::INP_MODE);
-	//std::cerr << "Done." << std::endl;
-	if (structure > mstructure) {
-	  mstructure = structure;
-	  result = feats;
-	}
-	if (flat > mflat) {
-	  mflat = flat;
-	}
+
+      std::vector<double> rfeatures(yisigraph_t yg) {
+         std::vector<double> result(weight_m.size(), 0.0);
+         double mflat = 0.0;
+         double mstructure = 0.0;
+
+         //for all references
+         for (size_t i = 0; i < yg.get_refsize(); i++) {
+            std::vector<double> feats(weight_m.size(), 0.0);
+            double structure = 0.0;
+            double flat = 0.0;
+            //std::cerr << "Computing recall features for reference #" << i << " ... ";
+            compute_features(yg, feats, structure, flat, yisi::REF_MODE, i);
+            //std::cerr << "Done." << std::endl;
+            if (structure > mstructure) {
+               mstructure = structure;
+               result = feats;
+            }
+            if (flat > mflat) {
+               mflat = flat;
+            }
+         }
+
+         //input
+         if (yg.withinp()) {
+            std::vector<double> feats(weight_m.size(), 0.0);
+            double structure = 0.0;
+            double flat = 0.0;
+            //std::cerr << "Computing recall features for input ... ";
+            compute_features(yg, feats, structure, flat, yisi::INP_MODE);
+            //std::cerr << "Done." << std::endl;
+            if (structure > mstructure) {
+               mstructure = structure;
+               result = feats;
+            }
+            if (flat > mflat) {
+               mflat = flat;
+            }
+         }
+
+         result.push_back(mstructure);
+         result.push_back(mflat);
+         return result;
       }
-      
-      result.push_back(mstructure);
-      result.push_back(mflat);
-      return result;
-    }
-    
-    bool match(std::string label1, std::string label2) {
-      if (label1 == "U" || label2 == "U") {
-	return false;
-      } else {
-	if (label_m.find(label1) == label_m.end()) {
-	  std::cerr << "ERROR: Unknown srl label '" << label1 << "' in YiSi for matching label 1. "
-		    << "Check your labelconfig. Exiting..." << std::endl;
-	  exit(1);
-	}
-	if (label_m.find(label2) == label_m.end()) {
-	  std::cerr << "ERROR: unknown srl label '" << label2 << "' in yisi for matching label 2. "
-		    << "Check your labelconfig. Exiting..." << std::endl;
-	  exit(1);
-	}
-	return (label_m[label1] == label_m[label2]);
+
+      bool match(std::string label1, std::string label2) {
+         if (label1 == "U" || label2 == "U") {
+            return false;
+         } else {
+            if (label_m.find(label1) == label_m.end()) {
+               std::cerr << "ERROR: Unknown srl label '" << label1 << "' in YiSi for matching label 1. "
+                  << "Check your labelconfig. Exiting..." << std::endl;
+               exit(1);
+            }
+            if (label_m.find(label2) == label_m.end()) {
+               std::cerr << "ERROR: Unknown srl label '" << label2 << "' in YiSi for matching label 2. "
+                  << "Check your labelconfig. Exiting..." << std::endl;
+               exit(1);
+            }
+            return (label_m[label1] == label_m[label2]);
+         }
       }
-    }
-    
-    double get_roleweight(yisigraph_t yg, size_t roleid, int mode, int refid = -1) {
-      if (weightconfig_path_m == "lexweight") {
-	auto fillers = yg.get_role_fillers(roleid, mode, refid);
-	return phrasesim_p->get_lexweight(fillers, mode);
-      } else {
-	std::string label = yg.get_rolelabel(roleid, mode, refid);
-	if (label_m.find(label) == label_m.end()) {
-	  std::cerr << "ERROR: Unknown srl label '" << label << "' in yisi for get_weight. "
-		    << "Check your labelconfig. Exiting..." << std::endl;
-	  exit(1);
-	}
-	return weight_m[label_m[label]];
+
+      double get_roleweight(yisigraph_t yg, size_t roleid, int mode, int refid = -1) {
+         if (weightconfig_path_m == "lexweight") {
+            auto fillers = yg.get_role_filler_units(roleid, mode, refid);
+            return phrasesim_p->get_lexweight(fillers, mode);
+         } else {
+            std::string label = yg.get_rolelabel(roleid, mode, refid);
+            if (label_m.find(label) == label_m.end()) {
+               std::cerr << "ERROR: Unknown srl label '" << label << "' in YiSi for get_roleweight. "
+                  << "Check your labelconfig. Exiting..." << std::endl;
+               exit(1);
+            }
+            return weight_m[label_m[label]];
+         }
       }
-    }
-    
-    phrasesim_t<opt_T>* phrasesim_p;
-    srl_t* inpsrl_p;
-    srl_t* refsrl_p;
-    srl_t* hypsrl_p;
-    
-    std::string hypsrl_name_m;
-    std::string refsrl_name_m;
-    std::string inpsrl_name_m;
-    std::string weightconfig_path_m;
-    //std::string predweight_name_m;
-    std::string frameweight_name_m;
-    //std::string prfunc_name_m;
-    
-    std::map<std::string, int> label_m;
-    std::vector<double> weight_m;
-    double alpha_m;
-    double beta_m;
-  }; // class yisiscorer_t
+
+      phrasesim_t<opt_T>* phrasesim_p;
+      srl_t* inpsrl_p;
+      srl_t* refsrl_p;
+      srl_t* hypsrl_p;
+
+      std::string hypsrl_name_m;
+      std::string refsrl_name_m;
+      std::string inpsrl_name_m;
+      std::string weightconfig_path_m;
+      //std::string predweight_name_m;
+      std::string frameweight_name_m;
+      //std::string prfunc_name_m;
+
+      std::map<std::string, int> label_m;
+      std::vector<double> weight_m;
+      double alpha_m;
+      double beta_m;
+   }; // class yisiscorer_t
   
 } // yisi
 
diff --git a/src/yisiscorer_test.cpp b/src/yisiscorer_test.cpp
index 5fe033f..8f81b7c 100644
--- a/src/yisiscorer_test.cpp
+++ b/src/yisiscorer_test.cpp
@@ -37,8 +37,8 @@ int main(const int argc, const char* argv[])
 
    string reffile("test_ref.en");
    string hypfile("test_hyp.en");
-   vector<string> refsents = read_file(reffile);
-   vector<string> hypsents = read_file(hypfile);
+   vector<sent_t*> refsents = read_sent("word", reffile);
+   vector<sent_t*> hypsents = read_sent("word", hypfile);
 
    auto r1 = yisi.refsrlparse(refsents);
    auto r2 = yisi.hypsrlparse(hypsents);
@@ -51,4 +51,12 @@ int main(const int argc, const char* argv[])
 
       cout << "YiSi score is:" << yisi.score(m) << endl;
    }
+   for (auto it = refsents.begin(); it != refsents.end(); it++) {
+      delete *it;
+      *it = NULL;
+   }
+   for (auto it = hypsents.begin(); it != hypsents.end(); it++) {
+      delete *it;
+      *it = NULL;
+   }
 }
diff --git a/test/ref/srlgraph_test.out b/test/ref/srlgraph_test.out
index 29f7afe..3961db3 100644
--- a/test/ref/srlgraph_test.out
+++ b/test/ref/srlgraph_test.out
@@ -23,7 +23,7 @@ One thing is certain : these new provisions will have a [AM-MNR negative] [TARGE
 One thing is certain : these new provisions will have a negative impact on [A1 voter] turn - [TARGET out] . 
 [AM-ADV In this sense] , [A0 the measures] [AM-MOD will] [AM-MNR partially] [TARGET undermine] [A1 the American democratic system] . 
 In this sense , the measures will partially undermine the American [A1 democratic] [TARGET system] . 
-Unlike in Canada , the American States are responsible for the organization of [A2 federal] [TARGET elections] [AM-LOC in the United States] . 
+Unlike in Canada , the American States are responsible for the organisation of [A2 federal] [TARGET elections] [AM-LOC in the United States] . 
 It is in this spirit that a [TARGET majority] [A1 of American governments] have passed new laws since 2009 making the registration or voting process more difficult . 
 It is in this spirit that a majority of [A2 American] [TARGET governments] have passed new laws since 2009 making the registration or voting process more difficult . 
 It is in this spirit that [A0 a majority of American governments] have [TARGET passed] [A1 new laws] [AM-TMP since 2009] making the registration or voting process more difficult . 
diff --git a/test/ref/srlutil_test.out b/test/ref/srlutil_test.out
index e3b4bd9..11d8b96 100644
--- a/test/ref/srlutil_test.out
+++ b/test/ref/srlutil_test.out
@@ -1,7 +1,7 @@
 A [A0 Republican] [V strategy] [A1 to counter the re - election of Obama] 
 A Republican strategy to [V counter] [A1 the re - election of Obama] 
 A Republican strategy to counter the re - [V election] [A1 of Obama] 
-[A0 [A2 Republican] [V leaders]] justified their policy by the need to combat electoral fraud . 
+[A2 Republican] [V leaders] justified their policy by the need to combat electoral fraud . 
 [A0 Republican leaders] [V justified] [A1 their policy] [A2 by the need to combat electoral fraud] . 
 Republican leaders justified [A0 their] [V policy] by the need to combat electoral fraud . 
 Republican leaders justified their policy by the [V need] [A1 to combat electoral fraud] . 
@@ -12,32 +12,32 @@ However , [A0 the Brennan Centre] considers this a myth , [V stating] [A1 that e
 However , the Brennan Centre considers this a myth , stating that [A1 electoral] [V fraud] is rarer in the United States than the number of people killed by lightning . 
 However , the Brennan Centre considers this a myth , stating that electoral fraud is rarer in the United States than the [V number] [A1 of people killed by lightning] . 
 However , the Brennan Centre considers this a myth , stating that electoral fraud is rarer in the United States than the number of [A1 people] [V killed] [A0 by lightning] . 
-Indeed , [A0 [A2 Republican] [V lawyers]] identified only 300 cases of electoral fraud in the United States in a decade . 
+Indeed , [A2 Republican] [V lawyers] identified only 300 cases of electoral fraud in the United States in a decade . 
 [AM-DIS Indeed] , [A0 Republican lawyers] [V identified] [A1 only 300 cases of electoral fraud in the United States] [AM-TMP in a decade] . 
 Indeed , Republican lawyers identified only 300 [V cases] [A1 of electoral fraud] in the United States in a decade . 
 Indeed , Republican lawyers identified only 300 cases of [A1 electoral] [V fraud] in the United States in a decade . 
 One thing is certain : [A0 these new provisions] [AM-MOD will] [V have] [A1 a negative impact on voter turn - out] . 
 One thing is certain : these new provisions will have a [AM-MNR negative] [V impact] [A1 on voter turn - out] . 
 One thing is certain : these new provisions will have a negative impact on [A1 voter] turn - [V out] . 
-[AM-ADV In this sense] , [A0 the measures] [AM-MOD will] [AM-MNR partially] [V undermine] [A1 the American democratic system] . 
+[AM-ADV In this sense] [AM-MOD ,] [A0 the measures] [AM-MOD will] [AM-MNR partially] [V undermine] [A1 the American democratic system] [AM-MOD .] 
 In this sense , the measures will partially undermine the American [A1 democratic] [V system] . 
 Unlike in Canada , the American States are responsible for the organization of [A2 federal] [V elections] [AM-LOC in the United States] . 
 It is in this spirit that a [V majority] [A1 of American governments] have passed new laws since 2009 making the registration or voting process more difficult . 
-It is in this spirit that a majority of [A0 [A2 American] [V governments]] have passed new laws since 2009 making the registration or voting process more difficult . 
+It is in this spirit that a majority of [A2 American] [V governments] have passed new laws since 2009 making the registration or voting process more difficult . 
 It is in this spirit that [A0 a majority of American governments] have [V passed] [A1 new laws] [AM-TMP since 2009] making the registration or voting process more difficult . 
-It is in this spirit that [A0 a majority of American governments] have passed [A1 new [V laws]] since 2009 making the registration or voting process more difficult . 
+It is in this spirit that [A0 a majority of American governments] have passed new [V laws] since 2009 making the registration or voting process more difficult . 
 It is in this spirit that [A0 a majority of American governments] have passed new laws since 2009 [V making] [A1 the registration or voting process] [A2 more difficult] . 
 It is in this spirit that a majority of American governments have passed new laws since 2009 making the [V registration] or voting process more difficult . 
 It is in this spirit that a majority of American governments have passed new laws since 2009 making the registration or [V voting] process more difficult . 
 It is in this spirit that a majority of American governments have passed new laws since 2009 making the registration or [A1 voting] [V process] more difficult . 
 [A1 This phenomenon] [V gained] [A2 momentum] [AM-TMP following the November 2010 elections , which saw 675 new Republican representatives added in 26 States] . 
-[A1 This phenomenon] gained [A2 [V momentum]] following the November 2010 elections , which saw 675 new Republican representatives added in 26 States . 
+[A1 This phenomenon] gained [V momentum] following the November 2010 elections , which saw 675 new Republican representatives added in 26 States . 
 This phenomenon gained momentum [V following] [A2 the November 2010 elections , which saw 675 new Republican representatives added in 26 States] . 
-This phenomenon gained momentum following the November 2010 [A0 elections] , [R-A0 which] [V saw] [A1 675 new Republican representatives] [C-A1 added in 26 States] . 
-This phenomenon gained momentum following the November 2010 elections , which saw [A0 675 new [A4 Republican] [V representatives]] added in 26 States . 
+This phenomenon gained momentum following [A0 the November 2010 elections ,] [R-A0 which] [V saw] [A1 675 new Republican representatives] [C-A1 added in 26 States] . 
+This phenomenon gained momentum following the November 2010 elections , which saw 675 new [A4 Republican] [V representatives] added in 26 States . 
 This phenomenon gained momentum following the November 2010 elections , which saw [A1 675 new Republican representatives] [V added] [AM-LOC in 26 States] . 
-[A2 As a [V result] , 180 bills restricting the exercise of the right to vote in 41 States were introduced in 2011 alone .] 
-As a result , 180 [A0 bills] [V restricting] [A1 the exercise of the right to vote in 41 States] were introduced in 2011 alone . 
+[A2 As] a [V result] [A2 , 180 bills restricting the exercise of the right to vote in 41 States were introduced in 2011 alone .] 
+As a result , [A0 180 bills] [V restricting] [A1 the exercise of the right to vote in 41 States] were introduced in 2011 alone . 
 As a result , 180 bills restricting the [V exercise] [A1 of the right to vote in 41 States] were introduced in 2011 alone . 
 As a result , 180 bills restricting the exercise of the [V right] [A1 to vote in 41 States] were introduced in 2011 alone . 
 As a result , 180 bills restricting the exercise of the right to [V vote] [AM-LOC in 41 States] were introduced in 2011 alone . 
diff --git a/test/ref/test_hyp.docyisi0 b/test/ref/test_hyp.docyisi0
index 86b415f..9f47e2b 100644
--- a/test/ref/test_hyp.docyisi0
+++ b/test/ref/test_hyp.docyisi0
@@ -1 +1 @@
-0.645506
+0.693223
diff --git a/test/ref/test_hyp.docyisi1_srl b/test/ref/test_hyp.docyisi1_srl
index fc46d40..cd3cc4c 100644
--- a/test/ref/test_hyp.docyisi1_srl
+++ b/test/ref/test_hyp.docyisi1_srl
@@ -1 +1 @@
-0.639611
+0.637235
diff --git a/test/ref/test_hyp.docyisi1_srl.alt b/test/ref/test_hyp.docyisi1_srl.alt
index a9768df..e39b5b3 100644
--- a/test/ref/test_hyp.docyisi1_srl.alt
+++ b/test/ref/test_hyp.docyisi1_srl.alt
@@ -1 +1 @@
-0.639393
+0.636885
diff --git a/test/ref/test_hyp.docyisi2_srl b/test/ref/test_hyp.docyisi2_srl
index 1ffef82..1de9edb 100644
--- a/test/ref/test_hyp.docyisi2_srl
+++ b/test/ref/test_hyp.docyisi2_srl
@@ -1 +1 @@
-0.0652749
+0.0660683
diff --git a/test/ref/test_hyp.docyisi2_srl.alt b/test/ref/test_hyp.docyisi2_srl.alt
index 462088b..55a7de9 100644
--- a/test/ref/test_hyp.docyisi2_srl.alt
+++ b/test/ref/test_hyp.docyisi2_srl.alt
@@ -1 +1 @@
-0.0641709
+0.0672199
diff --git a/test/ref/test_hyp.sntyisi0 b/test/ref/test_hyp.sntyisi0
index b42e130..df0cbe0 100644
--- a/test/ref/test_hyp.sntyisi0
+++ b/test/ref/test_hyp.sntyisi0
@@ -1,10 +1,10 @@
-0.738498
-0.719384
-0.689899
-0.643572
-0.499597
-0.627008
-0.596041
-0.554946
-0.583918
-0.802202
+0.894586
+0.733148
+0.753002
+0.655633
+0.57693
+0.672231
+0.614407
+0.58164
+0.595404
+0.855247
diff --git a/test/ref/test_hyp.sntyisi1_srl b/test/ref/test_hyp.sntyisi1_srl
index 96cda7d..af29c7f 100644
--- a/test/ref/test_hyp.sntyisi1_srl
+++ b/test/ref/test_hyp.sntyisi1_srl
@@ -1,10 +1,10 @@
-0.859564
-0.691584
-0.645726
-0.632753
-0.459998
-0.592549
-0.556889
-0.546071
-0.546333
-0.864648
+0.858821
+0.695714
+0.645749
+0.633018
+0.458332
+0.577509
+0.557156
+0.534926
+0.543717
+0.86741
diff --git a/test/ref/test_hyp.sntyisi1_srl.alt b/test/ref/test_hyp.sntyisi1_srl.alt
index f9eeee9..f2409b4 100644
--- a/test/ref/test_hyp.sntyisi1_srl.alt
+++ b/test/ref/test_hyp.sntyisi1_srl.alt
@@ -1,10 +1,10 @@
-0.859824
-0.691795
-0.645973
-0.633111
-0.455921
-0.59255
-0.557174
-0.54644
-0.546505
-0.864636
+0.858821
+0.695714
+0.645749
+0.633018
+0.454832
+0.577509
+0.557156
+0.534926
+0.543717
+0.86741
diff --git a/test/ref/test_hyp.sntyisi2_srl b/test/ref/test_hyp.sntyisi2_srl
index ecaa858..121144b 100644
--- a/test/ref/test_hyp.sntyisi2_srl
+++ b/test/ref/test_hyp.sntyisi2_srl
@@ -1,10 +1,10 @@
-0.0464296
-0.0116361
-0.0696774
+0.0352922
+0.0116406
+0.07006
 0.0665215
-0.0274319
-0.0927175
-0.00336682
+0.0273455
+0.0937853
+0.0033759
 0.0519643
-0.141262
-0.141742
+0.141268
+0.159431
diff --git a/test/ref/test_hyp.sntyisi2_srl.alt b/test/ref/test_hyp.sntyisi2_srl.alt
index ccab905..c2b6a05 100644
--- a/test/ref/test_hyp.sntyisi2_srl.alt
+++ b/test/ref/test_hyp.sntyisi2_srl.alt
@@ -1,10 +1,10 @@
-0.0354018
-0.0116361
-0.0696774
+0.0468075
+0.0116406
+0.07006
 0.0665215
-0.0274319
-0.0927175
-0.00336682
+0.0273455
+0.0937853
+0.0033759
 0.0519643
-0.141262
-0.14173
+0.141268
+0.159431
diff --git a/test/ref/test_ref.en.srl b/test/ref/test_ref.en.srl
index cf5fe83..fb81dd8 100644
--- a/test/ref/test_ref.en.srl
+++ b/test/ref/test_ref.en.srl
@@ -1,7 +1,7 @@
 0: A [A0 Republican] [V strategy] [A1 to counter the re - election of Obama] 
 0: A Republican strategy to [V counter] [A1 the re - election of Obama] 
 0: A Republican strategy to counter the re - [V election] [A1 of Obama] 
-1: [A0 [A2 Republican] [V leaders]] justified their policy by the need to combat electoral fraud . 
+1: [A2 Republican] [V leaders] justified their policy by the need to combat electoral fraud . 
 1: [A0 Republican leaders] [V justified] [A1 their policy] [A2 by the need to combat electoral fraud] . 
 1: Republican leaders justified [A0 their] [V policy] by the need to combat electoral fraud . 
 1: Republican leaders justified their policy by the [V need] [A1 to combat electoral fraud] . 
@@ -12,33 +12,33 @@
 2: However , the Brennan Centre considers this a myth , stating that [A1 electoral] [V fraud] is rarer in the United States than the number of people killed by lightning . 
 2: However , the Brennan Centre considers this a myth , stating that electoral fraud is rarer in the United States than the [V number] [A1 of people killed by lightning] . 
 2: However , the Brennan Centre considers this a myth , stating that electoral fraud is rarer in the United States than the number of [A1 people] [V killed] [A0 by lightning] . 
-3: Indeed , [A0 [A2 Republican] [V lawyers]] identified only 300 cases of electoral fraud in the United States in a decade . 
+3: Indeed , [A2 Republican] [V lawyers] identified only 300 cases of electoral fraud in the United States in a decade . 
 3: [AM-DIS Indeed] , [A0 Republican lawyers] [V identified] [A1 only 300 cases of electoral fraud in the United States] [AM-TMP in a decade] . 
 3: Indeed , Republican lawyers identified only 300 [V cases] [A1 of electoral fraud] in the United States in a decade . 
 3: Indeed , Republican lawyers identified only 300 cases of [A1 electoral] [V fraud] in the United States in a decade . 
 4: One thing is certain : [A0 these new provisions] [AM-MOD will] [V have] [A1 a negative impact on voter turn - out] . 
 4: One thing is certain : these new provisions will have a [AM-MNR negative] [V impact] [A1 on voter turn - out] . 
 4: One thing is certain : these new provisions will have a negative impact on [A1 voter] turn - [V out] . 
-5: [AM-ADV In this sense] , [A0 the measures] [AM-MOD will] [AM-MNR partially] [V undermine] [A1 the American democratic system] . 
+5: [AM-ADV In this sense] [AM-MOD ,] [A0 the measures] [AM-MOD will] [AM-MNR partially] [V undermine] [A1 the American democratic system] [AM-MOD .] 
 5: In this sense , the measures will partially undermine the American [A1 democratic] [V system] . 
 6: Unlike in Canada , the American States are responsible for the [V organisation] [A1 of federal elections in the United States] . 
 6: Unlike in Canada , the American States are responsible for the organisation of [A2 federal] [V elections] [AM-LOC in the United States] . 
 7: It is in this spirit that a [V majority] [A1 of American governments] have passed new laws since 2009 making the registration or voting process more difficult . 
-7: It is in this spirit that a majority of [A0 [A2 American] [V governments]] have passed new laws since 2009 making the registration or voting process more difficult . 
+7: It is in this spirit that a majority of [A2 American] [V governments] have passed new laws since 2009 making the registration or voting process more difficult . 
 7: It is in this spirit that [A0 a majority of American governments] have [V passed] [A1 new laws] [AM-TMP since 2009] making the registration or voting process more difficult . 
-7: It is in this spirit that [A0 a majority of American governments] have passed [A1 new [V laws]] since 2009 making the registration or voting process more difficult . 
+7: It is in this spirit that [A0 a majority of American governments] have passed new [V laws] since 2009 making the registration or voting process more difficult . 
 7: It is in this spirit that [A0 a majority of American governments] have passed new laws since 2009 [V making] [A1 the registration or voting process] [A2 more difficult] . 
 7: It is in this spirit that a majority of American governments have passed new laws since 2009 making the [V registration] or voting process more difficult . 
 7: It is in this spirit that a majority of American governments have passed new laws since 2009 making the registration or [V voting] process more difficult . 
 7: It is in this spirit that a majority of American governments have passed new laws since 2009 making the registration or [A1 voting] [V process] more difficult . 
 8: [A1 This phenomenon] [V gained] [A2 momentum] [AM-TMP following the November 2010 elections , which saw 675 new Republican representatives added in 26 States] . 
-8: [A1 This phenomenon] gained [A2 [V momentum]] following the November 2010 elections , which saw 675 new Republican representatives added in 26 States . 
+8: [A1 This phenomenon] gained [V momentum] following the November 2010 elections , which saw 675 new Republican representatives added in 26 States . 
 8: This phenomenon gained momentum [V following] [A2 the November 2010 elections , which saw 675 new Republican representatives added in 26 States] . 
-8: This phenomenon gained momentum following the November 2010 [A0 elections] , [R-A0 which] [V saw] [A1 675 new Republican representatives] [C-A1 added in 26 States] . 
-8: This phenomenon gained momentum following the November 2010 elections , which saw [A0 675 new [A4 Republican] [V representatives]] added in 26 States . 
+8: This phenomenon gained momentum following [A0 the November 2010 elections ,] [R-A0 which] [V saw] [A1 675 new Republican representatives] [C-A1 added in 26 States] . 
+8: This phenomenon gained momentum following the November 2010 elections , which saw 675 new [A4 Republican] [V representatives] added in 26 States . 
 8: This phenomenon gained momentum following the November 2010 elections , which saw [A1 675 new Republican representatives] [V added] [AM-LOC in 26 States] . 
-9: [A2 As a [V result] , 180 bills restricting the exercise of the right to vote in 41 States were introduced in 2011 alone .] 
-9: As a result , 180 [A0 bills] [V restricting] [A1 the exercise of the right to vote in 41 States] were introduced in 2011 alone . 
+9: [A2 As] a [V result] [A2 , 180 bills restricting the exercise of the right to vote in 41 States were introduced in 2011 alone .] 
+9: As a result , [A0 180 bills] [V restricting] [A1 the exercise of the right to vote in 41 States] were introduced in 2011 alone . 
 9: As a result , 180 bills restricting the [V exercise] [A1 of the right to vote in 41 States] were introduced in 2011 alone . 
 9: As a result , 180 bills restricting the exercise of the [V right] [A1 to vote in 41 States] were introduced in 2011 alone . 
 9: As a result , 180 bills restricting the exercise of the right to [V vote] [AM-LOC in 41 States] were introduced in 2011 alone . 
diff --git a/test/ref/test_ref.en.srl.alt b/test/ref/test_ref.en.srl.alt
index 445d76e..d7ad715 100644
--- a/test/ref/test_ref.en.srl.alt
+++ b/test/ref/test_ref.en.srl.alt
@@ -1,7 +1,7 @@
 0: A [A0 Republican] [V strategy] [A1 to counter the re - election of Obama] 
 0: A Republican strategy to [V counter] [A1 the re - election of Obama] 
 0: A Republican strategy to counter the re - [V election] [A1 of Obama] 
-1: [A0 [A2 Republican] [V leaders]] justified their policy by the need to combat electoral fraud . 
+1: [A2 Republican] [V leaders] justified their policy by the need to combat electoral fraud . 
 1: [A0 Republican leaders] [V justified] [A1 their policy] [A2 by the need to combat electoral fraud] . 
 1: Republican leaders justified [A0 their] [V policy] by the need to combat electoral fraud . 
 1: Republican leaders justified their policy by the [V need] [A1 to combat electoral fraud] . 
@@ -12,33 +12,33 @@
 2: However , the Brennan Centre considers this a myth , stating that [A1 electoral] [V fraud] is rarer in the United States than the number of people killed by lightning . 
 2: However , the Brennan Centre considers this a myth , stating that electoral fraud is rarer in the United States than the [V number] [A1 of people killed by lightning] . 
 2: However , the Brennan Centre considers this a myth , stating that electoral fraud is rarer in the United States than the number of [A1 people] [V killed] [A0 by lightning] . 
-3: Indeed , [A0 [A2 Republican] [V lawyers]] identified only 300 cases of electoral fraud in the United States in a decade . 
+3: Indeed , [A2 Republican] [V lawyers] identified only 300 cases of electoral fraud in the United States in a decade . 
 3: [AM-DIS Indeed] , [A0 Republican lawyers] [V identified] [A1 only 300 cases of electoral fraud in the United States] [AM-TMP in a decade] . 
 3: Indeed , Republican lawyers identified only 300 [V cases] [A1 of electoral fraud] in the United States in a decade . 
 3: Indeed , Republican lawyers identified only 300 cases of [A1 electoral] [V fraud] in the United States in a decade . 
 4: One thing is certain : [A0 these new provisions] [AM-MOD will] [V have] [A1 a negative impact on voter turn - out] . 
 4: One thing is certain : these new provisions will have a [AM-MNR negative] [V impact] [A1 on voter turn - out] . 
-4: One thing is certain : these new provisions will have a negative impact on [A1 voter] [A1 turn -] [V out] . 
-5: [AM-ADV In this sense] , [A0 the measures] [AM-MOD will] [AM-MNR partially] [V undermine] [A1 the American democratic system] . 
+4: One thing is certain : these new provisions will have a negative impact on [A1 voter turn -] [V out] . 
+5: [AM-ADV In this sense] [AM-MOD ,] [A0 the measures] [AM-MOD will] [AM-MNR partially] [V undermine] [A1 the American democratic system] [AM-MOD .] 
 5: In this sense , the measures will partially undermine the American [A1 democratic] [V system] . 
 6: Unlike in Canada , the American States are responsible for the [V organisation] [A1 of federal elections in the United States] . 
 6: Unlike in Canada , the American States are responsible for the organisation of [A2 federal] [V elections] [AM-LOC in the United States] . 
 7: It is in this spirit that a [V majority] [A1 of American governments] have passed new laws since 2009 making the registration or voting process more difficult . 
-7: It is in this spirit that a majority of [A0 [A2 American] [V governments]] have passed new laws since 2009 making the registration or voting process more difficult . 
+7: It is in this spirit that a majority of [A2 American] [V governments] have passed new laws since 2009 making the registration or voting process more difficult . 
 7: It is in this spirit that [A0 a majority of American governments] have [V passed] [A1 new laws] [AM-TMP since 2009] making the registration or voting process more difficult . 
-7: It is in this spirit that [A0 a majority of American governments] have passed [A1 new [V laws]] since 2009 making the registration or voting process more difficult . 
+7: It is in this spirit that [A0 a majority of American governments] have passed new [V laws] since 2009 making the registration or voting process more difficult . 
 7: It is in this spirit that [A0 a majority of American governments] have passed new laws since 2009 [V making] [A1 the registration or voting process] [A2 more difficult] . 
 7: It is in this spirit that a majority of American governments have passed new laws since 2009 making the [V registration] or voting process more difficult . 
 7: It is in this spirit that a majority of American governments have passed new laws since 2009 making the registration or [V voting] process more difficult . 
 7: It is in this spirit that a majority of American governments have passed new laws since 2009 making the registration or [A1 voting] [V process] more difficult . 
 8: [A1 This phenomenon] [V gained] [A2 momentum] [AM-TMP following the November 2010 elections , which saw 675 new Republican representatives added in 26 States] . 
-8: [A1 This phenomenon] gained [A2 [V momentum]] following the November 2010 elections , which saw 675 new Republican representatives added in 26 States . 
+8: [A1 This phenomenon] gained [V momentum] following the November 2010 elections , which saw 675 new Republican representatives added in 26 States . 
 8: This phenomenon gained momentum [V following] [A2 the November 2010 elections , which saw 675 new Republican representatives added in 26 States] . 
-8: This phenomenon gained momentum following the November 2010 [A0 elections] , [R-A0 which] [V saw] [A1 675 new Republican representatives] [C-A1 added in 26 States] . 
-8: This phenomenon gained momentum following the November 2010 elections , which saw [A0 675 new [A4 Republican] [V representatives]] added in 26 States . 
+8: This phenomenon gained momentum following [A0 the November 2010 elections ,] [R-A0 which] [V saw] [A1 675 new Republican representatives] [C-A1 added in 26 States] . 
+8: This phenomenon gained momentum following the November 2010 elections , which saw 675 new [A4 Republican] [V representatives] added in 26 States . 
 8: This phenomenon gained momentum following the November 2010 elections , which saw [A1 675 new Republican representatives] [V added] [AM-LOC in 26 States] . 
-9: [A2 As a [V result] , 180 bills restricting the exercise of the right to vote in 41 States were introduced in 2011 alone .] 
-9: As a result , 180 [A0 bills] [V restricting] [A1 the exercise of the right to vote in 41 States] were introduced in 2011 alone . 
+9: [A2 As] a [V result] [A2 , 180 bills restricting the exercise of the right to vote in 41 States were introduced in 2011 alone .] 
+9: As a result , [A0 180 bills] [V restricting] [A1 the exercise of the right to vote in 41 States] were introduced in 2011 alone . 
 9: As a result , 180 bills restricting the [V exercise] [A1 of the right to vote in 41 States] were introduced in 2011 alone . 
 9: As a result , 180 bills restricting the exercise of the [V right] [A1 to vote in 41 States] were introduced in 2011 alone . 
 9: As a result , 180 bills restricting the exercise of the right to [V vote] [AM-LOC in 41 States] were introduced in 2011 alone . 
diff --git a/test/ref/test_yisi_0.out b/test/ref/test_yisi_0.out
index f64674d..c3a8c65 100644
--- a/test/ref/test_yisi_0.out
+++ b/test/ref/test_yisi_0.out
@@ -1,7 +1,9 @@
 Constructing lcs lexsim model
 Learning lex weight from test_ref.en ... Done.
-Tokenizing/SRL-ing hyp ... Done.
-Tokenizing/SRL-ing ref ... Done.
+Reading hyp sents... Done.
+Reading ref sents... Done.
+Creating hyp srlgraphs... Done.
+Creating ref srlgraphs... Done.
 Evaluating line 1
 Evaluating line 2
 Evaluating line 3
diff --git a/test/ref/test_yisi_1.out b/test/ref/test_yisi_1.out
index 754096a..65e8327 100644
--- a/test/ref/test_yisi_1.out
+++ b/test/ref/test_yisi_1.out
@@ -2,8 +2,10 @@ Reading w2v text model from mini.d300.en
 Size of voc: 500 Dimension: 300
 Finished reading w2v model.
 Learning lex weight from test_ref.en ... Done.
-Tokenizing/SRL-ing hyp ... Done.
-Tokenizing/SRL-ing ref ... Done.
+Reading hyp sents... Done.
+Reading ref sents... Done.
+Creating hyp srlgraphs... Done.
+Creating ref srlgraphs... Done.
 Evaluating line 1
 Evaluating line 2
 Evaluating line 3
diff --git a/test/ref/test_yisi_1_srl.out b/test/ref/test_yisi_1_srl.out
index 90d4f47..796deca 100644
--- a/test/ref/test_yisi_1_srl.out
+++ b/test/ref/test_yisi_1_srl.out
@@ -29,8 +29,10 @@ Cluster       null
 Loading pipeline from /home/das011/u/sandboxes/mateplus/models/srl-EMNLP14+fs-eng.model
 Loading reranker from /home/das011/u/sandboxes/mateplus/models/srl-EMNLP14+fs-eng.model
 Done.
-Tokenizing/SRL-ing hyp ... Done.
-Tokenizing/SRL-ing ref ... Done.
+Reading hyp sents... Done.
+Reading ref sents... Done.
+Creating hyp srlgraphs... Done.
+Creating ref srlgraphs... Done.
 Evaluating line 1
 Evaluating line 2
 Evaluating line 3
diff --git a/test/ref/test_yisi_2.out b/test/ref/test_yisi_2.out
index e56b9ca..c648453 100644
--- a/test/ref/test_yisi_2.out
+++ b/test/ref/test_yisi_2.out
@@ -6,8 +6,10 @@ Size of voc: 500 Dimension: 300
 Finished reading w2v model.
 Learning lex weight from test_hyp.en ... Done.
 Learning lex weight from test_inp.de ... Done.
-Tokenizing/SRL-ing hyp ... Done.
-Tokenizing/SRL-ing inp ... Done.
+Reading hyp sents... Done.
+Reading inp sents... Done.
+Creating hyp srlgraphs... Done.
+Creating inp srlgraphs... Done.
 Evaluating line 1
 Evaluating line 2
 Evaluating line 3
diff --git a/test/ref/test_yisi_2_srl.out b/test/ref/test_yisi_2_srl.out
index c45f324..6913024 100644
--- a/test/ref/test_yisi_2_srl.out
+++ b/test/ref/test_yisi_2_srl.out
@@ -57,8 +57,10 @@ Cluster       null
 Loading pipeline from /home/das011/u/sandboxes/mateplus/models/srl-EMNLP14+fs-ger.model
 Loading reranker from /home/das011/u/sandboxes/mateplus/models/srl-EMNLP14+fs-ger.model
 Done.
-Tokenizing/SRL-ing hyp ... Done.
-Tokenizing/SRL-ing inp ... Done.
+Reading hyp sents... Done.
+Reading inp sents... Done.
+Creating hyp srlgraphs... Done.
+Creating inp srlgraphs... Done.
 Evaluating line 1
 Evaluating line 2
 Evaluating line 3