Config file format

Jorge Heleno edited this page Oct 3, 2016 · 1 revision

The file must have the following structure:

operator_id INPUT_OPS source_op_id1 | filepath1, ... , source_op_idn | filepathn REP_FACT repl_factor ROUTING primary | hashing | random ADDRESS URL1, ... , URL_REPL_FACT OPERATOR_SPEC operator_type operator_param_1, ... , operator_param_n

  • operator_id is the integer that identifies the operator;
  • The INPUT_OPS parameter lists the identifiers of the operators that provide this operator's input; since our program can also read from files, file paths can be used as sources as well;
  • The REP_FACT parameter indicates how many replicas of this operator will be activated;
  • The ROUTING policy specifies how the consumed tuples should be distributed among the replicas. It can take one of three values: (1) Primary: tuples are output to the primary replica of the operator; (2) Random: tuples are output to a random replica; (3) Hashing(field_id): tuples are output to a replica chosen by a hashing function;
  • The ADDRESS parameter specifies the list of URLs of all the replicas, one per replica. These URLs have the format tcp://<machine-ip>:<port>/op (choose any port except 10000 and 10001, which are reserved for the PuppetMaster);
  • The OPERATOR_SPEC parameter indicates the tuple transformation performed by this operator. It can be one of the following:
  1. UNIQ field_number: emit the tuple only if its value for field field_number is unique;
  2. COUNT: emit the number of tuples seen so far;
  3. DUP: emit the input tuple unchanged;
  4. FILTER field_number,condition,value: emit the input tuple if field field_number is larger than (">"), smaller than ("<") or equal to ("=") value;
  5. CUSTOM dll, class, method: send each received tuple, as a list of strings, to a custom method of a class within a specific class library (the dll), and output the tuples that method returns.
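As an illustration of the format described above, here is a minimal Python sketch that parses one operator entry. It is not the project's own parser: the names parse_line, OperatorConfig and SAMPLE, the sample addresses and the file name tweets.dat are all hypothetical. It also assumes that each entry sits on a single line, that the hashing policy is written as hashing(field_id), and that the ADDRESS list holds exactly one URL per replica.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# One entry per line, keywords in the order given by the format above.
LINE_RE = re.compile(
    r"^(?P<op_id>\d+)\s+INPUT_OPS\s+(?P<inputs>.+?)\s+"
    r"REP_FACT\s+(?P<rep>\d+)\s+"
    r"ROUTING\s+(?P<routing>\S+)\s+"
    r"ADDRESS\s+(?P<addresses>.+?)\s+"
    r"OPERATOR_SPEC\s+(?P<op_type>\S+)\s*(?P<params>.*)$"
)
# Assumption: hashing carries its field id in parentheses, e.g. hashing(1).
ROUTING_RE = re.compile(r"(primary|random)|hashing\((\d+)\)")


@dataclass
class OperatorConfig:
    operator_id: int
    inputs: List[str]          # operator ids or file paths, kept as strings
    rep_fact: int
    routing: str               # "primary", "random" or "hashing"
    hash_field: Optional[int]  # only set when routing == "hashing"
    addresses: List[str]
    op_type: str
    op_params: List[str]


def split_list(text: str) -> List[str]:
    """Split a comma-separated list and drop surrounding whitespace."""
    return [item.strip() for item in text.split(",") if item.strip()]


def parse_line(line: str) -> OperatorConfig:
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError("line does not follow the operator entry format")

    rm = ROUTING_RE.fullmatch(m["routing"].lower())
    if rm is None:
        raise ValueError("unknown routing policy: " + m["routing"])
    if rm.group(2) is not None:
        routing, hash_field = "hashing", int(rm.group(2))
    else:
        routing, hash_field = rm.group(1), None

    rep_fact = int(m["rep"])
    addresses = split_list(m["addresses"])
    # Assumption: the ADDRESS list has exactly one URL per replica.
    if len(addresses) != rep_fact:
        raise ValueError("expected %d replica URLs, got %d"
                         % (rep_fact, len(addresses)))

    return OperatorConfig(
        operator_id=int(m["op_id"]),
        inputs=split_list(m["inputs"]),
        rep_fact=rep_fact,
        routing=routing,
        hash_field=hash_field,
        addresses=addresses,
        op_type=m["op_type"],
        op_params=split_list(m["params"]),
    )


# Hypothetical entry: operator 2 reads from operator 1 and from a file,
# runs as two replicas, and keeps tuples whose field 3 is larger than 10.
SAMPLE = ("2 INPUT_OPS 1, tweets.dat REP_FACT 2 ROUTING hashing(1) "
          "ADDRESS tcp://192.0.2.1:11000/op, tcp://192.0.2.2:11000/op "
          "OPERATOR_SPEC FILTER 3,>,10")
```

Calling parse_line(SAMPLE) returns an OperatorConfig with routing set to "hashing", hash_field set to 1, two replica addresses and the operator parameters split into the list ["3", ">", "10"]. The sample uses port 11000 to stay clear of the two ports reserved for the PuppetMaster.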