\chapter{Performance Objectives}
\label{chap:perf}
The bilateral teleoperation problem is probably one of the most difficult problems in the control field due to its
subjective nature, involving human comfort and preference. However, to put it bluntly, the experts are not helping
either. In other words, most of the \enquote{good performance} motivations stem from first-principles modeling
rather than from user experience.
Some authors argue that, since most of the tools we utilize are (almost) lossless, say, a
screwdriver or even a simple stick, it is natural to seek a passive bilateral teleoperation system (though, if we
insist on this analogy, a lossless system should be sought) that ideally behaves like a rigid transmission mechanism. In the work of
Daniel and McAree \cite{danielmcaree}, it has been strongly recommended that one should focus completely on the underlying physics and in
fact the authors boldly established an essential limit to what can be achieved by bilateral teleoperation:
\begin{displayquote}[{\cite{danielmcaree}}][.]
Although our motivation comes from the problem of building teleoperators to perform tasks of this sort, there is much here that we feel is
common to all teleoperation. The performance of every force-reflecting system is ultimately governed by dynamic interactions between the
master, the slave, the human operator, and the environment. For some applications different effects may dominate; e.g., transmission delays
limit what can be achieved in space applications of telerobotics ([...]). But one can never achieve better performance than that determined
by rudimentary physics. For this reason, we call the limits of performance examined here \emph{fundamental limits of performance}
\end{displayquote}
This statement summarizes perfectly what the contemporary bilateral teleoperation literature promotes. The reason why we strongly
disagree with it should be evident by the end of this chapter and later in \Cref{chap:application}. Still, to provide a contextual
introduction to the discussion, we first note that the underlying physics does not only consist of two distant robots interacting with
their corresponding surroundings. The actual physics involves the human perception, and that gives us the freedom to display whatever is considered
to be \enquote{cool, nice, crisp, real, helpful} by the human operator. It does not matter if we are off by \SI{10}{\newton} or if some other reflected
quantity is not in accordance with the actual measurement; as long as the human operator is happy with the result in terms of immersion and
touch sensation, we are done. There is nothing fundamental in terms of technological limits of performance, and there is nothing wrong with
approximating (or even altering) reality to achieve the required technology. In fact, in our opinion, this is an instance of a common academic
practice: obfuscation by purism.
In order to justify such fundamental limitations, two ingredients are missing in \cite{danielmcaree}: the precise tools that
do not bring in any lossy simplifications, and the exact mechanism described completely by the actual physics. Unfortunately, their analysis
relies completely on LTI root-locus tools, and human preference is openly skipped. Therefore, their results can at best be the fundamental
limitations of what the literature claims bilateral teleoperation to be.
In general, there is no established consensus on what makes a teleoperation system \enquote{good}. Quite the contrary,
this question is openly and unambiguously avoided in some well-acknowledged articles. Instead, some possibilities are
proposed and rigorously pursued to the end without actually validating whether these possibilities reflect our intention.
Therefore, the conclusions that these studies arrive at are the implications of their initial hypotheses. However,
the studies that follow these publications do not take this crucial detail into account and proceed as if these
performance objectives are indeed the ultimate goals. We have to claim that the motivation of most of the studies
in the literature erroneously puts emphasis on performance objectives that are at best suggestions and are often
questionable.
On the other hand, there is a different school that focuses only on stability of the teleoperation system in the
face of human, environment, communication line, quantization and many more uncertainty/perturbation sources.
Especially some nonlinear control studies do not even bother to define performance criteria. This view simply
regards the human/environment as perturbations for our precious robotic systems and neglects the \enquote{\emph{%
raison d'\^{e}tre}} of the very problem that is under consideration. In our opinion, operator perception is the
indispensable performance objective and cannot be overlooked. But it is also very difficult to quantify. More
importantly, it is beyond the scope and expertise of control theory (though with a certain overlap) to find the relevant
objectives. Experts of the related fields need to contribute from a technological point of view, in
contrast with a purely physiological point of view, and, in fact, should guide the control theorists and practitioners
towards the relevant issues. This lack of performance specifications is the main reason why we have left a considerable
body of research out of our literature survey.
We strongly believe that the contemporary bilateral teleoperation control results, including this thesis,
cannot and thus should not claim a comprehensive understanding of a good and useful bilateral teleoperation
system, because we just do not know, yet.
\section{Types of Performance}
In terms of the quality of the force feedback, there are a few leading choices of methodological performance
definitions. The widely accepted \enquote{transparency} stems from the ideal case of a lossless, undistorted,
exact replica of the remote-side physics at the local site. Hence, the ultimate goal is selected to be
faithfully representing the remote-site motion and allowing the user to intervene just as well as if s/he were operating
directly at the remote site. Let us provide some background on how transparency is often defined and/or motivated. We
choose to follow \cite{hirchebookchap} for no particular reason other than the manuscript being the introduction/survey
chapter of the recent collection \emph{Advances in Telerobotics} (but alternatively \cite{dudragne,yokokohjiyoshikawa,hannaford89} and many
others can be followed too); first, we again recognize the motivation for modeling via $n$-ports
\begin{displayquote}
For the analysis and control synthesis the modeling of the bilateral teleoperation system as interconnection of two-ports, [...], is
convenient.
\end{displayquote}
Then, the intuitive definition of transparency is provided. We draw attention to the
transition from the informal description to the technical formulation of transparency.
\begin{displayquote}[][...]
Transparency of the telerobotic system is the major goal in bilateral control
architecture design.
\begin{define*}[Transparency] The telerobotic system is transparent if the human operator feels as if directly interacting with the (remote)
task [10]\footnote{Reference \cite{rajuverg} of this thesis.}
\end{define*}
Formally, transparency is achieved if the transmitted and the environment impedances match [11]\footnote{Reference \cite{lawrence} of this
thesis.} as also indicated above
\[
Z_t=Z_e
\]
or alternatively if HSI (master) and teleoperator (slave) movements are equal and the force displayed to
the human operator is exactly the reaction force from the environment [12]\footnote{Reference \cite{yokokohjiyoshikawa} of this thesis.}
\[
x_h = x_e\text{ and }f_h = f_e
\]
Transparency in this sense is in practice not achievable as the device dynamics comprising inertia and friction cannot completely
cancelled by control. Communication effects, especially time delay, severely degrade the achievable transparency. The development of
quantitative measures is part of the transparency analysis
\end{displayquote}
This transition appears in practically all documents in the literature that treat transparency as the major goal. In particular, the human
\emph{feel} suddenly disappears from the picture and only the physics between the tips of the robotic devices is considered. We know of no source
in the literature that avoids, or at least mentions, this interesting omission. Let us first give a detailed discussion of transparency to be
able to clarify what exactly is omitted, since there are a few layers of omission in such arguments.
\subsection{Transparency}
Typically, we classify materials as opaque or transparent based on how well we can see through an item made of that
particular material. If we simply replace the act of seeing the other side with touching the remote location, then the
less a system distorts the remote motion transmitted to the local site, the more transparent the system is. Hence the
term \emph{transparency}. This is defined independently in \cite{lawrence,yokokohjiyoshikawa}. In fact, the notion of
transparency is one of the three ideal response definitions in \cite{yokokohjiyoshikawa}. Resuming the notation from
\Cref{chap:litsurvey}, a perfectly or ideally transparent 2-port network admits the hybrid matrix
\[
\pmatr{v_{hum}\\f_{hum}}=\pmatr{0&I\\-I&0}\pmatr{f_{env}\\v_{env}}
\]
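Reading off the rows of this hybrid description, the ideal interconnection simply enforces
\[
v_{hum} = v_{env}, \qquad f_{hum} = -f_{env},
\]
that is, the velocity (hence position) and force matching conditions quoted above, with the sign difference presumably stemming from the power-flow convention adopted for the environment port.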
If we wish to translate this into a control-theoretic performance objective, we have essentially two options.
The first option is to minimize the difference between the measured force/velocity signals of the actual system and the force/velocity signals
that the hypothetical perfectly transparent system would have exhibited in that particular configuration. The second option is to make our system
behave as much as possible like an ideal transparent system.
Let $N$ denote the overall controlled teleoperation system model with the controller $K$, i.e. $N(K)$. Then we can
define a performance index
\[
\min_K \abs{\pmatr{f_{hum}-F_{r}\\f_{env}-F_{l}\\x_{loc}-x_{\vphantom{l}rem}\\\vdots}}
\]
for all the admissible signals of time in some suitable normed space that we wish to consider. Alternatively, using a suitable system norm
the problem becomes
\[
\min_K \abs{N(K) - \pmatr{0&I\\-I&0}}.
\]
Notice that the selection of a suitable norm is far from trivial, and we might even need an amalgam of different norms to bound both the energy
and the maximum value of the error signals. However, this choice is understandably limited by the tools that we have for analysis and
synthesis. Enforcing an ideal teleoperation in this way is obviously intuitive and agrees with the underlying physics, but only if we remove
human perception from the loop. Thus, one can argue that the global minimizer of these optimization problems leads to the best
teleoperation system only if we exclude the operator's opinion. Moreover, there are two additional implicit assumptions made here. On the one hand,
it is assumed that there is a partial ordering; in other words, if $K_1$ has the cost $c_1$ and $K_2$ has the cost $c_2$ with $c_1<c_2$,
then $K_1$ is better than $K_2$, which is not necessarily true or, rather, might hold only for some particular $K$'s.
In general, no such ordering can be expected from this performance index. Moreover, it is not easy to search for such a $K$. As we have observed
in the literature, almost every transparency-optimized control method selects the full performance first, hence assuming the global minimizer,
and then tries to stabilize the system in the face of those performance specifications. Obviously, after adding dampers or other dissipative
elements, the system is no longer a transparent teleoperation system and, moreover, we do not have a way to measure how far we are off from the
initial perfectly transparent system, since we only know what is ideal.
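To make the second formulation more concrete, the sketch below numerically estimates the distance $\sup_\omega\bar{\sigma}\big(N(\iw)-J\big)$ between a toy, hypothetical closed-loop hybrid model $N$ and the ideal interconnection $J$ on a frequency grid; the particular transfer functions are made up purely for illustration and do not correspond to any device treated in this thesis.
\begin{verbatim}
import numpy as np

# Hypothetical 2x2 closed-loop hybrid model N(s) evaluated at s = jw.
# The entries are toy first-order responses chosen only to illustrate
# the computation; they model no particular device.
def N(jw):
    return np.array([[0.05*jw/(jw + 20.0),   1.0/(0.01*jw + 1.0)],
                     [-1.0/(0.02*jw + 1.0),  0.1/(jw + 10.0)]])

J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # ideal transparent 2-port

w = np.logspace(-1, 3, 2000)              # frequency grid [rad/s]
# largest singular value of N(jw) - J at each grid point
dist = [np.linalg.svd(N(1j*wk) - J, compute_uv=False)[0] for wk in w]

print("grid estimate of sup_w sigma_max(N - J): %.3f" % max(dist))
\end{verbatim}
Such a grid evaluation only lower-bounds the induced norm, which already hints at the computational subtleties of the norm selection discussed above.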
On the other hand, we do not have a metric for how close we need to get to the ideal matrix. Let us first quote three
very important questions posed by Lawrence in his well-known paper \cite{lawrence}:
\begin{displayquote}[{\cite{lawrence}}][.]
In practice, perfectly transparent teleoperation will not be possible. So it makes sense to ask the following questions:
\begin{itemize}
\item What degree of transparency is necessary to accomplish a given set of teleoperation tasks?
\item What degree of transparency is possible?
\item What are suitable teleoperator architectures and control laws for achieving necessary or optimal transparency?
\end{itemize}
We focus on the second two questions in this paper. Instead of evaluating the performance of a specific teleoperation architecture,
as in [2], we seek to understand the fundamental limits of performance and design trade-offs of bilateral teleoperation in
general, without the constraints of a preconceived architecture
\end{displayquote}
Lawrence then invokes the passivity assumption on the end terminations, and hence the passivity theorem is utilized to arrive at structural
properties of the controller $K$. Evidently, this allows for the back-substitution of the controller entries and a solution
for the ideal case. The resulting controller is then denoted the \enquote{\emph{Transparency Optimized Controller}}. In
control jargon, this amounts to a cross-coupling control action where the bilateral dynamical differences
are canceled out and the SISO control channels are then tuned to maximum performance, subject to the stability constraints.
Even if we accept this to be the distinguishing performance criterion, we have to emphasize that we have not touched the most
important question, that is, the first of the three; rather, we hope to achieve transparency levels just high enough
to fool the user. After two decades, this point is simply discarded and many studies in the literature
treat the conclusions of Lawrence in a different context than the one given by Lawrence. As is the case for
Hogan's paper on passivity, Lawrence never claims that this is a definitive performance measure. Instead, he clearly shows the
implications that follow from such assumptions.
Finally, there are interesting studies in line with our claims about the irrelevance of remote-media recreation in the bilateral
teleoperation problem. For example, \cite{kilchenman,wildenbeest,boessenkool} and a few other studies report that there is a saturation
effect on how much realism can be projected to the user. In other words, there is an inherent bandwidth limitation on
the realism increase, such that beyond a certain frequency band the transparency does not increase significantly, possibly
unless backed up by tactile feedback. Even further, in the case of shared control applications, it might happen that transparency
is not needed at all.
\subsection{\texorpdfstring{$Z$}{Z}-width}
In \cite{colgate4}, the performance of a haptic device is related to the dynamic range of impedances (hence the name $Z$) that
the device can display to the user. In this context we have two extremes: on the one hand, purely the local device impedance
for free-air motion and, on the other hand, the maximally stiff local device for collision with a rigid and immobile obstacle.
Let $Z_f$, $Z_c$ denote these two distinct cases. Then, the more pronounced the difference between these impedances, the more
capable the teleoperation system is of reflecting the various impedances in between. Thus, we implicitly assume that the rigid contact
case and the free-air case are the extreme points of the uncertainty set, and that testing these two cases is sufficient to
conclude that any impedance on the path from $Z_f$ to $Z_c$ is a valid impedance that can be displayed by the device. This
in turn implies that there is an ordering in the uncertainty set from \enquote{big} to \enquote{small}, and moreover that
the destabilizing uncertainty is at the boundary of the set, such that these two extreme cases can vouch for stability over
all possible environments. We are not convinced that this should be the case for all possible environment
scenarios. A particular subset of second-order mass-spring-damper environment models can be shown to be compatible
with this claim if the passivity theorem is used. However, when combined with other uncertain blocks in the loop, we do not see
how the argument follows. Note that it is well known in the robust control literature that a destabilizing uncertainty need
not lie on the boundary of the uncertainty set. Therefore, either by gridding the uncertainty set and testing the
stability conditions on a large number of points, or by a suitable relaxation of the constraints, the conditions should be
translated into finitely many (and computationally tractable) tests such that stability is guaranteed over the whole
uncertainty set.
Similar to what Lawrence has given, the authors also include a clear statement of purpose:
\begin{displayquote}[{\cite{colgate4}}][.]
This paper will not address the psychophysics of what
makes a virtual wall \enquote{feel good} except to say that one
important factor seems to be dynamic range. An excellent
article on this topic has recently been written by
Rosenberg and Adelstein [11]\footnote{Reference \cite{rosenberg} of this thesis.}.
We will present instead
some of our findings, both theoretical and experimental,
concerning achievable dynamic range. In short, we will
address the question of how to build a haptic interface
capable of exhibiting a wide range of mechanical
impedances while preserving a robust stability property
\end{displayquote}
Under these assumptions, by defining a functional to measure the distance between $Z_f$ and $Z_c$, we can assess the performance
of different bilateral teleoperation devices. In \cite{goranthesis}, this so-called $Z$-width is defined as
\begin{equation}
Z_{\text{width}} = \int_{\omega_0}^{\omega_1}{\abs{\log|Z_{\vphantom{f}c}(\iw)|-\log |Z_f(\iw)|}}d\omega
\label{eq:zwidth}
\end{equation}
or alternatively, a simulation/experiment-based method can be utilized as in \cite{weir}.
Note that \eqref{eq:zwidth} does not appear in the original paper \cite{colgate4} but is proposed in \cite{goranthesis,passenberg},
though we can see neither the reasoning behind this expression nor how it constitutes a comparative quantity. In both
\cite{colgate4,goranthesis}, no additional information is provided except some general rules of thumb about device
damping and other related issues.
It should be noted that the differences at each frequency are lumped into one scalar number and, moreover, the impedance
gain curves can cross each other (see \cite{goranthesis}), which might lead to an overly optimistic result. Similarly,
resonance peaks and zeros of the involved impedances can be smeared out if we rely solely on this functional.
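For illustration, a minimal numerical sketch of \eqref{eq:zwidth} is given below for two made-up impedances: a free-air case modeled as a pure inertia and a contact case modeled as a stiff spring-damper. The parameter values and the integration band are hypothetical and serve only to show how the frequency-wise differences are lumped into a single scalar.
\begin{verbatim}
import numpy as np

# Hypothetical impedances (force over velocity), for illustration only:
#   free air : pure inertia        Z_f(jw) = m*jw
#   contact  : spring plus damper  Z_c(jw) = k/(jw) + b
m, k, b = 0.1, 5000.0, 20.0
Zf = lambda w: m*1j*w
Zc = lambda w: k/(1j*w) + b

w = np.logspace(0, 3, 4000)      # integration band [1, 1000] rad/s
diff = np.abs(np.log(np.abs(Zc(w))) - np.log(np.abs(Zf(w))))

# trapezoidal quadrature of (eq:zwidth)
z_width = np.sum(0.5*(diff[1:] + diff[:-1])*np.diff(w))
print("Z-width over [1, 1000] rad/s: %.1f" % z_width)
\end{verbatim}
Note that the absolute value in the integrand hides the frequency at which the two gain curves cross, which is precisely the source of the optimism mentioned above.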
Since $Z_f$ and $Z_c$ are functions of the environment impedance, these curves can be obtained for one particular environment
at a time. This also holds for the derivation of \cite{lawrence}. In \cite{goranthesis}, the difference is evaluated for more
than one environment and then averaged, i.e., let $Z_{act}(Z_e)$ be the impedance displayed to the user in order to render
$Z_e$ on the local site. Then, for a particular controller, the average $Z$-error over the candidate $Z_e$'s is given by
\begin{equation}
Z_{avgerr} = \frac{1}{n}\sum_{j=1}^n{\left[
\frac{1}{\omega_{1j}-\omega_{0j}}\int_{\omega_{0j}}^{\omega_{1j}}{%
\abs{\log|(Z_{act}(Z_{ej}))(\iw)|-\log |Z_{ej}(\iw)|}}d\omega.
\right]}
\label{eq:zdiff}
\end{equation}
This cost function is denoted by \enquote{\emph{Transparency Error}} or \enquote{\emph{Fidelity}}. We refer to \cite{weir}
for a more detailed discussion.
\subsection{Fidelity}\label{sec:perf:fidelity}
In \cite{cavusoglu}, a variant of the transparency error is proposed to assess the performance. In this context,
the emphasis is on the variation of the environment impedance and the resulting effect on the displayed impedance.
The motivation is focused on surgical procedures performed via bilateral teleoperation. If, for
example, the remote device slides over some tissue that contains a tumor or any other irregularity that would be felt
had the same motion been performed directly by the surgeon, then the better the nuances are transmitted, the higher
the fidelity. This performance objective, in a sense, emphasizes the high-frequency content of the information (closer
to the tactile bandwidth). It has been noted that the Just Noticeable Difference (JND) of \SIrange{14}{25}{\percent} for
distinguishing the relative compliance of similar surfaces goes under \SI{1}{\percent} for rapid compliance-variation detection
while, say, scanning a surface (see \cite{dhruvtendick} or \cite{greenishhayward} for a haptic scissors analysis). Similar to the
definitions given for transparency, the change of the displayed impedance $Z_{disp}(Z_e)$ with respect to the change in the environment
$Z_e$ can be obtained via a straightforward calculation.
Consider again the system interconnection depicted in \Cref{fig:nom_net}, with the scalar complex LTI uncertainty block
$\Delta_e$ and the LTI plant $G\in\mathcal{RH}_\infty^{2\times 2}$. The upper LFT interconnection of $\Delta_e$ and $G$ is given by
\[
P = G_{11}+G_{12}\Delta_e\inv{(1-G_{22}\Delta_e)}G_{21}
\]
Here $P$ denotes the impedance seen by the operator, $G$ the teleoperation system, and $\Delta_e$ the
environment impedance. Now, under a well-posedness assumption, define the derivative with respect to $\Delta_e$,
\[
\frac{d}{d\Delta_e} P = G_{12}(1-G_{22}\Delta_e)^{-2}G_{21}
\]
then, though not pursued in \cite{cavusoglu} and left as a complication, this can in turn be rewritten as an LFT:
\[
\pmatr{q_1\\q_2\\z} = \pmatr{G_{22}&&\\&G_{22}&\\&&G_{12}}\pmatr{1&0&1\\1&1&1\\1&1&1}\pmatr{1&&\\&1&\\&&G_{21}} \pmatr{p_1\\p_2\\w}
\]
and
\[
\pmatr{p_1\\p_2} = \pmatr{\Delta_e &0\\0 &\Delta_e}\pmatr{q_1\\q_2}.
\]
This representation can simply be read from the interconnection depicted below:
\[
\begin{tikzpicture}[>=stealth,baseline]
\node[draw] (g12) at (0,0) {$G_{21}$};
\node[circle,draw,right= 5mm of g12,inner sep=1pt] (sum1) {};
\coordinate[right=2cm of sum1] (j1);
\node[draw,below=2mm of j1] (g22) {$G_{22}$};
\node[draw,below left=2mm and 2mm of g22.south west] (d1) {$\Delta_{e}$};
\draw[->] (j1) -- (g22) |- (d1) node[pos=0.5,below] {$q_1$}-| (sum1) node[pos=0.5,below] {$p_1$};
\node[circle,draw,right= 5mm of j1,inner sep=1pt] (sum2) {};
\coordinate[right=2cm of sum2] (j2);
\node[draw,below=2mm of j2] (g22) {$G_{22}$};
\node[draw,below left=2mm and 2mm of g22.south west] (d2) {$\Delta_{e}$};
\draw[->] (j2) -- (g22) |- (d2) node[pos=0.5,below] {$q_2$}-| (sum2) node[pos=0.5,below] {$p_2$};
\node[draw,right= 7mm of j2] (g21) {$G_{12}$};
\draw (g12) --(sum1) -- (sum2) -- (g21);
\draw[<-] (g12) --++(-1cm,0) node[left]{$w$};
\draw[->] (g21) --++(1cm,0) node[right]{$z$};
\end{tikzpicture}
\]
Note that the matrix case follows similar but more involved steps via the Fr\'{e}chet derivative. Moreover, one can now
recognize the familiar plant-uncertainty representation more clearly and without any complication.
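As a quick sanity check, the symbolic sketch below verifies that the interconnection above, with its two copies of $\Delta_e$, indeed reproduces $G_{12}(1-G_{22}\Delta_e)^{-2}G_{21}$ from $w$ to $z$; the scalar symbols are generic placeholders for the transfer functions evaluated at a fixed frequency.
\begin{verbatim}
import sympy as sp

# Generic scalar placeholders for G12, G21, G22 and Delta_e at a
# fixed frequency, plus the external input w.
G12, G21, G22, D = sp.symbols('G12 G21 G22 Delta')
w, p1, p2 = sp.symbols('w p1 p2')

# Loop equations read off the block diagram:
#   q1 = G22*(p1 + G21*w),  q2 = G22*(p1 + p2 + G21*w),  p_i = Delta*q_i
sol = sp.solve([sp.Eq(p1, D*G22*(p1 + G21*w)),
                sp.Eq(p2, D*G22*(p1 + p2 + G21*w))], [p1, p2], dict=True)[0]

z = G12*(sol[p1] + sol[p2] + G21*w)       # output of the interconnection
claimed = G12*G21*w/(1 - G22*D)**2        # (dP/dDelta_e) acting on w

print(sp.simplify(z - claimed))           # prints 0
\end{verbatim}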
Consequently, the authors define a transparency-like performance objective using a rather subtle choice of system $2$-norms.
The measure of fidelity is defined as the norm
\[
\norm{\left.W_s\frac{dP}{d\Delta_e}\right|_{\Delta_{enom}}}_2
\]
where $W_s$ is a typically low-pass weighting function emphasizing the frequency band of interest. Therefore,
the synthesis problem is to find the optimizing controller $K$ for the problem
\[
\sup_{\substack{\text{Stability}\\\text{Other Constraints}}}\inf_{\Delta_{ei}\in\bm{\Delta_e}}
\norm{\left.W_s\frac{dP}{d\Delta_e}\right|_{\Delta_{ei}}}_2
\]
where $\Delta_{ei}$ are the worst case environments that are of interest.
\begin{define*}[System $2$-norm]Let $H$ be a stable strictly proper LTI system with transfer matrix $H(s)$. Then
\[
\norm{H}_2^2 \coloneqq \infint{\trace(H^*(\iw)H(\iw))}d\omega
\]
\end{define*}
There are a few interpretations of this norm in the literature, mainly the deterministic \enquote{area under the Bode
plot} interpretation, i.e., the energy of the impulse response in the scalar case, and the stochastic \enquote{steady-state
white-noise-input response}. We are under the impression that the authors argue along the lines of the former interpretation,
with a reasoning similar to that given in the $Z$-width discussion via an area computation.
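To connect the two interpretations, the short sketch below numerically checks, for the toy first-order system $H(s)=1/(s+1)$, that the frequency-domain integral above (with the conventional $1/2\pi$ normalization assumed here) coincides with the energy of the impulse response $h(t)=e^{-t}$, both evaluating to $1/2$.
\begin{verbatim}
import numpy as np
from scipy.integrate import quad

# Toy system H(s) = 1/(s+1): |H(jw)|^2 = 1/(1+w^2), impulse response e^{-t}
freq_int, _ = quad(lambda w: 1.0/(1.0 + w**2), -np.inf, np.inf)
h2_sq_freq = freq_int/(2*np.pi)     # assumed 1/(2*pi) normalization

h2_sq_time, _ = quad(lambda t: np.exp(-2.0*t), 0.0, np.inf)

print("frequency-domain H2 norm squared: %.4f" % h2_sq_freq)   # 0.5000
print("impulse-energy   H2 norm squared: %.4f" % h2_sq_time)   # 0.5000
\end{verbatim}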
Designing a robust controller that minimizes the $\mathcal{H}_2$ norm of an uncertain system in the face of a predefined
uncertainty set, i.e., the \enquote{Robust $\mathcal{H}_2$ Synthesis} problem, has already received a lot of attention and
results can be found in the literature, e.g., \cite{dullerud}. Hence, the problem definition in \cite{cavusoglu} is
in fact tractable. However, it is not clear to us why the system $2$-norm is chosen for the performance cost. Additionally,
the infimum needs to be computed over a set of infinitely many points; hence an appropriate relaxation is required.
This point is also not addressed, though a gridding approach seems to be utilized in the numerical optimization procedure
described in the paper.
It is also not clear when we should utilize this performance objective. The initial difficulty is that all the
involved operators are LTI, hence there is no time variation involved. The test reads as follows: \emph{we assume an arbitrary element
in the predefined uncertainty set, say $\Delta_e$, and evaluate the norm of the weighted derivative at $\Delta_e$ (assuming its
existence); hence, in some $\epsilon$-neighborhood of $\Delta_e\in\bm{\Delta_e}$ we can see the change in $P$}. Thus, around this environment
candidate, this result tells us how much the fidelity measure would change. Performing the same check for all elements in the uncertainty
set, we find the lowest fidelity value and, by taking the supremum over all stabilizing controllers, we try to increase this value globally.
But the environment is still assumed to be LTI and hence cannot change. Therefore, for each fixed environment we obtain different structural
properties of the system.
Note that this does not imply that time variations are taken into account. Suppose a particular admissible trajectory
$\hat{\Delta}_e(t)$ in time is given such that $\hat{\Delta}_e(t_1)=\Delta_e$, i.e., its time-frozen LTI copy coincides
with the particular nominal environment model $\Delta_e$, and at some time instant $t_2$ it coincides with another
LTI model $\mathring\Delta_e$ that is within some $\epsilon$-neighborhood of $\Delta_e$. Even if we achieve very good
fidelity properties evaluated at each $\Delta_e$ and at a sequence of LTI models, each lying in a small neighborhood
of the other, this does not guarantee that we would have good fidelity along the trajectory $\hat{\Delta}_e$. Actually,
it might be more desirable to have low fidelity, since drastic changes in the performance with respect to LTI uncertainties
might confuse the user.
The same misconception is also visible in the transparency formulations in the literature. Evaluating the LTI expressions at
nominal environments, e.g., $Z_e=0,\infty$, does not imply that the transitions are covered.
\section{Closing Remarks and Discussion}
There are a few other performance criteria reported in the literature. Consider the definition of the impedance seen by
the operator, $P$, above. In \cite{iida,katsura}, this term is divided into two individual terms, denoted
\enquote{reproducibility} and \enquote{operationality}. The idea is similar to the sensitivity/complementary sensitivity
function definitions.
In \cite{yokokohjiyoshikawa}, an ideal response is also partitioned into two parts, denoted the \enquote{index of
maneuverability}, which is, in essence, similar to what is given above and is hence omitted.
We refer to the survey papers \cite{hokayemspong,passenberg} for a general treatment, and to \cite{klomp,dennis} and the references
therein for a more detailed overview of the many variations in the literature.
In summary, there are no general performance criteria that can lead to a dedicated control design procedure. The aforementioned
performance objectives always start from the direct manipulation case and assume a distance between the interacting bodies.
Then the implications of such hypotheses are pursued and some results are obtained. It might very well happen that all or none
of those conclusions are correct. Put differently, these studies always try to remedy the distortion caused by the split of two
interacting bodies, i.e., teleoperation. Thus, the goal becomes too ambitious at the outset. Similar to the delay phenomenon,
there is not much we can do about the distortion within the laws of physics. In fact, even the slightest delay can destabilize
the system, which is again an indicator of the fragility of the problem formulation.
Thus, we can speculate that, by doing so, we create a stability problem that we should not have had in the first place. Moreover,
as we have mentioned in the introduction, the problem is exclusively about human perception and is not about the reconstruction
of the remote scene. As long as we can \enquote{fool} the user for the sake of efficiency and operational comfort, we are done.
Since we fail to provide an alternative performance criterion, we resort to a particular analogy from audio
technology to express more clearly what we intend to emphasize.
When it comes to the faithful reconstruction of recorded audio, contemporary high-end sound systems offer great fidelity, hence
the name hi-fi systems\footnote{The term was coined well before the systems became truly hi-fi compared to today's systems.}. There has
been such great success that listeners and component manufacturers are now striving for full audio immersion, i.e., listening to
a recording that feels like actually sitting in the concert hall or venue. However, very similar to the transparency discussion
in bilateral teleoperation, there is a fundamental obstacle to rendering a live performance sound with its recorded version.
The following quote is from \cite{atkinson}:
\begin{displayquote}
\enquote{What on earth can be the readily identifiable difference}, I wrote in 1995, \enquote{between the sound of a loudspeaker
producing the live sound of an electric guitar and that same loudspeaker reproducing the recorded sound of an electric guitar?}
I went on to conjecture that the act of recording inevitably diminishes the dynamic range of the real thing. The in-band phase
shift from the inevitable cascade of high-pass filters that the signal encounters on its passage from recording microphone to
playback loudspeakers smears the transients that, live, the listener perceives in all their spiky glory. And as a high-pass filter
is never encountered with live acoustic music, that's where the essential difference must lie, I concluded, quoting Kalman Rubinson
(...) that \enquote{Something in Nature abhors a capacitor.}
But two more recent experiences suggest that there must be more to the difference than the presence of unnatural high-pass filters.
(...)
\end{displayquote}
The author goes on to list the involved hardware, the signal chain, and other relevant details about the microphone set-up at both events.
In the first event, a performer plays a piece through the listed hardware and is simultaneously recorded by the author. Then the recorded
version of the performance is played back to the same audience through the same hardware. In the second event, a nontrivial analog/%
digital hybrid device, which supposedly replicates a grand piano via sophisticated mechanisms, is used to generate the sound. In both
events, it has been noted that, though the reproduction quality was quite impressive for the audience, a certain liveness was missing.
\begin{displayquote}
So these days, I'm starting to feel that it is something that is never captured by recordings at all that ultimately defines the
difference between live and recorded sound. (...)[the described systems] succeeded in
every sonic parameter but one: the intensity of the original sound. Intensity, defined as the sound power per unit area of the
radiating surface, is the reason why, even if you could equalize a note played on a flute to have the same spectrum as the note
played on a piano at the same sound pressure level, it will still sound different.
Ultimately, therefore, it is perhaps best to just accept that live music and recorded music are two different phenomena. (...)
Eisenberg's thesis\footnote{See \cite{eisenberg}} is that any attempt to capture the sound of an original event is doomed to
failure, and that stripping a concert from its cultural context by recording only the audio bestows a sterility on the result
from which it cannot escape. The recording engineer may be able to pin the butterfly to the disc, but it sure doesn't fly any
more.(...)
In Eisenberg's words, \enquote{In the great majority of cases, there is no original musical event that a record records or reproduces.
Instead, each playing of a given record is an instance of something timeless. The original musical event never occurred; it exists, if
it exists anywhere, outside history.}
\end{displayquote}
Obviously, these are all subjective opinions rather than rigorous scientific propositions, though the first anecdote can be considered
a user experience study. However, we have to note that audio technology is tremendously advanced compared to haptics and
teleoperation. In fact, the comparison is not fair in the sense that bilateral teleoperation is not a true technology yet but rather
is in its infancy. Still, after decades of improvements, sound systems are not capable of producing a live sound real enough to fully
immerse the listener, though they come impressively close. The reader should also keep in mind that audio technology is a unilateral
process with no interaction with the loudspeaker, and yet it still lacks sufficient realism. Nevertheless,
the performance objective studies, of which we have enumerated a few above, claim to compete at the level of hi-fi systems, which is simply too
ambitious.
Coming back to our discussion, in the light of our analogy, we think that the bilateral teleoperation literature is focused on finding
the system that can deliver the \enquote{live sound} rather than a high-quality \enquote{playback}. This holistic search is certainly
relevant to the field, but it cannot serve as the justification for the technological advances reported in the literature. Task-%
dependence is already emphasized in many studies as an item of importance, and a too-general performance criterion would be very unlikely
to serve as a general guideline. Though we acknowledge the motivation behind the holistic approach, and a truly transparent device
might be the ideal, we also believe that the timing and feasibility of this approach need to be reconsidered. The immediate engineering
problems, such as communication delays and other contemporary technological issues, are strong indicators that such
a goal cannot be tackled prematurely. We cannot overemphasize the key issue: even the undelayed case remains unresolved, let alone
the (time-varying or constant) delayed case.
Again from our analogy, it took decades for hi-fi systems to reach the current level at which a search for the live sound can be
justified. The sound reconstruction task is divided into components such as amplifiers, pre-amplifiers,
direct digital-to-analog converters, etc. for the signal conditioning and, similarly, the sound regeneration is divided into active/passive
loudspeakers with dedicated single or multiple tweeters, sub-woofers, etc. Only then was the community convinced that the hardware
is not the problem\footnote{We also have to state that there is an additional compulsive habit of overemphasizing component quality, such as that of the transmission cables.
Hence, we can observe a trend among hi-fi enthusiasts of picking up artifacts that cannot possibly be audible or simply do not exist.}.
Similarly, TVs and other futuristic vision technologies are following the same trend towards the ultimate vision quality. However, compared
with these, bilateral teleoperation definitions are nothing but academic stability-problem exercises. Moreover, as we show in the next
chapter, these problems, whether linear or nonlinear, are no different from mainstream control problems in disguise. Therefore, we
cannot yet argue for a dedicated stability and performance problem for bilateral teleoperation.
This is the reason why we have chosen a significantly more advanced methodology, again from mainstream control theory, and applied
it to the bilateral teleoperation problem. This does not imply that we have offered a valid alternative; in fact, quite to the contrary, our goal
is to make it obvious that the studies so far can be subsumed into a general and widely used methodology and that there is no benefit to
the current specialization of the field. However, we have also shown that, if the problem is in fact a special case of a general control problem,
there is no need to use outdated versions of the techniques proposed in the control literature while important advances over the plain
small-gain/passivity-theorem-based results have been reported in the past two decades.
A search over the studies published in the literature from the 2000s to date with \enquote{bilateral teleoperation} and
\enquote{delay} as keywords gives hundreds of results. Yet we have no clear understanding of why these devices are unstable. We
can pinpoint different effects depending on whether we look at it from an energy exchange/passivity point of view or from a
sensitivity function-based analysis. But this does not help us to prioritize certain design aspects of a high-performance system,
and unfortunately we have to assume a few questionable hypotheses along the way. It is very difficult to follow the train of thought
often given in the literature: we first define the ultimate performance of a teleoperation system, then openly accept the fact that
this is not achievable; however, we do not, in turn, modify our performance criteria and then completely neglect the issue. Finally, we
convert the problem into the stability of some interconnected devices with great uncertainty associated with them and then, in the majority
of cases, inject damping into the hardware, which deteriorates not only the relative performance but the usability of the device as a whole.
Specific to the problem at hand, a stable but poor-performing bilateral teleoperation has less functionality than a
unilateral teleoperation, as the added value of the robotic manipulators is wasted by the inclusion of damping.
After using the audio analogy extensively, let us finish the alternative suggestions with the same analogy. As we have briefly mentioned,
hi-fi loudspeakers involve dedicated components for different frequency bands, such as tweeters for the high-frequency band,
sub/woofers for the low-frequency bands, and occasionally mid-range speakers, etc., with individual drivers. Even though different components are
utilized, the resulting harmony of these components leads to a very satisfactory listening experience compared to generic single-driver
loudspeakers. Now, the somatosensory system obviously already involves such sensors, which are reported to be responsible for different
frequency bands (see \Cref{chap:apdxphysio}). Hence, this already gives us a direct cue for separating the motion into different categories
in terms of frequency/amplitude content. We do not have a working methodology yet; however, we would like to include our reasoning here
for comparison.
There have been quite a number of studies published on combining tactile feedback with kinesthetic feedback, such as \cite{kammermeier,
kimcolgate,kyung,pacchierotti,meli} among many others and references therein. In many of these reports it has been clearly shown that
combining these modalities leads to a substantial increase in human perception of the unstructured environment. Not only motion but
also temperature can be transmitted via tactile thermal actuators. Hence, different overloadings of the vibrational patterns can also
be obtained. We have to underline that the studies mentioned above do not necessarily promote our reasoning for the control design but
rather the simultaneous superposition of tactile and kinesthetic perception. However, there is no apparent obstacle to using the same hardware
for cooperative kinesthetic profiling.
Therefore, from a control design perspective, this gives rise to a completely different type of performance objective that has no relation
to the immediate transparency requirements. In other words, the performance of the device now comprises the individual excitation
of the required human receptors. This is in line with what we touched upon in the introduction of this thesis. To the critical reader,
this might look like shifting the difficulties of the control design to the hardware design, since the required devices are indeed
nontrivial. We would argue that this is not the case. The required hardware already exists but is used in a different context. We have performed
preliminary signal processing studies, and there is no significant result that can be reported here. Nevertheless, to finalize this discussion,
we can speculate about the links to a plausible solution along these lines.
The immediate possibilities are to separate either the amplitude or the frequency content of the force signal for playback. In the frequency
case, instead of a \enquote{\emph{physics}}-matching goal, we encode the force signal similarly to an audio signal. Then, the
bilateral teleoperation goal is to play back the measured force pattern with different components simultaneously. This has been pursued for
pre-recorded signals in \cite{kuchenbecker} within the concept of \enquote{event-based} haptics. The authors do not use measured
signals but material contact signals from a contact bank or look-up table, yet an increase in perception quality is noted. If we can achieve this separation,
we have the option to separate the bandwidths of the different components and hence make the performance specifications much more
relaxed compared to a \enquote{one device/all frequencies} strategy. Moreover, the high-frequency contact information can be shifted
to the \enquote{tweeter} of the device, and this makes the low-frequency force playback much easier. The low-bandwidth transmission is already
well studied and within reach of current technology. Hence, a standalone high-frequency action can be added on top of the perception,
which would otherwise deteriorate the performance if pursued with the same device. A hard contact can still be displayed with a
relatively compliant \enquote{sub/woofer} and an agile, stiff tweeter. Note that we are good at capturing relative changes but not very good at
perceiving the low end of the spectrum (see \Cref{sec:perf:fidelity}). Moreover, a stiff wall can possibly be rendered up to the perception
threshold via its Fourier components; in other words, a hard-contact perception might be achieved with a combination of mid-to-high frequency
vibrations, contingent upon the task requirements. The force signal can be transmitted and decomposed into frequency bands at the local site,
or directly decomposed and sent over different channels with different lines. This might even encapsulate the model-mediation-like
fast/slow bus discrimination without the need to recreate a proxy virtual environment update (see, e.g., \cite{mitraniemeyer}). Moreover, this
solution already embodies an inherent robustness to packet losses and similar artifacts, as we have no obligation to match the physics any more.
We conjecture that this would not jeopardize the stability but would only degrade the sensation, as we would expect from a noisy telephone
conversation. As a bonus, by transmitting only the low-frequency device motion, the high-frequency content of the human input is
filtered out without any additional phase lag, which can be beneficial for avoiding, e.g., the hand tremor of a surgeon.
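As a purely illustrative sketch of such a split, the snippet below decomposes a sampled force signal into a low-frequency \enquote{sub/woofer} channel and a high-frequency \enquote{tweeter} channel with a complementary first-order filter pair; the sampling rate, crossover frequency, and test signal are hypothetical and not tied to any particular hardware or to the design pursued later in this thesis.
\begin{verbatim}
import numpy as np
from scipy.signal import butter, lfilter

fs = 1000.0                  # hypothetical sampling rate [Hz]
f_cross = 20.0               # hypothetical crossover frequency [Hz]
t = np.arange(0.0, 2.0, 1.0/fs)

# Made-up force measurement: slow pressing motion plus a sharp,
# decaying contact burst at t = 1 s.
force = 5.0*np.sin(2*np.pi*0.5*t) \
        + 2.0*(t > 1.0)*np.exp(-50.0*(t - 1.0))*np.sin(2*np.pi*150.0*t)

# Complementary first-order split: low band to the compliant
# "sub/woofer", high band to the agile "tweeter".
b_lo, a_lo = butter(1, f_cross/(fs/2), btype='low')
b_hi, a_hi = butter(1, f_cross/(fs/2), btype='high')
woofer_cmd  = lfilter(b_lo, a_lo, force)
tweeter_cmd = lfilter(b_hi, a_hi, force)

# The first-order pair is complementary, so the two channels add back
# to (approximately) the original signal.
print("max split/recombine error: %.2e"
      % np.max(np.abs(force - (woofer_cmd + tweeter_cmd))))
\end{verbatim}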
It can be argued whether this would lead to a faithful representation of the remote location, and we can directly see that the answer is no.
However, we are trying to remove precisely that requirement and to put in place a device-dependent, varying degree of realism, instead of working for
stability and neglecting performance.
Without any further evidence, there is not much we can extrapolate; hence we leave this discussion to future work. Our main focus is on
incorporating wavelet decompositions, in particular with respect to Haar bases, for encoding/decoding the contact signals. In what follows,
we will instead use a typical force-error/position-error minimization-based control design and show that we can at least achieve good
robustness properties, with relatively high performance, for a large class of uncertainties due to the human/environment dynamics.