\chapter{Performance Objectives}
\label{chap:perf}
The bilateral teleoperation problem is probably one of the most difficult problems in the control field due to its
subjective nature, involving human comfort and preference. However, to put it bluntly, the experts are not helping
either. In other words, most of the \enquote{good performance} motivations stem from first-principles modeling
rather than from user experience.
Some authors argue that, since most of the tools we utilize are (almost) lossless, say, a
screwdriver or even a simple stick, it is natural to seek a passive bilateral teleoperation system (though, if we
insist on this analogy, a lossless system should be sought) that ideally behaves like a rigid transmission mechanism. In the work of
Daniel and McAree \cite{danielmcaree}, it has been strongly recommended that one should focus completely on the underlying physics and in
fact the authors boldly established an essential limit to what can be achieved by bilateral teleoperation:
\begin{displayquote}[{\cite{danielmcaree}}][.]
Although our motivation comes from the problem of building teleoperators to perform tasks of this sort, there is much here that we feel is
common to all teleoperation. The performance of every force-reflecting system is ultimately governed by dynamic interactions between the
master, the slave, the human operator, and the environment. For some applications different effects may dominate; e.g., transmission delays
limit what can be achieved in space applications of telerobotics ([...]). But one can never achieve better performance than that determined
by rudimentary physics. For this reason, we call the limits of performance examined here \emph{fundamental limits of performance}
\end{displayquote}
This statement summarizes perfectly what the contemporary bilateral teleoperation literature promotes. The reason why we strongly
disagree with it should be evident by the end of this chapter and later in \Cref{chap:application}. Still, to provide a contextual
introduction to the discussion, we first note that the underlying physics does not only consist of two distant robots interacting with
their corresponding surroundings. The actual physics involves the human perception, and that gives us the freedom to display whatever is considered
to be \enquote{cool, nice, crisp, real, helpful} by the human operator. It does not matter if we are off by \SI{10}{\newton} or if some other reflected
quantity is not in accordance with the actual measurement; as long as the human operator is happy with the result in terms of immersion and
touch sensation, we are done. There is nothing fundamental in terms of technological limits of performance, and there is nothing wrong with
approximating (or even altering) reality to achieve the required technology. In fact, in our opinion, this is an instance of a common academic
practice: obfuscation by purism.
In order to justify such fundamental limitations, two ingredients are missing in \cite{danielmcaree}: the precise tools that
do not bring in any lossy simplifications, and the exact mechanism described completely by the actual physics. Unfortunately, their analysis
relies completely on LTI root-locus tools, and human preference is openly skipped. Therefore, their results can at best be the fundamental
limitations of what the literature claims bilateral teleoperation to be.
In general, there is no established consensus on what makes a teleoperation system \enquote{good}. Quite the contrary,
this question is openly and unambiguously avoided in some well-acknowledged articles. Instead, some possibilities are
proposed and rigorously pursued to the end without actually validating whether these possibilities reflect our intention.
Therefore, the conclusions that these studies arrive at are the implications of their initial hypotheses. However,
the studies that follow these publications do not take this crucial detail into account and proceed as if these
performance objectives are indeed the ultimate goals. We have to claim that the motivation of most of the studies
in the literature erroneously puts emphasis on performance objectives that are at best suggestions and are often
questionable.
On the other hand, there is a different school that focuses only on stability of the teleoperation system in the
face of human, environment, communication line, quantization and many more uncertainty/perturbation sources.
Especially some nonlinear control studies do not even bother to define performance criteria. This view simply
regards the human/environment as perturbations for our precious robotic systems and neglects the \enquote{\emph{%
raison d'\^{e}tre}} of the very problem that is under consideration. In our opinion, operator perception is the
indispensable performance objective and cannot be overlooked. But it is also very difficult to quantify. More
importantly, it is beyond the scope and expertise of control theory (though with a certain overlap) to find the relevant
objectives. Experts of the related fields need to contribute from a technological point of view, in
contrast with a purely physiological point of view, and, in fact, should guide the control theorists and practitioners
towards the relevant issues. This lack of performance specifications is the main reason why we have left a considerable
body of research out of our literature survey.
We strongly believe that the contemporary bilateral teleoperation control results, including this thesis,
cannot and thus should not claim a comprehensive understanding of a good and useful bilateral teleoperation
system, because we just do not know, yet.
\section{Types of Performance}
In terms of the quality of the force feedback, there are a few leading choices of methodological performance
definitions. The widely accepted \enquote{transparency} stems from the ideal case of a lossless, undistorted,
exact replica of the remote-side physics at the local site. Hence, the ultimate goal is selected to be
faithfully representing the remote-site motion and allowing the user to intervene just as well as if s/he were operating
directly at the remote site. Let us provide some background on how transparency is often defined and/or motivated. We
choose to follow \cite{hirchebookchap} for no particular reason other than the manuscript being the introduction/survey
chapter of the recent collection \emph{Advances in Telerobotics} (but alternatively \cite{dudragne,yokokohjiyoshikawa,hannaford89} and many
others can be followed too); first, we again recognize the motivation for modeling via $n$-ports
\begin{displayquote}
For the analysis and control synthesis the modeling of the bilateral teleoperation system as interconnection of two-ports, [...], is
convenient.
\end{displayquote}
Then, the intuitive definition of transparency is provided. We draw attention to the
transition from the informal description to the technical formulation of transparency.
\begin{displayquote}[][...]
Transparency of the telerobotic system is the major goal in bilateral control
architecture design.
\begin{define*}[Transparency] The telerobotic system is transparent if the human operator feels as if directly interacting with the (remote)
task [10]\footnote{Reference \cite{rajuverg} of this thesis.}
\end{define*}
Formally, transparency is achieved if the transmitted and the environment impedances match [11]\footnote{Reference \cite{lawrence} of this
thesis.} as also indicated above
\[
Z_t=Z_e
\]
or alternatively if HSI (master) and teleoperator (slave) movements are equal and the force displayed to
the human operator is exactly the reaction force from the environment [12]\footnote{Reference \cite{yokokohjiyoshikawa} of this thesis.}
\[
x_h = x_e\text{ and }f_h = f_e
\]
Transparency in this sense is in practice not achievable as the device dynamics comprising inertia and friction cannot completely
cancelled by control. Communication effects, especially time delay, severely degrade the achievable transparency. The development of
quantitative measures is part of the transparency analysis
\end{displayquote}
This transition appears in practically all documents in the literature that treat transparency as the major goal. In particular, the human
\emph{feel} suddenly disappears from the picture and only the physics between the tips of the robotic devices is considered. We know of no source
in the literature that avoids, or at least mentions, this interesting omission. Let us first give a detailed discussion of transparency to be
able to clarify what exactly is omitted, since there are a few layers of omission in such arguments.
\subsection{Transparency}
Typically, we classify materials as opaque or transparent based on how well we can see through an item made of that
particular material. If we simply replace the act of seeing the other side with touching the remote location, then the
less a system distorts the remote motion transmitted to the local site, the more transparent the system is. Hence the
term \emph{transparency}. This is defined independently in \cite{lawrence,yokokohjiyoshikawa}. In fact, the notion of
transparency is one of the three ideal response definitions in \cite{yokokohjiyoshikawa}. Resuming the notation from
\Cref{chap:litsurvey}, a perfectly or ideally transparent 2-port network admits the hybrid matrix
\[
\pmatr{v_{hum}\\f_{hum}}=\pmatr{0&I\\-I&0}\pmatr{f_{env}\\v_{env}}
\]
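Reading off the rows of this hybrid description, the ideal interconnection simply enforces
\[
v_{hum} = v_{env}, \qquad f_{hum} = -f_{env},
\]
that is, the velocity (hence position) and force matching conditions quoted above, with the sign difference presumably stemming from the power-flow convention adopted for the environment port.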
If we wish to translate this into a control-theoretic performance objective, we have essentially two options.
The first option is to minimize the difference between the measured force/velocity signals of the actual system and the force/velocity signals
that the hypothetical perfectly transparent system would have exhibited in that particular configuration. The second option is to make our system
behave as much as possible like an ideal transparent system.
Let $N$ denote the overall controlled teleoperation system model with the controller $K$, i.e. $N(K)$. Then we can
define a performance index
\[
\min_K \abs{\pmatr{f_{hum}-F_{r}\\f_{env}-F_{l}\\x_{loc}-x_{\vphantom{l}rem}\\\vdots}}
\]
for all the admissible signals of time in some suitable normed space that we wish to consider. Alternatively, using a suitable system norm
the problem becomes
\[
\min_K \abs{N(K) - \pmatr{0&I\\-I&0}}.
\]
Notice that the selection of a suitable norm is far from trivial, and we might even need an amalgam of different norms to bound both the energy
and the maximum value of the error signals. However, this choice is understandably limited by the tools that we have for analysis and
synthesis. Enforcing an ideal teleoperation in this way is obviously intuitive and agrees with the underlying physics, but only if we remove
human perception from the loop. Thus, one can argue that the global minimizer of these optimization problems leads to the best
teleoperation system only if we exclude the operator's opinion. Moreover, there are two additional implicit assumptions made here. On the one hand,
it is assumed that there is a partial ordering; in other words, if $K_1$ has the cost $c_1$ and $K_2$ has the cost $c_2$ with $c_1<c_2$,
then $K_1$ is better than $K_2$, which is not necessarily true or, rather, might hold only for some particular $K$'s.
In general, no such ordering can be expected from this performance index. Moreover, it is not easy to search for such a $K$. As we have observed
in the literature, almost every transparency-optimized control method selects the full performance first, hence assuming the global minimizer,
and then tries to stabilize the system in the face of those performance specifications. Obviously, after adding dampers or other dissipative
elements, the system is no longer a transparent teleoperation system and, moreover, we do not have a way to measure how far we are off from the
initial perfectly transparent system, since we only know what is ideal.
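To make the second formulation more concrete, the sketch below numerically estimates the distance $\sup_\omega\bar{\sigma}\big(N(\iw)-J\big)$ between a toy, hypothetical closed-loop hybrid model $N$ and the ideal interconnection $J$ on a frequency grid; the particular transfer functions are made up purely for illustration and do not correspond to any device treated in this thesis.
\begin{verbatim}
import numpy as np

# Hypothetical 2x2 closed-loop hybrid model N(s) evaluated at s = jw.
# The entries are toy first-order responses chosen only to illustrate
# the computation; they model no particular device.
def N(jw):
    return np.array([[0.05*jw/(jw + 20.0),   1.0/(0.01*jw + 1.0)],
                     [-1.0/(0.02*jw + 1.0),  0.1/(jw + 10.0)]])

J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # ideal transparent 2-port

w = np.logspace(-1, 3, 2000)              # frequency grid [rad/s]
# largest singular value of N(jw) - J at each grid point
dist = [np.linalg.svd(N(1j*wk) - J, compute_uv=False)[0] for wk in w]

print("grid estimate of sup_w sigma_max(N - J): %.3f" % max(dist))
\end{verbatim}
Such a grid evaluation only lower-bounds the induced norm, which already hints at the computational subtleties of the norm selection discussed above.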
On the other hand, we do not have a metric for how close we need to get to the ideal matrix. Let us first quote three
very important questions posed by Lawrence in his well-known paper \cite{lawrence}:
\begin{displayquote}[{\cite{lawrence}}][.]
In practice, perfectly transparent teleoperation will not be possible. So it makes sense to ask the following questions:
\begin{itemize}
\item What degree of transparency is necessary to accomplish a given set of teleoperation tasks?
\item What degree of transparency is possible?
\item What are suitable teleoperator architectures and control laws for achieving necessary or optimal transparency?
\end{itemize}
We focus on the second two questions in this paper. Instead of evaluating the performance of a specific teleoperation architecture,
as in [2], we seek to understand the fundamental limits of performance and design trade-offs of bilateral teleoperation in
general, without the constraints of a preconceived architecture
\end{displayquote}
Lawrence then invokes the passivity assumption on the end terminations, and hence the passivity theorem is utilized to arrive at structural
properties of the controller $K$. Evidently, this allows for the back-substitution of the controller entries and a solution
for the ideal case. The resulting controller is then denoted the \enquote{\emph{Transparency Optimized Controller}}. In
control jargon, this amounts to a cross-coupling control action where the bilateral dynamical differences
are canceled out and the SISO control channels are then tuned to maximum performance, subject to the stability constraints.
Even if we accept this to be the distinguishing performance criterion, we have to emphasize that we have not touched the most
important question, that is, the first of the three; rather, we hope to achieve transparency levels just high enough
to fool the user. After two decades, this point is simply discarded and many studies in the literature
treat the conclusions of Lawrence in a different context than the one given by Lawrence. As is the case for
Hogan's paper on passivity, Lawrence never claims that this is a definitive performance measure. Instead, he clearly shows the
implications that follow from such assumptions.
Finally, there are interesting studies in line with our claims about the irrelevance of remote-media recreation in the bilateral
teleoperation problem. For example, \cite{kilchenman,wildenbeest,boessenkool} and a few other studies report that there is a saturation
effect on how much realism can be projected to the user. In other words, there is an inherent bandwidth limitation on
the realism increase, such that beyond a certain frequency band the transparency does not increase significantly, possibly
unless backed up by tactile feedback. Even further, in the case of shared control applications, it might happen that transparency
is not needed at all.
\subsection{\texorpdfstring{$Z$}{Z}-width}
In \cite{colgate4}, the performance of a haptic device is related to the dynamic range of impedances (hence the name $Z$) that
the device can display to the user. In this context we have two extremes: on the one hand, purely the local device impedance
for free-air motion and, on the other hand, the maximally stiff local device for collision with a rigid and immobile obstacle.
Let $Z_f$, $Z_c$ denote these two distinct cases. Then, the more pronounced the difference between these impedances, the more
capable the teleoperation system is of reflecting the various impedances in between. Thus, we implicitly assume that the rigid contact
case and the free-air case are the extreme points of the uncertainty set, and that testing these two cases is sufficient to
conclude that any impedance on the path from $Z_f$ to $Z_c$ is a valid impedance that can be displayed by the device. This
in turn implies that there is an ordering in the uncertainty set from \enquote{big} to \enquote{small}, and moreover that
the destabilizing uncertainty is at the boundary of the set, such that these two extreme cases can vouch for stability over
all possible environments. We are not convinced that this should be the case for all possible environment
scenarios. A particular subset of second-order mass-spring-damper environment models can be shown to be compatible
with this claim if the passivity theorem is used. However, when combined with other uncertain blocks in the loop, we do not see
how the argument follows. Note that it is well known in the robust control literature that a destabilizing uncertainty need
not lie on the boundary of the uncertainty set. Therefore, either by gridding the uncertainty set and testing the
stability conditions on a large number of points, or by a suitable relaxation of the constraints, the conditions should be
translated into finitely many (and computationally tractable) tests such that stability is guaranteed over the whole
uncertainty set.
Similar to what Lawrence has given, the authors also include a clear statement of purpose:
\begin{displayquote}[{\cite{colgate4}}][.]
This paper will not address the psychophysics of what
makes a virtual wall \enquote{feel good} except to say that one
important factor seems to be dynamic range. An excellent
article on this topic has recently been written by
Rosenberg and Adelstein [11]\footnote{Reference \cite{rosenberg} of this thesis.}.
We will present instead
some of our findings, both theoretical and experimental,
concerning achievable dynamic range. In short, we will
address the question of how to build a haptic interface
capable of exhibiting a wide range of mechanical
impedances while preserving a robust stability property
\end{displayquote}
Under these assumptions, by defining a functional to measure the distance between $Z_f$ and $Z_c$, we can assess the performance
of different bilateral teleoperation devices. In \cite{goranthesis}, this so-called $Z$-width is defined as
\begin{equation}
Z_{\text{width}} = \int_{\omega_0}^{\omega_1}{\abs{\log|Z_{\vphantom{f}c}(\iw)|-\log |Z_f(\iw)|}}d\omega
\label{eq:zwidth}
\end{equation}
or alternatively, a simulation/experiment-based method can be utilized as in \cite{weir}.
Note that \eqref{eq:zwidth} does not appear in the original paper \cite{colgate4} but is proposed in \cite{goranthesis,passenberg},
though we can see neither the reasoning behind this expression nor how it constitutes a comparative quantity. In both
\cite{colgate4,goranthesis}, no additional information is provided except some general rules of thumb about device
damping and other related issues.
It should be noted that the differences at each frequency are lumped into one scalar number and, moreover, the impedance
gain curves can cross each other (see \cite{goranthesis}), which might lead to an overly optimistic result. Similarly,
resonance peaks and zeros of the involved impedances can be smeared out if we rely solely on this functional.
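For illustration, a minimal numerical sketch of \eqref{eq:zwidth} is given below for two made-up impedances: a free-air case modeled as a pure inertia and a contact case modeled as a stiff spring-damper. The parameter values and the integration band are hypothetical and serve only to show how the frequency-wise differences are lumped into a single scalar.
\begin{verbatim}
import numpy as np

# Hypothetical impedances (force over velocity), for illustration only:
#   free air : pure inertia        Z_f(jw) = m*jw
#   contact  : spring plus damper  Z_c(jw) = k/(jw) + b
m, k, b = 0.1, 5000.0, 20.0
Zf = lambda w: m*1j*w
Zc = lambda w: k/(1j*w) + b

w = np.logspace(0, 3, 4000)      # integration band [1, 1000] rad/s
diff = np.abs(np.log(np.abs(Zc(w))) - np.log(np.abs(Zf(w))))

# trapezoidal quadrature of (eq:zwidth)
z_width = np.sum(0.5*(diff[1:] + diff[:-1])*np.diff(w))
print("Z-width over [1, 1000] rad/s: %.1f" % z_width)
\end{verbatim}
Note that the absolute value in the integrand hides the frequency at which the two gain curves cross, which is precisely the source of the optimism mentioned above.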
Since $Z_f$ and $Z_c$ are functions of the environment impedance, these curves can be obtained for one particular environment
at a time. This also holds for the derivation of \cite{lawrence}. In \cite{goranthesis}, the difference is evaluated for more
than one environment and then averaged, i.e., let $Z_{act}(Z_e)$ be the impedance displayed to the user in order to render
$Z_e$ on the local site. Then, for a particular controller, the average $Z$-error over the candidate $Z_e$'s is given by
\begin{equation}
Z_{avgerr} = \frac{1}{n}\sum_{j=1}^n{\left[
\frac{1}{\omega_{1j}-\omega_{0j}}\int_{\omega_{0j}}^{\omega_{1j}}{%
\abs{\log|(Z_{act}(Z_{ej}))(\iw)|-\log |Z_{ej}(\iw)|}}d\omega.
\right]}
\label{eq:zdiff}
\end{equation}
This cost function is denoted by \enquote{\emph{Transparency Error}} or \enquote{\emph{Fidelity}}. We refer to \cite{weir}
for a more detailed discussion.
\subsection{Fidelity}\label{sec:perf:fidelity}
In \cite{cavusoglu}, a variant of the transparency error is proposed to assess the performance. In this context,
the emphasis is on the variation of the environment impedance and the resulting effect on the displayed impedance.
The motivation is focused on surgical procedures performed via bilateral teleoperation. If, for
example, the remote device slides over some tissue that contains a tumor or any other irregularity that would be felt
had the same motion been performed directly by the surgeon, then the better the nuances are transmitted, the higher
the fidelity. This performance objective, in a sense, emphasizes the high-frequency content of the information (closer
to the tactile bandwidth). It has been noted that the Just Noticeable Difference (JND) of \SIrange{14}{25}{\percent} for
distinguishing the relative compliance of similar surfaces goes under \SI{1}{\percent} for rapid compliance-variation detection
while, say, scanning a surface (see \cite{dhruvtendick} or \cite{greenishhayward} for a haptic scissors analysis). Similar to the
definitions given for transparency, the change of the displayed impedance $Z_{disp}(Z_e)$ with respect to the change in the environment
$Z_e$ can be obtained via a straightforward calculation.
Consider again the system interconnection depicted in \Cref{fig:nom_net}, with the scalar complex LTI uncertainty block
$\Delta_e$ and the LTI plant $G\in\mathcal{RH}_\infty^{2\times 2}$. The upper LFT interconnection of $\Delta_e$ and $G$ is given by
\[
P = G_{11}+G_{12}\Delta_e\inv{(1-G_{22}\Delta_e)}G_{21}
\]
Here $P$ denotes the impedance seen by the operator, $G$ the teleoperation system, and $\Delta_e$ the
environment impedance. Now, under a well-posedness assumption, define the derivative with respect to $\Delta_e$,
\[
\frac{d}{d\Delta_e} P = G_{12}(1-G_{22}\Delta_e)^{-2}G_{21}
\]
then, though not pursued in \cite{cavusoglu} and left as a complication, this can in turn be rewritten as an LFT:
\[
\pmatr{q_1\\q_2\\z} = \pmatr{G_{22}&&\\&G_{22}&\\&&G_{12}}\pmatr{1&0&1\\1&1&1\\1&1&1}\pmatr{1&&\\&1&\\&&G_{21}} \pmatr{p_1\\p_2\\w}
\]
and
\[
\pmatr{p_1\\p_2} = \pmatr{\Delta_e &0\\0 &\Delta_e}\pmatr{q_1\\q_2}.
\]
This representation can simply be read from the interconnection depicted below:
\[
\begin{tikzpicture}[>=stealth,baseline]
\node[draw] (g12) at (0,0) {$G_{21}$};
\node[circle,draw,right= 5mm of g12,inner sep=1pt] (sum1) {};
\coordinate[right=2cm of sum1] (j1);
\node[draw,below=2mm of j1] (g22) {$G_{22}$};
\node[draw,below left=2mm and 2mm of g22.south west] (d1) {$\Delta_{e}$};
\draw[->] (j1) -- (g22) |- (d1) node[pos=0.5,below] {$q_1$}-| (sum1) node[pos=0.5,below] {$p_1$};
\node[circle,draw,right= 5mm of j1,inner sep=1pt] (sum2) {};
\coordinate[right=2cm of sum2] (j2);
\node[draw,below=2mm of j2] (g22) {$G_{22}$};
\node[draw,below left=2mm and 2mm of g22.south west] (d2) {$\Delta_{e}$};
\draw[->] (j2) -- (g22) |- (d2) node[pos=0.5,below] {$q_2$}-| (sum2) node[pos=0.5,below] {$p_2$};
\node[draw,right= 7mm of j2] (g21) {$G_{12}$};
\draw (g12) --(sum1) -- (sum2) -- (g21);
\draw[<-] (g12) --++(-1cm,0) node[left]{$w$};
\draw[->] (g21) --++(1cm,0) node[right]{$z$};
\end{tikzpicture}
\]
Note that the matrix case follows similar but more involved steps via the Fr\'{e}chet derivative. Moreover, one can now
recognize the familiar plant-uncertainty representation more clearly and without any complication.
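As a quick sanity check, the symbolic sketch below verifies that the interconnection above, with its two copies of $\Delta_e$, indeed reproduces $G_{12}(1-G_{22}\Delta_e)^{-2}G_{21}$ from $w$ to $z$; the scalar symbols are generic placeholders for the transfer functions evaluated at a fixed frequency.
\begin{verbatim}
import sympy as sp

# Generic scalar placeholders for G12, G21, G22 and Delta_e at a
# fixed frequency, plus the external input w.
G12, G21, G22, D = sp.symbols('G12 G21 G22 Delta')
w, p1, p2 = sp.symbols('w p1 p2')

# Loop equations read off the block diagram:
#   q1 = G22*(p1 + G21*w),  q2 = G22*(p1 + p2 + G21*w),  p_i = Delta*q_i
sol = sp.solve([sp.Eq(p1, D*G22*(p1 + G21*w)),
                sp.Eq(p2, D*G22*(p1 + p2 + G21*w))], [p1, p2], dict=True)[0]

z = G12*(sol[p1] + sol[p2] + G21*w)       # output of the interconnection
claimed = G12*G21*w/(1 - G22*D)**2        # (dP/dDelta_e) acting on w

print(sp.simplify(z - claimed))           # prints 0
\end{verbatim}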
Consequently, the authors define a transparency-like performance objective using a rather subtle choice of system $2$-norms.
The measure of fidelity is defined as the norm
\[
\norm{\left.W_s\frac{dP}{d\Delta_e}\right|_{\Delta_{enom}}}_2
\]
where $W_s$ is a typically low-pass weighting function emphasizing the frequency band of interest. Therefore,
the synthesis problem is to find the optimizing controller $K$ for the problem
\[
\sup_{\substack{\text{Stability}\\\text{Other Constraints}}}\inf_{\Delta_{ei}\in\bm{\Delta_e}}
\norm{\left.W_s\frac{dP}{d\Delta_e}\right|_{\Delta_{ei}}}_2
\]
where $\Delta_{ei}$ are the worst case environments that are of interest.
\begin{define*}[System $2$-norm]Let $H$ be a stable strictly proper LTI system with transfer matrix $H(s)$. Then
\[
\norm{H}_2^2 \coloneqq \infint{\trace(H^*(\iw)H(\iw))}d\omega
\]
\end{define*}
There are a few interpretations of this norm in the literature, mainly the deterministic \enquote{area under the Bode
plot} interpretation, i.e., the energy of the impulse response in the scalar case, and the stochastic \enquote{steady-state
white-noise-input response}. We are under the impression that the authors argue along the lines of the former interpretation,
with a reasoning similar to that given in the $Z$-width discussion via an area computation.
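To connect the two interpretations, the short sketch below numerically checks, for the toy first-order system $H(s)=1/(s+1)$, that the frequency-domain integral above (with the conventional $1/2\pi$ normalization assumed here) coincides with the energy of the impulse response $h(t)=e^{-t}$, both evaluating to $1/2$.
\begin{verbatim}
import numpy as np
from scipy.integrate import quad

# Toy system H(s) = 1/(s+1): |H(jw)|^2 = 1/(1+w^2), impulse response e^{-t}
freq_int, _ = quad(lambda w: 1.0/(1.0 + w**2), -np.inf, np.inf)
h2_sq_freq = freq_int/(2*np.pi)     # assumed 1/(2*pi) normalization

h2_sq_time, _ = quad(lambda t: np.exp(-2.0*t), 0.0, np.inf)

print("frequency-domain H2 norm squared: %.4f" % h2_sq_freq)   # 0.5000
print("impulse-energy   H2 norm squared: %.4f" % h2_sq_time)   # 0.5000
\end{verbatim}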
Designing a robust controller that minimizes the $\mathcal{H}_2$ norm of an uncertain system in the face of a predefined
uncertainty set, i.e., the \enquote{Robust $\mathcal{H}_2$ Synthesis} problem, has already received a lot of attention and
results can be found in the literature, e.g., \cite{dullerud}. Hence, the problem definition in \cite{cavusoglu} is
in fact tractable. However, it is not clear to us why the system $2$-norm is chosen for the performance cost. Additionally,
the infimum needs to be computed over a set of infinitely many points; hence an appropriate relaxation is required.
This point is also not addressed, though a gridding approach seems to be utilized in the numerical optimization procedure
described in the paper.
It is also not clear when we should utilize this performance objective. The initial difficulty is that all the
involved operators are LTI, hence there is no time variation involved. The test reads as follows: \emph{we assume an arbitrary element
in the predefined uncertainty set, say $\Delta_e$, and evaluate the norm of the weighted derivative at $\Delta_e$ (assuming its
existence); hence, in some $\epsilon$-neighborhood of $\Delta_e\in\bm{\Delta_e}$ we can see the change in $P$}. Thus, around this environment
candidate, this result tells us how much the fidelity measure would change. Performing the same check for all elements in the uncertainty
set, we find the lowest fidelity value and, by taking the supremum over all stabilizing controllers, we try to increase this value globally.
But the environment is still assumed to be LTI and hence cannot change. Therefore, for each fixed environment we obtain different structural
properties of the system.
Note that this does not imply that time variations are taken into account. Suppose a particular admissible trajectory
$\hat{\Delta}_e(t)$ in time is given such that $\hat{\Delta}_e(t_1)=\Delta_e$, i.e., its time-frozen LTI copy coincides
with the particular nominal environment model $\Delta_e$, and at some time instant $t_2$ it coincides with another
LTI model $\mathring\Delta_e$ that is within some $\epsilon$-neighborhood of $\Delta_e$. Even if we achieve very good
fidelity properties evaluated at each $\Delta_e$ and at a sequence of LTI models, each lying in a small neighborhood
of the other, this does not guarantee that we would have good fidelity along the trajectory $\hat{\Delta}_e$. Actually,
it might be more desirable to have low fidelity, since drastic changes in the performance with respect to LTI uncertainties
might confuse the user.
The same misconception is also visible in the transparency formulations in the literature. Evaluating the LTI expressions at
nominal environments, e.g., $Z_e=0,\infty$, does not imply that the transitions are covered.
\section{Closing Remarks and Discussion}
There are a few other performance criteria reported in the literature. Consider the definition of the impedance seen by
the operator, $P$, above. In \cite{iida,katsura}, this term is divided into two individual terms, denoted
\enquote{reproducibility} and \enquote{operationality}. The idea is similar to the sensitivity/complementary sensitivity
function definitions.
In \cite{yokokohjiyoshikawa}, an ideal response is also partitioned into two parts, denoted the \enquote{index of
maneuverability}, which is, in essence, similar to what is given above and is hence omitted.
We refer to the survey papers \cite{hokayemspong,passenberg} for a general treatment, and to \cite{klomp,dennis} and the references
therein for a more detailed overview of the many variations in the literature.
In summary, there are no general performance criteria that can lead to a dedicated control design procedure. The aforementioned
performance objectives always start from the direct manipulation case and assume a distance between the interacting bodies.
Then the implications of such hypotheses are pursued and some results are obtained. It might very well happen that all or none
of those conclusions are correct. Put differently, these studies always try to remedy the distortion caused by the split of two
interacting bodies, i.e., teleoperation. Thus, the goal becomes too ambitious at the outset. Similar to the delay phenomenon,
there is not much we can do about the distortion within the laws of physics. In fact, even the slightest delay can destabilize
the system, which is again an indicator of the fragility of the problem formulation.
Thus, we can speculate that, by doing so, we create a stability problem that we should not have had in the first place. Moreover,
as we have mentioned in the introduction, the problem is exclusively about human perception and is not about the reconstruction
of the remote scene. As long as we can \enquote{fool} the user for the sake of efficiency and operational comfort, we are done.
Since we fail to provide an alternative performance criterion, we resort to a particular analogy from audio
technology to express more clearly what we intend to emphasize.
When it comes to the faithful reconstruction of recorded audio, contemporary high-end sound systems offer great fidelity, hence
the name hi-fi systems\footnote{The term was coined well before the systems became truly hi-fi compared to today's systems.}. There has
been such great success that listeners and component manufacturers are now striving for full audio immersion, i.e., listening to
a recording that feels like actually sitting in the concert hall or venue. However, very similar to the transparency discussion
in bilateral teleoperation, there is a fundamental obstacle to rendering a live performance sound with its recorded version.
The following quote is from \cite{atkinson}:
\begin{displayquote}
\enquote{What on earth can be the readily identifiable difference}, I wrote in 1995, \enquote{between the sound of a loudspeaker
producing the live sound of an electric guitar and that same loudspeaker reproducing the recorded sound of an electric guitar?}
I went on to conjecture that the act of recording inevitably diminishes the dynamic range of the real thing. The in-band phase
shift from the inevitable cascade of high-pass filters that the signal encounters on its passage from recording microphone to
playback loudspeakers smears the transients that, live, the listener perceives in all their spiky glory. And as a high-pass filter
is never encountered with live acoustic music, that's where the essential difference must lie, I concluded, quoting Kalman Rubinson
(...) that \enquote{Something in Nature abhors a capacitor.}
But two more recent experiences suggest that there must be more to the difference than the presence of unnatural high-pass filters.
(...)
\end{displayquote}
The author goes on to list the involved hardware, the signal chain, and other relevant details about the microphone set-up at both events.
In the first event, a performer plays a piece through the listed hardware and is simultaneously recorded by the author. Then the recorded
version of the performance is played back to the same audience through the same hardware. In the second event, a nontrivial analog/%
digital hybrid device, which supposedly replicates a grand piano via sophisticated mechanisms, is used to generate the sound. In both
events, it has been noted that, though the reproduction quality was quite impressive for the audience, a certain liveness was missing.
\begin{displayquote}
So these days, I'm starting to feel that it is something that is never captured by recordings at all that ultimately defines the
difference between live and recorded sound. (...)[the described systems] succeeded in
every sonic parameter but one: the intensity of the original sound. Intensity, defined as the sound power per unit area of the
radiating surface, is the reason why, even if you could equalize a note played on a flute to have the same spectrum as the note
played on a piano at the same sound pressure level, it will still sound different.
Ultimately, therefore, it is perhaps best to just accept that live music and recorded music are two different phenomena. (...)
Eisenberg's thesis\footnote{See \cite{eisenberg}} is that any attempt to capture the sound of an original event is doomed to
failure, and that stripping a concert from its cultural context by recording only the audio bestows a sterility on the result
from which it cannot escape. The recording engineer may be able to pin the butterfly to the disc, but it sure doesn't fly any
more.(...)
In Eisenberg's words, \enquote{In the great majority of cases, there is no original musical event that a record records or reproduces.
Instead, each playing of a given record is an instance of something timeless. The original musical event never occurred; it exists, if
it exists anywhere, outside history.}
\end{displayquote}
Obviously, these are all subjective opinions rather than rigorous scientific propositions, though the first anecdote can be considered
a user experience study. However, we have to note that audio technology is tremendously advanced compared to haptics and
teleoperation. In fact, the comparison is not fair in the sense that bilateral teleoperation is not a true technology yet but rather
is in its infancy. Still, after decades of improvements, sound systems are not capable of producing a live sound real enough to fully
immerse the listener, though they come impressively close. The reader should also keep in mind that audio technology is a unilateral
process with no interaction with the loudspeaker, and yet it still lacks sufficient realism. Nevertheless,
the performance objective studies, of which we have enumerated a few above, claim to compete at the level of hi-fi systems, which is simply too
ambitious.
Coming back to our discussion, in the light of our analogy, we think that the bilateral teleoperation literature is focused on finding
the system that can deliver the \enquote{live sound} rather than a high-quality \enquote{playback}. This holistic search is certainly
relevant to the field, but it cannot serve as the justification for the technological advances reported in the literature. Task-%
dependence is already emphasized in many studies as an item of importance, and a too-general performance criterion would be very unlikely
to serve as a general guideline. Though we acknowledge the motivation behind the holistic approach, and a truly transparent device
might be the ideal, we also believe that the timing and feasibility of this approach need to be reconsidered. The immediate engineering
problems, such as communication delays and other contemporary technological issues, are strong indicators that such
a goal cannot be tackled prematurely. We cannot overemphasize the key issue: even the undelayed case remains unresolved, let alone
the (time-varying or constant) delayed case.
Again from our analogy, it took decades for hi-fi systems to reach the current level at which a search for the live sound can be
justified. The sound reconstruction task is divided into components such as amplifiers, pre-amplifiers,
direct digital-to-analog converters, etc. for the signal conditioning and, similarly, the sound regeneration is divided into active/passive
loudspeakers with dedicated single or multiple tweeters, sub-woofers, etc. Only then was the community convinced that the hardware
is not the problem\footnote{We also have to state that there is an additional compulsive habit of overemphasizing component quality, such as that of the transmission cables.
Hence, we can observe a trend among hi-fi enthusiasts of picking up artifacts that cannot possibly be audible or simply do not exist.}.
Similarly, TVs and other futuristic vision technologies are following the same trend towards the ultimate vision quality. However, compared
with these, bilateral teleoperation definitions are nothing but academic stability-problem exercises. Moreover, as we show in the next
chapter, these problems, whether linear or nonlinear, are no different from mainstream control problems in disguise. Therefore, we
cannot yet argue for a dedicated stability and performance problem for bilateral teleoperation.
This is the reason why we have chosen a significantly more advanced methodology, again from mainstream control theory, and applied
it to the bilateral teleoperation problem. This does not imply that we have offered a valid alternative; in fact, quite to the contrary, our goal
is to make it obvious that the studies so far can be subsumed into a general and widely used methodology and that there is no benefit to
the current specialization of the field. However, we have also shown that, if the problem is in fact a special case of a general control problem,
there is no need to use outdated versions of the techniques proposed in the control literature while important advances over the plain
small-gain/passivity-theorem-based results have been reported in the past two decades.
A search over the studies published in the literature from the 2000s to date with \enquote{bilateral teleoperation} and
\enquote{delay} as keywords gives hundreds of results. Yet we have no clear understanding of why these devices are unstable. We
can pinpoint different effects depending on whether we look at it from an energy exchange/passivity point of view or from a
sensitivity function-based analysis. But this does not help us to prioritize certain design aspects of a high-performance system,
and unfortunately we have to assume a few questionable hypotheses along the way. It is very difficult to follow the train of thought
often given in the literature: we first define the ultimate performance of a teleoperation system, then openly accept the fact that
this is not achievable; however, we do not, in turn, modify our performance criteria and then completely neglect the issue. Finally, we
convert the problem into the stability of some interconnected devices with great uncertainty associated with them and then, in the majority
of cases, inject damping into the hardware, which deteriorates not only the relative performance but the usability of the device as a whole.
Specific to the problem at hand, a stable but poor-performing bilateral teleoperation has less functionality than a
unilateral teleoperation, as the added value of the robotic manipulators is wasted by the inclusion of damping.
After using the audio analogy extensively, let us finish the alternative suggestions with the same analogy. As we have briefly mentioned,
hi-fi loudspeakers involve dedicated components for different frequency bands, such as tweeters for the high-frequency band,
sub/woofers for the low-frequency bands, and occasionally mid-range speakers, etc., with individual drivers. Even though different components are
utilized, the resulting harmony of these components leads to a very satisfactory listening experience compared to generic single-driver
loudspeakers. Now, the somatosensory system obviously already involves such sensors, which are reported to be responsible for different
frequency bands (see \Cref{chap:apdxphysio}). Hence, this already gives us a direct cue for separating the motion into different categories
in terms of frequency/amplitude content. We do not have a working methodology yet; however, we would like to include our reasoning here
for comparison.
There have been quite a number of studies published on combining tactile feedback with kinesthetic feedback, such as \cite{kammermeier,
kimcolgate,kyung,pacchierotti,meli} among many others and references therein. In many of these reports it has been clearly shown that
combining these modalities leads to a substantial increase in human perception of the unstructured environment. Not only motion but
also temperature can be transmitted via tactile thermal actuators. Hence, different overloadings of the vibrational patterns can also
be obtained. We have to underline that the studies mentioned above do not necessarily promote our reasoning for the control design but
rather the simultaneous superposition of tactile and kinesthetic perception. However, there is no apparent obstacle to using the same hardware
for cooperative kinesthetic profiling.
Therefore, from a control design perspective, this gives rise to a completely different type of performance objective that has no relation
to the immediate transparency requirements. In other words, the performance of the device now comprises the individual excitation
of the required human receptors. This is in line with what we touched upon in the introduction of this thesis. To the critical reader,
this might look like shifting the difficulties of the control design to the hardware design, since the required devices are indeed
nontrivial. We would argue that this is not the case. The required hardware already exists but is used in a different context. We have performed
preliminary signal processing studies, and there is no significant result that can be reported here. Nevertheless, to finalize this discussion,
we can speculate about the links to a plausible solution along these lines.
The immediate possibilities are to separate either the amplitude or the frequency content of the force signal for playback. In the frequency
case, instead of a \enquote{\emph{physics}}-matching goal, we encode the force signal similarly to an audio signal. Then, the
bilateral teleoperation goal is to play back the measured force pattern with different components simultaneously. This has been pursued for
pre-recorded signals in \cite{kuchenbecker} within the concept of \enquote{event-based} haptics. The authors do not use measured
signals but material contact signals from a contact bank or look-up table, yet an increase in perception quality is noted. If we can achieve this separation,
we have the option to separate the bandwidths of the different components and hence make the performance specifications much more
relaxed compared to a \enquote{one device/all frequencies} strategy. Moreover, the high-frequency contact information can be shifted
to the \enquote{tweeter} of the device, and this makes the low-frequency force playback much easier. The low-bandwidth transmission is already
well studied and within reach of current technology. Hence, a standalone high-frequency action can be added on top of the perception,
which would otherwise deteriorate the performance if pursued with the same device. A hard contact can still be displayed with a
relatively compliant \enquote{sub/woofer} and an agile, stiff tweeter. Note that we are good at capturing relative changes but not very good at
perceiving the low end of the spectrum (see \Cref{sec:perf:fidelity}). Moreover, a stiff wall can possibly be rendered up to the perception
threshold via its Fourier components; in other words, a hard-contact perception might be achieved with a combination of mid-to-high frequency
vibrations, contingent upon the task requirements. The force signal can be transmitted and decomposed into frequency bands at the local site,
or directly decomposed and sent over different channels with different lines. This might even encapsulate the model-mediation-like
fast/slow bus discrimination without the need to recreate a proxy virtual environment update (see, e.g., \cite{mitraniemeyer}). Moreover, this
solution already embodies an inherent robustness to packet losses and similar artifacts, as we have no obligation to match the physics any more.
We conjecture that this would not jeopardize the stability but would only degrade the sensation, as we would expect from a noisy telephone
conversation. As a bonus, by transmitting only the low-frequency device motion, the high-frequency content of the human input is
filtered out without any additional phase lag, which can be beneficial for avoiding, e.g., the hand tremor of a surgeon.
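As a purely illustrative sketch of such a split, the snippet below decomposes a sampled force signal into a low-frequency \enquote{sub/woofer} channel and a high-frequency \enquote{tweeter} channel with a complementary first-order filter pair; the sampling rate, crossover frequency, and test signal are hypothetical and not tied to any particular hardware or to the design pursued later in this thesis.
\begin{verbatim}
import numpy as np
from scipy.signal import butter, lfilter

fs = 1000.0                  # hypothetical sampling rate [Hz]
f_cross = 20.0               # hypothetical crossover frequency [Hz]
t = np.arange(0.0, 2.0, 1.0/fs)

# Made-up force measurement: slow pressing motion plus a sharp,
# decaying contact burst at t = 1 s.
force = 5.0*np.sin(2*np.pi*0.5*t) \
        + 2.0*(t > 1.0)*np.exp(-50.0*(t - 1.0))*np.sin(2*np.pi*150.0*t)

# Complementary first-order split: low band to the compliant
# "sub/woofer", high band to the agile "tweeter".
b_lo, a_lo = butter(1, f_cross/(fs/2), btype='low')
b_hi, a_hi = butter(1, f_cross/(fs/2), btype='high')
woofer_cmd  = lfilter(b_lo, a_lo, force)
tweeter_cmd = lfilter(b_hi, a_hi, force)

# The first-order pair is complementary, so the two channels add back
# to (approximately) the original signal.
print("max split/recombine error: %.2e"
      % np.max(np.abs(force - (woofer_cmd + tweeter_cmd))))
\end{verbatim}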
It can be argued whether this would lead to a faithful representation of the remote location, and we can directly see that the answer is no.
However, we are trying to remove precisely that requirement and to put in place a device-dependent, varying degree of realism, instead of working for
stability and neglecting performance.
Without any further evidence, there is not much we can extrapolate; hence we leave this discussion to future work. Our main focus is on
incorporating wavelet decompositions, in particular with respect to Haar bases, for encoding/decoding the contact signals. In what follows,
we will instead use a typical force-error/position-error minimization-based control design and show that we can at least achieve good
robustness properties, with relatively high performance, for a large class of uncertainties due to the human/environment dynamics.