
S1 Cost function


In kaska_sar.py there is a first implementation of the S1 inversion mechanism. This file uses the Water Cloud Model (WCM) and its associated cost function, which are defined in watercloudmodel.py. We provide functions for the WCM, its Jacobian and its Hessian, and similarly for the cost function, its Jacobian and its Hessian. This means that we can exploit efficient Newton methods to minimise the cost.
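To illustrate how a cost with analytic derivatives plugs into a Newton-type solver, here is a minimal, self-contained sketch using a stand-in quadratic cost; the real cost, Jacobian and Hessian come from watercloudmodel.py, and the names below are placeholders:

import numpy as np
import scipy.optimize

# Stand-in quadratic cost with analytic Jacobian and Hessian, purely to
# illustrate passing derivative information to a Newton-type method.
def cost(x):
    return 0.5 * np.sum(x ** 2)

def cost_jac(x):
    return x

def cost_hess(x):
    return np.eye(x.size)

x0 = np.array([3.0, -2.0])
retval = scipy.optimize.minimize(cost, x0, jac=cost_jac, hess=cost_hess,
                                 method="Newton-CG")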

Solving strategy

For each polarisation and each time step, the WCM has 7 parameters: theta, A, B, C, V1, V2 and sigma_soil. theta is known (it is the incidence angle), so we do not solve for it. The remaining six are all parameters to be solved for. In general, these six parameters will differ between the VV and VH polarisations, although some simplifications can be made:

  1. Assume V1 and V2 are common to both VV and VH and relate them to e.g. LAI or green LAI.
  2. Assume sigma_soil is common to both VV and VH, and that its variation is accounted for by Cvv and Cvh.
  3. Over a time window, we can assume that Axx, Bxx and Cxx (xx={VV, VH}) are constant (or at least, slowly varying).

As a first pass, assume V1 and V2 are given by the optical LAI, and that we do not solve for them. Also assume that we only solve for Axx, Bxx, Cxx and sigma_soil[1..n_obs].
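For reference, a minimal sketch of a textbook WCM formulation (after Attema & Ulaby, 1978); the exact parameterisation used in watercloudmodel.py may differ in detail:

import numpy as np

def wcm(A, V1, B, V2, C, sigma_soil, theta):
    mu = np.cos(np.radians(theta))
    tau = np.exp(-2.0 * B * V2 / mu)   # two-way canopy attenuation
    veg = A * V1 * mu * (1.0 - tau)    # canopy volume scattering
    soil = tau * (C + sigma_soil)      # attenuated soil term; C carries the
                                       # per-polarisation soil behaviour
    return veg + soil

At low LAI, tau -> 1 and the backscatter tends to C + sigma_soil; at high LAI, tau -> 0 and the canopy term dominates, which motivates the starting guesses below.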

For starting guesses, assuming we have a first-pass LAI estimate from the optical data (a code sketch follows the list):

  • Cxx = Sxx[lai < 0.2].mean(), i.e. Cxx is the average backscatter when the LAI is low.
  • Axx = Sxx[lai > (0.8 * lai.max())].mean(), i.e. Axx is the average backscatter when the LAI is high. In practice, "high" might instead be defined as LAI > 3 or similar.
  • Bxx is harder to provide a sensible starting point for.
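A sketch of these heuristics, assuming s_xx is the time series of backscatter for one polarisation and lai the matching optical estimate (the names are illustrative):

import numpy as np

def starting_guesses(s_xx, lai):
    c0 = s_xx[lai < 0.2].mean()               # soil-dominated observations
    a0 = s_xx[lai > 0.8 * lai.max()].mean()   # canopy-dominated observations
    b0 = 0.5                                  # arbitrary placeholder: no obvious
                                              # data-driven guess for B
    return a0, b0, c0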

Once this works, we would put prior information on all parameters (including V1 and V2) and solve for all of them.

On the code setup

The code has been written not for speed, but so that we can test it (e.g. checking the Jacobians and Hessians via autograd). The different cost functions share a lot of common calculations, so the obvious design is a class that caches the common calculations and only recomputes them when the input to the cost function changes (see here for discussion):

import numpy as np

class S1CostFunction(object):
    def __init__(self, svv, svh, V1, V2, theta):
        # Store svv, svh, V1, V2 and theta in the class.
        # These are assumed to be constant, although in
        # reality we want to also solve for V1 and V2.
        self.svv = svv
        self.svh = svh
        self.V1 = V1
        self.V2 = V2
        self.theta = theta
        self.x = None  # last parameter vector seen

    def _common_calculations(self, x):
        # Only recompute the shared terms when x has changed
        if not np.array_equal(x, self.x):
            # ... compute common_computations here ...
            self.common_computations = common_computations
            self.x = x.copy()
        return self.common_computations

    def cost(self, x):
        common_computations = self._common_calculations(x)
        # ... evaluate the cost from the common terms ...
        return cost

    def cost_jac(self, x):
        common_computations = self._common_calculations(x)
        # ... evaluate the Jacobian from the common terms ...
        return cost_jac

    def cost_hess(self, x):
        common_computations = self._common_calculations(x)
        # ... evaluate the Hessian from the common terms ...
        return cost_hess

Images have tons of pixels

Since for each pixel we need to solve a minimisation problem, and most solvers incur a set-up penalty per problem, it might make sense to batch "lots of pixels" together, given that the Jacobian and Hessian are available. This means that one would simply stack up the parameters for N pixels and combine the cost functions accordingly.
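Because the per-pixel problems are independent, the stacked cost is just the sum, the stacked Jacobian the concatenation, and the stacked Hessian block-diagonal. A sketch, assuming lists of per-pixel cost/Jacobian/Hessian callables:

import numpy as np
from scipy.linalg import block_diag

def stacked_cost(x, costs, n_params):
    blocks = x.reshape(-1, n_params)
    return sum(f(xi) for f, xi in zip(costs, blocks))

def stacked_jac(x, jacs, n_params):
    blocks = x.reshape(-1, n_params)
    return np.concatenate([j(xi) for j, xi in zip(jacs, blocks)])

def stacked_hess(x, hessians, n_params):
    blocks = x.reshape(-1, n_params)
    return block_diag(*[h(xi) for h, xi in zip(hessians, blocks)])

For large N the dense block-diagonal Hessian grows quadratically, so scipy.sparse.block_diag would be the better choice in practice.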

It is also possible to scatter the optimisations to different workers.
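For example, a sketch using a process pool (the per-pixel solve here is a stand-in; in practice it would build and minimise the S1 cost function for one pixel's observations):

from concurrent.futures import ProcessPoolExecutor
import numpy as np
import scipy.optimize

def solve_pixel(x0):
    # Stand-in per-pixel problem, replaced in practice by the WCM cost
    return scipy.optimize.minimize(lambda x: 0.5 * np.sum((x - 1.0) ** 2), x0).x

if __name__ == "__main__":
    first_guesses = [np.zeros(3) for _ in range(100)]  # one guess per pixel
    with ProcessPoolExecutor() as pool:
        solutions = list(pool.map(solve_pixel, first_guesses))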

It is also possible to solve an equivalent linear problem. Two approaches are possible: (i) solve for the "update" starting from a first guess, using a Taylor series, or (ii) extend the WCM via a Maclaurin series (it is likely that only a handful of terms are required).
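A sketch of the first approach: a second-order Taylor expansion of the cost around a first guess x0 reduces each update to a linear solve (this is exactly one Newton step):

import numpy as np

def taylor_update(x0, jac, hess):
    # Solve H dx = -g for the update dx, then move to x0 + dx
    dx = np.linalg.solve(hess(x0), -jac(x0))
    return x0 + dx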

Adding prior information

Prior parameter pdf

We can add extra information to the cost function, such as assuming that the parameter(s) will be located around some mean value with some standard deviation. This results in a very simple additive extra term in the cost function:

def cost_prior(x, mu, inv_cov_mx):
    # Quadratic form (x - mu)^T C^{-1} (x - mu)
    cost = (x - mu).T @ (inv_cov_mx @ (x - mu))
    return cost
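Because we rely on Newton methods, it is worth noting that the derivatives of this term are trivial (assuming a symmetric inv_cov_mx):

def cost_prior_jac(x, mu, inv_cov_mx):
    # Gradient of the quadratic form above
    return 2.0 * inv_cov_mx @ (x - mu)

def cost_prior_hess(x, mu, inv_cov_mx):
    # Constant Hessian
    return 2.0 * inv_cov_mx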

Serial correlation prior

This is mostly useful for parameters such as V1 and V2 that change slowly over time.

def cost_smooth(x, gamma, dtd):
    # dtd is a smoother matrix, D^T D for a difference operator D
    # gamma sets the amount of smoothing
    # x @ dtd @ x sums squared serial differences, e.g. (V1[i+1] - V1[i])**2
    cost = gamma * (x @ (dtd @ x))
    return cost
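A sketch of how dtd might be built, using a first-order difference matrix D so that x @ dtd @ x equals the sum of squared consecutive differences:

import numpy as np

def make_dtd(n):
    # D maps x to its first differences x[1:] - x[:-1]
    d_matrix = np.eye(n - 1, n, k=1) - np.eye(n - 1, n)
    return d_matrix.T @ d_matrix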