> module SchedulingExampleVariant2
> import Data.Fin
> import Data.Vect
> import Equality
> import Framework
> import Id
> %default total
> %auto_implicits off
> %access public export
The following example is the deterministic SDP from [Bertsekas1995, Ch.1]
described in Section 2.2 of [1].
For this example we make use of the possible dependencies
for state and control spaces in the BJI-framework.
The implementation leverages on the fact that controls represent
possible actions and states admissible sequences of actions.
We first define a type of actions
> data Act = A | B | C | D
> Eq Act where
> (==) A A = True
> (==) B B = True
> (==) C C = True
> (==) D D = True
> (==) _x _y = False
To capture the order in which actions can be applied, we first define the
dependencies of each action, i.e. |A| and |C| have no dependencies while
|B| and |D| can only be performed when |A| or |C|, respectively, have
already been done.
> dependencies : Act -> List Act
> dependencies A = []
> dependencies B = [A]
> dependencies C = []
> dependencies D = [C]
To check whether all depdencies of a certain action have been fulfilled, we
use a more generic test which checks whether all occurrences from a list
occur in some vector:
> hasAll : {A : Type} -> Eq A => List A -> {n : Nat} -> Vect n A -> Bool
> hasAll [] _l = True
> hasAll _l [] = False
> hasAll (a :: as) l = elem a l && hasAll as l
Whether some vector represents an admissible state is captured by a type family
that evaluates to the one element type |Unit| if the vector is admissible and
to |Void| if not.
> isAdmissible : {t : Nat} -> Vect t Act -> Bool
> isAdmissible [] = True
> isAdmissible {t=S (S (S (S t)))} _ = False
> isAdmissible (a :: as) = hasAll (dependencies a) as &&
> not (elem a as) &&
> isAdmissible as
We can lift Boolean predicates to type valued predicates
> toType : {A : Type} -> (A -> Bool) -> A -> Type
> toType P a = P a = True
and define
> AdmissibleState : {t : Nat} -> Vect t Act -> Type
> AdmissibleState = toType isAdmissible
Similarly we could use the type |So| from the standard library to lift
Boolean predicates. We choose to use lifted Boolean-valued predicates
instead of proof-relevant type-valued predicates as we do not need
computational content, but just a decision whether a state is admissible
or not.
Now a state is simply a pair of a vector of actions and a proof that
it represents an admissible state.
> State : (t : Nat) -> Type
> State t = (as : Vect t Act ** AdmissibleState as)
Admissible controls for a given state are actions that lead to an
admissible next state:
> AdmissibleControl : {t : Nat} -> State t -> Act -> Type
> AdmissibleControl (as ** prf) a = AdmissibleState (a :: as)
And controls are dpendent pairs of an action and a proof that it is
admissible.
> Control : (t : Nat) -> State t -> Type
> Control t x = (a : Act ** AdmissibleControl x a)
We instantiate the framework components accordingly:
> Framework.X t = State t
> Framework.Y t x = Control t x
The SDP is deterministic, thus
< Framework.M = Id
As values we use natural numbers with their familiar structure
> Framework.Val = Nat
> Framework.zero = 0
> Framework.(<+>) = (+)
> Framework.meas = id
The transition function following Fig. 1b can then simply be defined by:
> Framework.next t (as ** _pf) (act ** pf) = MkDPair (act :: as) pf
The reward (or cost) function uses a helper function that defines the setup
costs for certain combinations of actions. Here we omit restricting to the
possible combinations of actions via type dependencies and just use a
dummy value for combinations that will not occur.
> setupCost : Act -> Act -> Nat
> setupCost A B = 2
> setupCost A C = 3
> setupCost A D = 4
> setupCost B C = 3
> setupCost C A = 4
> setupCost C B = 4
> setupCost C D = 6
> setupCost D A = 3
> setupCost _ _ = 100
> Framework.reward Z _x (A ** _pf) _x' = 5
> Framework.reward Z _x (C ** _pf) _x' = 3
> Framework.reward Z ([] ** Refl) (B ** pf) _x' = absurd pf
> Framework.reward Z ([] ** Refl) (D ** pf) _x' = absurd pf
> Framework.reward (S t) x y _x' = setupCost (head (fst x)) (fst y)
Now we can define some example policies.
Note that ideally Idris' totality checker would know the absurd cases by itself
so that they could be omitted. However this is not yet the case - thus we have to
construct contradictions explicitly using |absurd|.
> p0A : Framework.Policy 0
> p0A ([] ** Refl) = MkDPair A Refl
> p0C : Framework.Policy 0
> p0C ([] ** Refl) = MkDPair C Refl
> p1 : Framework.Policy 1
> p1 ([A] ** Refl) = (B ** Refl)
> p1 ([C] ** Refl) = (A ** Refl)
> p1 ([B] ** pf) = absurd pf
> p1 ([D] ** pf) = absurd pf
> p2 : Framework.Policy 2
> p2 ([B, A] ** Refl) = (C ** Refl)
> p2 ([A, C] ** Refl) = (B ** Refl)
> p2 ([C, A] ** Refl) = (B ** Refl)
> p2 ([D, C] ** Refl) = (A ** Refl)
> p2 ([A, A] ** pf) = absurd pf
> p2 ([A, B] ** pf) = absurd pf
> p2 ([A, D] ** pf) = absurd pf
> p2 ([B, B] ** pf) = absurd pf
> p2 ([B, C] ** pf) = absurd pf
> p2 ([B, D] ** pf) = absurd pf
> p2 ([C, B] ** pf) = absurd pf
> p2 ([C, C] ** pf) = absurd pf
> p2 ([C, D] ** pf) = absurd pf
> p2 ([D, A] ** pf) = absurd pf
> p2 ([D, B] ** pf) = absurd pf
> p2 ([D, D] ** pf) = absurd pf
.. and policy sequences
> psA : Framework.PolicySeq 0 3
> psA = (p0A :: (p1 :: (p2 :: Nil)))
> psC : Framework.PolicySeq 0 3
> psC = (p0C :: (p1 :: (p2 :: Nil)))
The initial state is
> initial : State 0
> initial = ([] ** Refl)
And we can compare the behaviour of |val| and |val'| for
our example policy sequences:
> testA : Type
> testA = val psA initial = val' psA initial -- 10 = 10
> testC : Type
> testC = val psC initial = val' psC initial -- 9 = 9