Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
Nicola Botta
papers
Commits
12723851
Commit
12723851
authored
Feb 23, 2021
by
Nuria Brede
Browse files
Initial.
parent
e0f847b0
Changes
22
Hide whitespace changes
Inline
Side-by-side
2021.On the Correctness of Monadic Backward Induction/Brede,Botta-2021-On the Correctness of Monadic Backward Induction-revision 2021-02.pdf
0 → 100644
View file @
12723851
File added
2021.On the Correctness of Monadic Backward Induction/paper/Appendix.lidr
0 → 100644
View file @
12723851
%if false
> module Appendix
> import Syntax.PreorderReasoning
> import Sigma.Sigma
> %default total
> %auto_implicits off
> %access public export
> infixr 7 :::
> infixr 7 <++>
> %hide map
> %hide pure
> %hide join
> %hide (>>=)
> %hide head
> %hide lteRefl
> %hide Prelude.Algebra.(<+>)
> ExtEq : {A, B : Type} -> (f, g : A -> B) -> Type
> ExtEq {A} f g = (a : A) -> f a = g a
%endif
%if False
\paragraph*{Decision process}. The components of a monadic dynamical system:
> M : Type -> Type
> X : (t : Nat) -> Type
> Y : (t : Nat) -> X t -> Type
> next : (t : Nat) -> (x : X t) -> Y t x -> M (X (S t))
\paragraph*{Decision problem}.
The components of a decision problem:
> Val : Type
> reward : (t : Nat) -> (x : X t) -> Y t x -> X (S t) -> Val
> (<+>) : Val -> Val -> Val
> zero : Val
> meas : M Val -> Val
\paragraph*{Functor and monad specifications}.
|M| is a functor:
> map : {A, B : Type} -> (A -> B) -> M A -> M B
> mapPresComp : {A, B, C : Type} ->
> (f : A -> B) -> (g : B -> C) ->
> map (g . f) `ExtEq` (map g . map f)
|M| is a monad:
> pure : {A : Type} -> A -> M A
> (>>=) : {A, B : Type} -> M A -> (A -> M B) -> M B
> join : {A : Type} -> M (M A) -> M A
|pure| and |join| are natural transformations:
> pureNatTrans : {A, B : Type} -> (f : A -> B) ->
> map f . pure `ExtEq` pure . f
> joinNatTrans : {A, B : Type} -> (f : A -> B) ->
> map f . join `ExtEq` join . map (map f)
|bind|, |join| and |map| need to be inter-related:
> bindJoinSpec : {A, B : Type} -> (f : A -> M B) ->
> (>>= f) `ExtEq` join . map f
\paragraph*{Equality}.
< ExtEq : {A, B : Type} -> (f, g : A -> B) -> Type
< ExtEq {A} f g = (a : A) -> f a = g a
> mapPresExtEq : {A, B : Type} ->
> (f, g : A -> B) -> f `ExtEq` g -> (map f) `ExtEq` (map g)
\paragraph*{Helper functions}.
> (<++>) : {A : Type} -> (f, g : A -> Val) -> A -> Val
> f <++> g = \ a => f a <+> g a
\paragraph*{Policies and policy sequences}.
> Policy : (t : Nat) -> Type
> Policy t = (x : X t) -> Y t x
> data PolicySeq : (t, n : Nat) -> Type where
> Nil : {t : Nat} -> PolicySeq t Z
> (::) : {t, n : Nat} -> Policy t -> PolicySeq (S t) n -> PolicySeq t (S n)
\paragraph*{The value of policy sequences (Bellman equation)}.
> val : {t, n : Nat} -> PolicySeq t n -> X t -> Val
> val {t} Nil x = zero
> val {t} (p :: ps) x = let y = p x in
> let mx' = next t x y in
> (meas . map (reward t x y <++> val ps)) mx'
\section{Preparation of the verification}
\label{appendix:trajectories}
Implementations for subsection \ref{subsection:extension}.
\paragraph*{The measured total reward.}
Definitions for |val'|.
The type of state-control sequences:
> data StateCtrlSeq : (t, n : Nat) -> Type where
> Last : {t : Nat} -> X t -> StateCtrlSeq t (S Z)
> (:::) : {t, n : Nat} -> Sigma (X t) (Y t) ->
> StateCtrlSeq (S t) (S n) -> StateCtrlSeq t (S (S n))
A function that extracts the head state from a state-control sequence:
> head : {t, n : Nat} -> StateCtrlSeq t (S n) -> X t
> head (Last x) = x
> head (MkSigma x y ::: xys) = x
The reward accumulated along a state-control sequence:
> sumR : {t, n : Nat} -> StateCtrlSeq t n -> Val
> sumR {t} (Last x) = zero
> sumR {t} (MkSigma x y ::: xys) = reward t x y (head xys) <+> sumR xys
Computation of an |M|-structure of trajectories from a given policy sequence:
> trj : {t, n : Nat} -> PolicySeq t n -> X t -> M (StateCtrlSeq t (S n))
> trj {t} Nil x = (pure . Last) x
> trj {t} (p :: ps) x = let y = p x in
> let mx' = next t x y in
> map ((MkSigma x y) :::) (mx' >>= trj ps)
The function that first computes the sum of the rewards
along all trajectories resulting from a policy sequence and then applies
the measure to the result (= the \meph{measured total reward}):
> val' : {t, n : Nat} -> PolicySeq t n -> X t -> Val
> val' ps = meas . map sumR . trj ps
%-----------------------------------------------------------------------------%
\section{Further requirements and lemmas}
\label{appendix:lemmas}
%-----------------------------------------------------------------------------%
Implementations of auxiliary results preparing the main theorem
in subsection \ref{subsection:lemmas}.
\paragraph*{Properties of monad algebras}.
Condition for a function |f| to be an |M|-algebra homomorphism:
> algMorSpec : {A, B : Type} ->
> (alpha : M A -> A) -> (beta : M B -> B) -> (f : A -> B) -> Type
> algMorSpec {A} {B} alpha beta f = (beta . map f) `ExtEq` (f . alpha)
Structure maps of |M|-algebras are left inverses of |pure|:
> algPureSpec : {A : Type} -> (alpha : M A -> A) -> Type
> algPureSpec {A} alpha = alpha . pure `ExtEq` id
Structure maps of |M|-algebras are themselves |M|-algebra homomorphisms:
> algJoinSpec : {A : Type} -> (alpha : M A -> A) -> Type
> algJoinSpec {A} alpha = algMorSpec join alpha alpha -- |(alpha . map alpha) `ExtEq` (alpha . join)|
A lemma about computation with |M|-algebras:
> algLemma : {A, B, C : Type} -> (alpha: M C -> C) -> (ee : algJoinSpec alpha) ->
> (f : B -> C ) -> (g : A -> M B) ->
> (alpha . map (alpha . map f . g)) `ExtEq` (alpha . map f . join . map g)
> algLemma {A} {B} {C} alpha ee f g ma =
>
> ( (alpha . map (alpha . map f . g)) ma )
>
> ={ cong (mapPresComp g (alpha . map f) ma) }=
>
> ( (alpha . map (alpha . map f) . map g) ma )
>
> ={ cong (mapPresComp (map f) alpha (map g ma)) }=
>
> ( (alpha . map alpha . map (map f) . map g) ma )
>
> ={ ee (map (map f) (map g ma)) }=
>
> ( (alpha . join . map (map f) . map g) ma )
>
> ={ cong (sym (joinNatTrans f (map g ma))) }=
>
> ( (alpha . map f . join . map g) ma )
>
> QED
\paragraph*{Container monad specification}.
(Partial) specification of \emph{container monads} from (Botta et al. 2017).
> NotEmpty : {A : Type} -> M A -> Type
Condition on |pure|:
> pureNotEmpty : {A : Type} -> (a : A) -> NotEmpty (pure a)
Conditions on |map| and |bind|:
> mapPresNotEmpty : {A, B : Type} -> (f : A -> B) -> (ma : M A) ->
> NotEmpty ma -> NotEmpty (map f ma)
> bindPresNotEmpty : {A, B : Type} -> (f : A -> M B) -> (ma : M A) ->
> NotEmpty ma -> ((a : A) -> NotEmpty (f a)) ->
> NotEmpty (ma >>= f)
\paragraph*{Additional condition on the transition function}.
Condition on |next|:
> nextNotEmpty : {t : Nat} -> (x : X t) -> (y : Y t x) ->
> NotEmpty (next t x y)
\paragraph*{Measure specifications}.
The measure needs to be the structure map of an |M|-algebra on |Val|.
This means it has to fulfill two conditions.
\begin{itemize}
\item It is a left inverse to |pure|:
> measPureSpec : algPureSpec meas -- |meas . pure `ExtEq` id|
\item It is an |M|-algebra homomorphism from |join| to itself:
> measJoinSpec : algJoinSpec meas -- |meas . join `ExtEq` meas . map meas|
\end{itemize}
Moreover, for all |v : Val|, the function |(v <+>)| needs to be an
|M|-algebra homomorphism:
> measPlusSpec : (v : Val) -> (mv : M Val) -> (NotEmpty mv) ->
> (meas . map (v <+>)) mv = ((v <+>) . meas) mv
We can omit the non-emptiness condition but this means we implicitly
restrict the class of monads that can be used for |M|.
The condition could then be expressed as
> measPlusSpec' : (v : Val) -> algMorSpec meas meas (<+> v)
> -- for all |v : Val|, |meas . map (v <+>) `ExtEq` (v <+>) . meas|
%endif
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
% Lemmas will be shown with their proofs below the main proof.
%if false
\paragraph*{Lemmas}.
> measAlgLemma : {A, B : Type} -> (f : B -> Val ) -> (g : A -> M B) ->
> meas . map (meas . map f . g) `ExtEq` meas . map f . join . map g
> oplusLiftEERight : {A : Type} -> (f, g, r : A -> Val) -> ExtEq {A} f g -> ExtEq {A} (r <++> f) (r <++> g)
> trjNotEmptyLemma : {t, n : Nat} -> (ps : PolicySeq t n) -> (x : X t) -> NotEmpty (trj ps x)
> headTrjLemma : {t, n : Nat} -> (ps : PolicySeq t n) ->
> (r : X t -> Val) ->
> (s : StateCtrlSeq t (S n) -> Val) -> (x' : X t) ->
> (map (r . head <++> s) . trj ps) x' =
> (map (const (r x') <++> s) . trj ps) x'
> measSumLemma : {t, n : Nat} -> (ps : PolicySeq t n) ->
> (r : X t -> Val) ->
> (s : StateCtrlSeq t (S n) -> Val) ->
> (meas . map (r . head <++> s) . trj ps) `ExtEq`
> (r <++> meas . map s . trj ps)
%endif
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
% Visible part of the appendix
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
\appendix
\label{section:appendix}
\addcontentsline{toc}{section}{Appendices}
\section*{Appendices}
\section{Preliminaries}
\label{appendix:prelim}
\subsection{General remarks concerning the Idris formalization}
\label{appendix:idrisRemarks}
\begin{itemize}
\item Idris' type checker often struggles with dependencies in implicit
arguments. We sometimes use abbreviations to avoid
cluttering the proofs with implicit arguments.
\item As a standard, we write |(f . g . h) x| instead of |f (g (h
x))|.\\ When this is a problem for the type checker, we use the second
notation.
\item Functions that are not defined explicitly are from the Idris
standard library.\\ Examples are |cong|, |void| and |concat|.
\item Proofs in Idris can be implemented by preorder reasoning.\\
I.e.\ equational reasoning steps of the form
< ( t1 ) ={ step }=
< ( t2 ) QED
as displayed in this appendix are actual type-checkable implementations
of proofs.
\end{itemize}
\subsection{Monad Laws}
\label{appendix:monadLaws}
In the BJI-framework, |M| is required to be a \emph{container monad} but none
of the standard monad laws \citep{bird2014thinking} is required for
the verification result to hold. By contrast, to prove our extended correctness
result, we need |M| to be a full-fledged
monad. More specifically, we require of the monad |M| that
\begin{itemize}
%
\item it is equipped with functor and monad operations:
< map : {A, B : Type} -> (A -> B) -> M A -> M B
<
< pure : {A : Type} -> A -> M A
<
< (>>=) : {A, B : Type} -> M A -> (A -> M B) -> M B
<
< join : {A : Type} -> M (M A) -> M A
\item it preserves extensional equality \citep{botta2020extensional},
identity and composition of arrows:
< mapPresExtEq : {A, B : Type} -> (f, g : A -> B) -> f `ExtEq` g -> map f `ExtEq` map g
<
< mapPresId : {A : Type} -> map id `ExtEq` id
<
< mapPresComp : {A, B, C : Type} -> (f : A -> B) -> (g : B -> C) ->
< map (g . f) `ExtEq` map g . map f
\item Its |pure| and |join| operations are natural transformations \citep[see][I.4]{maclane}:
< pureNatTrans : {A, B : Type} -> (f : A -> B) -> map f . pure `ExtEq` pure . f
<
< joinNatTrans : {A, B : Type} -> (f : A -> B) -> map f . join `ExtEq` join . map (map f)
and fulfill the neutrality and associativity axioms:
< pureNeutralLeft : {A : Type} -> join . pure `ExtEq` id
<
< pureNeutralRight : {A : Type} -> join . map pure `ExtEq` id
<
< joinAssoc : {A : Type} -> join . map join `ExtEq` join . join
\end{itemize}
%
Notice that the above specification of the monad operations is not
minimal: |(>>=)| is not assumed to be implemented in terms of |join|
and |map| (or |map| and |join| via |(>>=)| and |pure|).
Therefore |(>>=)| (pronounced ``|bind|''), |join| and
|map| have to fulfill:
< bindJoinSpec : {A, B : Type} -> (ma : M A) -> (f : A -> M B) -> (ma >>= f) = join (map f ma)
%\pagebreak
The equality in the axioms is extensional equality, not the type
theory's definitional equality:
< (`ExtEq`) : {A, B : Type} -> (f, g : A -> B) -> Type
< (`ExtEq`) {A} f g = (a : A) -> f a = g a
As Idris does not have function extensionality, not postulating
definitional equalities makes the axioms strictly weaker.
The BJI-framework also requires |M| to be equipped with type-level membership,
with a universal quantifier and with a type-valued predicate
< NotEmpty : {A : Type} -> M A -> Type
For our purposes, the monad operations are moreover required to fulfill the
following preservation laws:
\begin{itemize}
\item The |M|-structure obtained from lifting an element into the
monad is not empty:
< pureNotEmpty : {A : Type} -> (a : A) -> NotEmpty (pure a)
\item The monad's |map| and |bind| preserve non-emptiness:
< mapPresNotEmpty : {A, B : Type} -> (f : A -> B) -> (ma : M A) ->
< NotEmpty ma -> NotEmpty (map f ma)
< bindPresNotEmpty : {A, B : Type} -> (f : A -> M B) -> (ma : M A) ->
< NotEmpty ma -> ((a : A) -> NotEmpty (f a)) -> NotEmpty (ma >>= f)
\end{itemize}
As discussed in section~\ref{subsection:abstractCond}, we could have
omitted these non-emptiness preservation laws, but instead would have
implicitly restricted the class of monads for which the correctness
result holds.
\subsection{Preservation of extensional equality}
\label{appendix:presEE}
We have stated above that for our correctness proof the functor |M|
has to satisfy the monad laws and moreover its functorial |map| has to
preserve extensional equality.
This e.g.\ is the case for |M = Identity| (no uncertainty), |M = List|
(non-deterministic uncertainty) and for the finite probability
distributions (stochastic uncertainty, |M = Prob|) discussed in
\citep{2017_Botta_Jansson_Ionescu}. For |M = List| the
proof of |mapPresExtEq| amounts to:
< mapPresExtEq : {A, B : Type} -> (f, g : A -> B) -> f `ExtEq` g -> map f `ExtEq` map g
<
< mapPresExtEq f g p Nil = Refl
< mapPresExtEq f g p ( a :: as) =
< ( map f (a :: as) ) ={ Refl }=
< ( f a :: map f as ) ={ cong {f = \ x => x :: map f as} (p a) }=
< ( g a :: map f as ) ={ cong (mapPresExtEq f g p as) }=
< ( g a :: map g as ) ={ Refl }=
< ( map g (a :: as) ) QED
The principle of extensional equality preservation is discussed in
more detail in \citep{botta2020extensional}.
%-----------------------------------------------------------------------------%
\section{Correctness of the value function}
\label{appendix:theorem}
%-----------------------------------------------------------------------------%
%\begin{small}
> valMeasTotalReward : {t, n : Nat} -> (ps : PolicySeq t n) -> (x : X t) ->
> val' ps x = val ps x
> valMeasTotalReward Nil x =
>
> ( val' Nil x ) ={ Refl }=
>
> ( meas (map sumR (trj Nil x)) ) ={ Refl }=
>
> ( meas (map sumR (pure (Last x))) ) ={ cong (pureNatTrans sumR (Last x)) }=
>
> ( meas (pure (sumR (Last x))) ) ={ Refl }=
>
> ( meas (pure zero) ) ={ measPureSpec zero }=
>
> ( zero ) ={ Refl }=
>
> ( val Nil x ) QED
> valMeasTotalReward {t} {n = S m} (p :: ps) x =
>
> -- type abbreviations
> let SCS : Type = StateCtrlSeq (S t) (S m) in
> let SCS' : Type = StateCtrlSeq t (S (S m)) in
> -- element and function abbreviations
> let y : Y t x = p x in
> let mx' : M (X (S t)) = next t x y in
> let r : (X (S t) -> Val) = reward t x y in
> let trjps : (X (S t) -> M SCS) = trj ps in
> let consxy : (SCS -> SCS') = ((MkSigma x y) :::) in
> let mx'trjps : M SCS = (mx' >>= trjps) in
> let sR : (SCS -> Val) = sumR {t=S t} {n=S m} in
> let hd : (SCS -> X (S t)) = head {t=S t} {n=m} in
> -- proof steps:
> let useMapPresComp = mapPresComp consxy sumR mx'trjps in
> let useBindJoinSpec = bindJoinSpec {B=SCS} trjps mx' in
> let useAlgLemma = algLemma {B=SCS}
> meas measJoinSpec
> ((r . hd) <++> sumR) trjps mx' in
> let useMeasSumLemma = mapPresExtEq
> (meas . map (r . hd <++> sR) . trjps)
> (r <++> meas . map sR . trjps)
> (measSumLemma ps r sR)
> mx' in
> let useIH = mapPresExtEq
> (r <++> val' ps)
> (r <++> val ps)
> (oplusLiftEERight (val' ps)
> (val ps) r (valMeasTotalReward ps))
> mx' in
> let ctx = \a => meas (map ((r . hd) <++> sumR) a) in
>
> ( val' (p :: ps) x ) ={ Refl }=
>
> ( meas (map sumR (trj (p :: ps) x)) ) ={ Refl }=
>
> ( meas (map sumR (map consxy mx'trjps)) ) ={ cong (sym useMapPresComp) }=
>
> ( meas (map (sumR . consxy) mx'trjps) ) ={ Refl }=
>
> ( meas (map ((r . hd) <++> sR) mx'trjps) ) ={ cong {f=ctx} useBindJoinSpec }=
>
> ( meas (map ((r . hd) <++> sR) (join (map trjps mx'))) ) ={ sym useAlgLemma }=
>
> ( meas (map (meas . map (r . hd <++> sR) . trjps) mx') ) ={ cong useMeasSumLemma }=
>
> ( meas (map (r <++> meas . map sR . trjps) mx') ) ={ Refl }=
>
> ( meas (map (r <++> val' ps) mx') ) ={ cong useIH }=
>
> ( meas (map (r <++> val ps) mx') ) ={ Refl }=
>
> ( val (p :: ps) x ) QED
%\end{small}
%-----------------------------------------------------------------------------%
\section{Correctness of monadic backward induction}
\label{appendix:biCorrectness}
%-----------------------------------------------------------------------------%
%if false
> (<=) : Val -> Val -> Type
> lteRefl : {a : Val} -> a <= a
> lteTrans : {a, b, c : Val} -> a <= b -> b <= c -> a <= c
> plusMonSpec : {a, b, c, d : Val} -> a <= b -> c <= d -> (a <+> c) <= (b <+> d)
> measMonSpec : {A : Type} -> (f, g : A -> Val) -> ((a : A) -> (f a) <= (g a)) ->
> (ma : M A) -> meas (map f ma) <= meas (map g ma)
Optimality of policy sequences:
> OptPolicySeq : {t, n : Nat} -> PolicySeq t n -> Type
> OptPolicySeq {t} {n} ps = (ps' : PolicySeq t n) -> (x : X t) -> val ps' x <= val ps x
Optimality of extensions of policy sequences:
> OptExt : {t, n : Nat} -> PolicySeq (S t) n -> Policy t -> Type
> OptExt {t} ps p = (p' : Policy t) -> (x : X t) -> val (p' :: ps) x <= val (p :: ps) x