A new algorithm for computing accumulated rewards in Continuous-Time Markov Decision Processes (CTMDPs) with action-dependent rewards over finite horizons.
A proof that the algorithm guarantees a global error of O(δ) for time step δ.
An experimental comparison of the available algorithms for analyzing accumulated rewards in Continuous-Time Markov Decision Processes with action-dependent rewards over finite horizons.
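To make the setting concrete, the following is a minimal sketch of a discretization-based scheme for the finite-horizon accumulated reward of a CTMDP, not the paper's actual algorithm: the generator matrices Q, the reward rates r, and the two-state model are hypothetical, and the first-order transition approximation I + δ·Q_a illustrates where an O(δ) global error typically enters such schemes.

```python
import numpy as np

# Hypothetical 2-state CTMDP: one generator matrix Q[a] per action and
# action-dependent reward rates r(s, a); all numbers are illustrative only.
Q = {
    0: np.array([[-2.0, 2.0], [1.0, -1.0]]),    # generator for action 0
    1: np.array([[-0.5, 0.5], [3.0, -3.0]]),    # generator for action 1
}
r = np.array([[1.0, 0.5],                       # r(s, a): rows = states,
              [0.2, 2.0]])                      # columns = actions

def expected_reward(T=1.0, delta=1e-3):
    """Backward value iteration on a time grid of step delta.

    Each step uses the first-order approximation P_a = I + delta * Q_a
    of the transition probabilities, a common source of O(delta) global
    error in discretization schemes for CTMDPs.
    """
    n_states = r.shape[0]
    V = np.zeros(n_states)                      # value at the horizon T
    steps = int(round(T / delta))
    for _ in range(steps):
        candidates = []
        for a, Qa in Q.items():
            Pa = np.eye(n_states) + delta * Qa  # approximate transitions
            # accumulate delta * reward rate, then propagate the value
            candidates.append(delta * r[:, a] + Pa @ V)
        V = np.maximum.reduce(candidates)       # maximize over actions
    return V

print(expected_reward())
```

The step δ must be small enough that I + δ·Q_a has nonnegative entries (here δ·3 < 1), so each P_a is a proper stochastic matrix.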