Jekyll2022-05-25T12:24:50+00:00https://avt.im/feed.xmlAlexander TereninAlexander TereninPhysically Structured Neural Networks for Smooth and Contact Dynamics2022-05-25T00:00:00+00:002022-05-25T00:00:00+00:00https://avt.im/talks/2022/05/25/Physically-Structured-Networks<p>A neural network’s architecture encodes key information and inductive biases that are used to guide its predictions. In this talk, we discuss recent work which leverages the perspective of neural ordinary differential equations to design network architectures that encode the structures of classical mechanics. We examine the cases of both smooth dynamics and non-smooth contact dynamics. The architectures obtained are easy to understand, show excellent performance and data-efficiency on simple benchmark tasks, and are a promising emerging tool for use in robot learning and related areas.</p>
<p>Alexander Terenin is a Postdoctoral Research Associate at the University of Cambridge. He is interested in statistical machine learning, particularly in settings where the data is not fixed, but is gathered interactively by the learning machine. This leads naturally to Gaussian processes and data-efficient interactive decision-making systems such as Bayesian optimization, to areas such as multi-armed bandits and reinforcement learning, and to techniques for incorporating inductive biases and prior information such as symmetries into machine learning models.</p>Alexander TereninA neural network’s architecture encodes key information and inductive biases that are used to guide its predictions. In this talk, we discuss recent work which leverages the perspective of neural ordinary differential equations to design network architectures that encode the structures of classical mechanics. We examine the cases of both smooth dynamics and non-smooth contact dynamics. The architectures obtained are easy to understand, show excellent performance and data-efficiency on simple benchmark tasks, and are a promising emerging tool for use in robot learning and related areas.Non-Euclidean Matérn Gaussian Processes2022-05-06T00:00:00+00:002022-05-06T00:00:00+00:00https://avt.im/talks/2022/05/06/Non-Euclidean-Matern-GP<p>In recent years, the machine learning community has become increasingly interested in learning in settings where data lives in non-Euclidean spaces, for instance in applications to physics and engineering, or other settings where it is important that symmetries are enforced.
In this talk, we will develop a class of Gaussian process models defined on Riemannian manifolds and graphs, and show how to effectively perform all computations needed to train these models using standard automatic-differentiation-based methods.
This gives an effective framework to deploy data-efficient interactive decision-making systems such as Bayesian optimization to settings with symmetries and invariances.</p>
<p>Alexander Terenin is a Postdoctoral Research Associate at the University of Cambridge. He is interested in statistical machine learning, particularly in settings where the data is not fixed, but is gathered interactively by the learning machine. This leads naturally to Gaussian processes and data-efficient interactive decision-making systems such as Bayesian optimization, to areas such as multi-armed bandits and reinforcement learning, and to techniques for incorporating inductive biases and prior information such as symmetries into machine learning models.</p>Alexander TereninIn recent years, the machine learning community has become increasingly interested in learning in settings where data lives in non-Euclidean spaces, for instance in applications to physics and engineering, or other settings where it is important that symmetries are enforced. In this talk, we will develop a class of Gaussian process models defined on Riemannian manifolds and graphs, and show how to effectively perform all computations needed to train these models using standard automatic-differentiation-based methods. This gives an effective framework to deploy data-efficient interactive decision-making systems such as Bayesian optimization to settings with symmetries and invariances.Matérn Gaussian Processes on Riemannian Manifolds2022-03-25T00:00:00+00:002022-03-25T00:00:00+00:00https://avt.im/talks/2022/03/25/Riemannian-Matern-GP<p>Gaussian processes are an effective model class for learning unknown functions, particularly in settings where accurately representing predictive uncertainty is of key importance. Motivated by applications in the physical sciences, the widely-used Matérn class of Gaussian processes has recently been generalized to model functions whose domains are Riemannian manifolds, by re-expressing said processes as solutions of stochastic partial differential equations. 
In this work, we propose techniques for computing the kernels of these processes on compact Riemannian manifolds via spectral theory of the Laplace–Beltrami operator in a fully constructive manner, thereby allowing them to be trained via standard scalable techniques such as inducing point methods. We also extend the generalization from the Matérn to the widely-used squared exponential Gaussian process. By allowing Riemannian Matérn Gaussian processes to be trained using well-understood techniques, our work enables their use in mini-batch, online, and non-conjugate settings, and makes them more accessible to machine learning practitioners.</p>Alexander TereninGaussian processes are an effective model class for learning unknown functions, particularly in settings where accurately representing predictive uncertainty is of key importance. Motivated by applications in the physical sciences, the widely-used Matérn class of Gaussian processes has recently been generalized to model functions whose domains are Riemannian manifolds, by re-expressing said processes as solutions of stochastic partial differential equations. In this work, we propose techniques for computing the kernels of these processes on compact Riemannian manifolds via spectral theory of the Laplace–Beltrami operator in a fully constructive manner, thereby allowing them to be trained via standard scalable techniques such as inducing point methods. We also extend the generalization from the Matérn to the widely-used squared exponential Gaussian process. 
By allowing Riemannian Matérn Gaussian processes to be trained using well-understood techniques, our work enables their use in mini-batch, online, and non-conjugate settings, and makes them more accessible to machine learning practitioners.Gaussian Processes and Statistical Decision-making in Non-Euclidean Spaces2022-02-21T00:00:00+00:002022-02-21T00:00:00+00:00https://avt.im/publications/2022/02/21/PhD-Thesis<p>Bayesian learning using Gaussian processes provides a foundational framework for making decisions in a manner that balances what is known with what could be learned by gathering data.
In this dissertation, we develop techniques for broadening the applicability of Gaussian processes.
This is done in two ways.
Firstly, we develop pathwise conditioning techniques for Gaussian processes, which allow one to express posterior random functions as prior random functions plus a dependent update term.
We introduce a wide class of efficient approximations built from this viewpoint, which can be randomly sampled once in advance, and evaluated at arbitrary locations without any subsequent stochasticity.
This key property improves efficiency and makes it simpler to deploy Gaussian process models in decision-making settings.
Secondly, we develop a collection of Gaussian process models over non-Euclidean spaces, including Riemannian manifolds and graphs.
We derive fully constructive expressions for the covariance kernels of scalar-valued Gaussian processes on Riemannian manifolds and graphs.
Building on these ideas, we describe a formalism for defining vector-valued Gaussian processes on Riemannian manifolds.
The introduced techniques allow all of these models to be trained using standard computational methods.
In total, these contributions make Gaussian processes easier to work with and allow them to be used within a wider class of domains in an effective and principled manner.
This, in turn, makes it possible to potentially apply Gaussian processes to novel decision-making settings.</p>Alexander TereninBayesian learning using Gaussian processes provides a foundational framework for making decisions in a manner that balances what is known with what could be learned by gathering data. In this dissertation, we develop techniques for broadening the applicability of Gaussian processes. This is done in two ways. Firstly, we develop pathwise conditioning techniques for Gaussian processes, which allow one to express posterior random functions as prior random functions plus a dependent update term. We introduce a wide class of efficient approximations built from this viewpoint, which can be randomly sampled once in advance, and evaluated at arbitrary locations without any subsequent stochasticity. This key property improves efficiency and makes it simpler to deploy Gaussian process models in decision-making settings. Secondly, we develop a collection of Gaussian process models over non-Euclidean spaces, including Riemannian manifolds and graphs. We derive fully constructive expressions for the covariance kernels of scalar-valued Gaussian processes on Riemannian manifolds and graphs. Building on these ideas, we describe a formalism for defining vector-valued Gaussian processes on Riemannian manifolds. The introduced techniques allow all of these models to be trained using standard computational methods. In total, these contributions make Gaussian processes easier to work with and allow them to be used within a wider class of domains in an effective and principled manner. 
This, in turn, makes it possible to potentially apply Gaussian processes to novel decision-making settings.Gaussian Processes and Statistical Decision-making in Non-Euclidean Spaces2021-11-15T00:00:00+00:002021-11-15T00:00:00+00:00https://avt.im/talks/2021/11/15/PhD-Viva<p>Bayesian learning using Gaussian processes provides a foundational framework for making decisions in a manner that balances what is known with what could be learned by gathering data.
In this dissertation, we develop techniques for broadening the applicability of Gaussian processes.
This is done in two ways.
Firstly, we develop pathwise conditioning techniques for Gaussian processes, which allow one to express posterior random functions as prior random functions plus a dependent update term.
We introduce a wide class of efficient approximations built from this viewpoint, which can be sampled once in advance, and evaluated at arbitrary locations without any subsequent stochasticity.
This improves efficiency and makes it simpler to deploy Gaussian process models in decision-making settings.
Secondly, we develop a collection of Gaussian process models over non-Euclidean spaces, including Riemannian manifolds and graphs.
We derive fully constructive expressions for the covariance kernels of scalar-valued Gaussian processes on Riemannian manifolds and graphs.
Building on these ideas, we describe a formalism for defining vector-valued Gaussian processes on Riemannian manifolds.
The introduced techniques allow all of these models to be trained using standard computational methods.
In total, these contributions make Gaussian processes easier to work with and allow them to be used within a wider class of domains in an effective and principled manner.
This, in turn, makes it possible to potentially apply Gaussian processes to novel decision-making settings.</p>Alexander TereninBayesian learning using Gaussian processes provides a foundational framework for making decisions in a manner that balances what is known with what could be learned by gathering data. In this dissertation, we develop techniques for broadening the applicability of Gaussian processes. This is done in two ways. Firstly, we develop pathwise conditioning techniques for Gaussian processes, which allow one to express posterior random functions as prior random functions plus a dependent update term. We introduce a wide class of efficient approximations built from this viewpoint, which can be sampled once in advance, and evaluated at arbitrary locations without any subsequent stochasticity. This improves efficiency and makes it simpler to deploy Gaussian process models in decision-making settings. Secondly, we develop a collection of Gaussian process models over non-Euclidean spaces, including Riemannian manifolds and graphs. We derive fully constructive expressions for the covariance kernels of scalar-valued Gaussian processes on Riemannian manifolds and graphs. Building on these ideas, we describe a formalism for defining vector-valued Gaussian processes on Riemannian manifolds. The introduced techniques allow all of these models to be trained using standard computational methods. In total, these contributions make Gaussian processes easier to work with and allow them to be used within a wider class of domains in an effective and principled manner. 
This, in turn, makes it possible to potentially apply Gaussian processes to novel decision-making settings.Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels2021-09-28T00:00:00+00:002021-09-28T00:00:00+00:00https://avt.im/publications/2021/09/28/Vector-Field-GP<p>Gaussian processes are machine learning models capable of learning unknown functions with uncertainty. Motivated by a desire to deploy Gaussian processes in novel areas of science, we present a new class of Gaussian processes that model random vector fields on Riemannian manifolds that is (1) mathematically sound, (2) constructive enough for use by machine learning practitioners and (3) trainable using standard methods such as inducing points. In this post, we summarize the paper and illustrate the main results and ideas.</p>
<h1 id="vector-fields-on-manifolds">Vector fields on manifolds</h1>
<p>Before discussing Gaussian processes, we first review vector fields on manifolds.
Let \(X\) be a <em>manifold</em>—a smooth geometric space where the rules of calculus apply.
For each \(x \in X\), let \(T_x X\) be the <em>tangent space</em> at \(x\), which is a vector space intuitively representing all the directions one can move on the manifold from that point.
The <em>tangent bundle</em> \(TX\) is defined by gluing together all tangent spaces—this space is also a manifold.
Let \(\operatorname{proj}_X : TX \to X\) be the projection map, which takes vectors in the tangent space, and maps them back to the underlying points they are attached to.
A <em>vector field</em> is a function \(f: X \to TX\) satisfying the <em>section</em> property \(\operatorname{proj}_X \circ f = \operatorname{id}_X\), meaning that the arrow \(f(x) \in TX\) must be attached to the point \(x\).
We denote the space of vector fields by \(\Gamma(TX)\).</p>
<p>Vector fields reflect the topological properties of the manifolds they are defined on.
For example, by the Poincaré–Hopf Theorem, there does not exist a smooth non-vanishing vector field on the sphere.
This result is also known as the <em>hairy ball</em> theorem, because if we imagine a ball with hair attached to it, the result says we cannot comb the hair, making it tangential to the sphere, without creating a discontinuous cowlick.
Note that, unlike in the Euclidean setting, this implies that a smooth vector field generally <em>cannot</em> be written as a continuous function \(f : X \to \mathbb{R}^d\), and we must work with the machinery of tangent bundles to make sense of vector fields.</p>
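<p>The following is a minimal numerical sketch of these ideas on the unit sphere in \(\mathbb{R}^3\), not code from the paper. It assumes a tangent vector is stored as a pair (base point, vector), and the names <code>proj_X</code> and <code>rotation_field</code> are hypothetical. The rotation field satisfies the section property and is tangential everywhere, yet it vanishes at the poles, in line with the hairy ball theorem.</p>

```python
import numpy as np

def proj_X(tangent_vector):
    """Projection TX -> X: return the base point a tangent vector is attached to."""
    base_point, _ = tangent_vector
    return base_point

def rotation_field(x):
    """Illustrative vector field f(x) = (x, e_z x x): rotation about the z-axis."""
    e_z = np.array([0.0, 0.0, 1.0])
    return x, np.cross(e_z, x)

# Random points on the unit sphere.
rng = np.random.default_rng(0)
points = rng.normal(size=(100, 3))
points /= np.linalg.norm(points, axis=1, keepdims=True)

# Section property: projecting f(x) back to the manifold recovers x.
section_ok = all(np.array_equal(proj_X(rotation_field(x)), x) for x in points)

# Tangency: the attached vector has no component along the normal direction x.
max_normal = max(abs(np.dot(x, rotation_field(x)[1])) for x in points)

# The field vanishes at the north pole, so it is not a non-vanishing field.
north_pole = np.array([0.0, 0.0, 1.0])
_, v_pole = rotation_field(north_pole)
```

<p>Here the field is written using three ambient coordinates purely for convenience. Any continuous tangential field built this way on the sphere must have at least one zero.</p>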
<div class="row align-items-center justify-content-center">
<div class="col-12 col-md-6 col-lg-5 text-center">
<img class="img-fluid" alt="Torus" src="/assets/publications/2021-09-28-Vector-Field-GP/torus.png" />
</div>
<div class="col-12 col-md-6 col-lg-5 text-center">
<img class="img-fluid" alt="Klein bottle" src="/assets/publications/2021-09-28-Vector-Field-GP/klein_bottle.png" />
</div>
<div class="col-12">
<p class="text-center">
Examples of Gaussian random vector fields on the torus (left) and the Klein bottle (right).
</p>
</div>
</div>
<h1 id="gaussian-vector-fields">Gaussian vector fields</h1>
<p>To define Gaussian processes which are random vector fields, the first issue we must address is that a Gaussian process, classically, is a vector-valued random function \(f : X \to \mathbb{R}^d\)<sup id="fnref:rna" role="doc-noteref"><a href="#fn:rna" class="footnote" rel="footnote">1</a></sup> whose values at any finite collection of points are jointly Gaussian.
However, a well-defined vector field is instead a random function \(f : X \to TX\)<sup id="fnref:rna:1" role="doc-noteref"><a href="#fn:rna" class="footnote" rel="footnote">1</a></sup> satisfying the section property, and the range of this function is a manifold rather than a vector space, so it is not immediately clear in what sense such a function could be Gaussian.
Therefore, the first step is to say what we actually mean by the term Gaussian in this setting.</p>
<p><strong>Definition.</strong> A random vector field \(f \in \Gamma(TX)\)<sup id="fnref:rna:2" role="doc-noteref"><a href="#fn:rna" class="footnote" rel="footnote">1</a></sup> is <em>Gaussian</em> if for any points \(x_1, \ldots, x_n \in X\) on the manifold, the vectors \((f(x_1), \ldots, f(x_n)) \in T_{x_1} X \oplus \cdots \oplus T_{x_n} X\) attached to them are jointly Gaussian.</p>
<p>Here, \(\oplus\) is the direct sum of vector spaces.
With this definition in place, our next step is to show that standard properties of Gaussian processes carry over to this setting.
In particular, we would like to characterize Gaussian vector fields in terms of a mean function and a covariance kernel.
The former notion is clear: the mean of a Gaussian vector field should just be an ordinary vector field that will determine the mean vector at all finite-dimensional marginals.
On the other hand, generalizing matrix-valued kernels is less obvious, as it is not clear what the appropriate notion of a matrix should be in the geometric setting.</p>
<h1 id="the-covariance-kernel-of-a-gaussian-vector-field">The covariance kernel of a Gaussian vector field</h1>
<p>To generalize the notion of a matrix-valued kernel to the geometric setting, we introduce the following definition.</p>
<p><strong>Definition.</strong> We say that a scalar-valued function \(k : T^*X \times T^*X \to \mathbb{R}\) is a <em>cross-covariance kernel</em> if it satisfies the following key properties.</p>
<ol>
<li>Symmetry: for all \(\alpha, \beta \in T^*X\), \(k(\alpha, \beta) = k(\beta, \alpha)\) holds.</li>
<li>Fiberwise bilinearity: for any pairs of points \(x, x' \in X\), \(k(\lambda \alpha_x + \mu \beta_x, \gamma_{x'}) = \lambda k(\alpha_x, \gamma_{x'}) + \mu k(\beta_x, \gamma_{x'})\) holds for any \(\alpha_x, \beta_x \in T^*_x X\), \(\gamma_{x'} \in T^*_{x'} X\) and \(\lambda, \mu \in \mathbb{R}\).</li>
<li>Positive definiteness: for any \(\alpha_1, \ldots, \alpha_n \in T^*X\), we have \(\sum_{i=1}^n\sum_{j=1}^n k(\alpha_i, \alpha_j) \geq 0\).</li>
</ol>
<p>Here, \(T^* X\) is the <em>cotangent bundle</em>, which is constructed similarly to the tangent bundle, but by gluing together the dual of the tangent spaces \((T_x X)^*\) instead of the tangent spaces.
Why is this definition precisely the notion we need?
In this work, we prove that cross-covariance kernels in the above sense are exactly analogous to Euclidean matrix-valued kernels.</p>
<p><strong>Theorem.</strong> Every Gaussian random vector field admits and is uniquely determined by a mean vector field and a cross-covariance kernel.</p>
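<p>One way to read the three properties is sketched below. It assumes, as an interpretation rather than a statement made in this post, that the kernel of a Gaussian vector field \(f\) is obtained by pairing the field with covectors, where \(\alpha_x(f(x))\) denotes the covector \(\alpha_x \in T^*_x X\) applied to the tangent vector \(f(x) \in T_x X\):</p>
\[
k(\alpha_x, \beta_{x'}) = \operatorname{Cov}\big(\alpha_x(f(x)), \beta_{x'}(f(x'))\big).
\]
<p>Under this assumption, symmetry and fiberwise bilinearity are inherited from the covariance, and for covectors \(\alpha_i \in T^*_{x_i} X\) the positive definiteness condition is the statement that a variance is non-negative:</p>
\[
\sum_{i=1}^n\sum_{j=1}^n k(\alpha_i, \alpha_j) = \operatorname{Var}\Big(\sum_{i=1}^n \alpha_i(f(x_i))\Big) \geq 0.
\]
<p>No explicit coefficients are needed in the sum, since by fiberwise bilinearity any scalar weights can be absorbed into the covectors themselves.</p>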
<h1 id="projected-kernels">Projected kernels</h1>
<p>The preceding ideas tell us what a Gaussian vector field is, but say little about how to implement one numerically.
To proceed towards this, we rely on an extrinsic geometric approach we call the <em>projected kernel</em> construction.
This is detailed as follows.</p>
<ol>
<li>
<p>Embed the manifold isometrically into a higher-dimensional Euclidean space \(\mathbb{R}^{d'}\).<sup id="fnref:ne" role="doc-noteref"><a href="#fn:ne" class="footnote" rel="footnote">2</a></sup></p>
</li>
<li>
<p>Construct a vector-valued Gaussian process \(\boldsymbol{f} : X \rightarrow \mathbb{R}^{d'}\) in the usual sense with a matrix-valued kernel \(\boldsymbol{\kappa} : X \times X \rightarrow \mathbb{R}^{d' \times d'}\).</p>
</li>
<li>
<p>Project the vectors of the resulting function so that they become tangential to the manifold, giving a vector field.</p>
</li>
</ol>
<p>In this work, we show that (1) this procedure defines a cross-covariance kernel, and (2) all cross-covariance kernels arise this way and therefore no expressivity is lost by employing this construction.
Thus, once we have a matrix-valued kernel \(\boldsymbol{\kappa} : X \times X \rightarrow \mathbb{R}^{d' \times d'}\) taking values in the higher-dimensional Euclidean space, we obtain completely general workable kernels for Gaussian random vector fields.
Constructing such matrix-valued kernels, in turn, can be done for example by using scalar-valued Riemannian Gaussian processes<sup id="fnref:gprm" role="doc-noteref"><a href="#fn:gprm" class="footnote" rel="footnote">3</a></sup> as building blocks.</p>
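<p>The three steps above can be sketched numerically for the unit sphere embedded in \(\mathbb{R}^3\). This is an illustrative sketch under stated assumptions, not the paper's implementation: the scalar kernel is a Euclidean squared exponential kernel restricted to the sphere, used as a simple stand-in for a Riemannian Matérn kernel, the matrix-valued kernel is taken to be \(\boldsymbol{\kappa}(x, x') = k(x, x') \mathbf{I}\), and all function names are hypothetical.</p>

```python
import numpy as np

def scalar_kernel(x, y, lengthscale=0.5):
    """Stand-in scalar kernel (assumption): squared exponential restricted to the sphere."""
    sq_dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def tangent_projector(x):
    """Orthogonal projectors P_x = I - x x^T onto the tangent spaces, shape (n, 3, 3)."""
    return np.eye(3)[None, :, :] - x[:, :, None] * x[:, None, :]

def projected_kernel(x, y):
    """Blocks P_x kappa(x, y) P_y^T with kappa(x, y) = k(x, y) I, shape (n, m, 3, 3)."""
    k = scalar_kernel(x, y)
    P_x, P_y = tangent_projector(x), tangent_projector(y)
    return k[:, :, None, None] * np.einsum("nab,mcb->nmac", P_x, P_y)

# Step 1: points on the sphere, already embedded in R^3.
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=(n, 3))
x /= np.linalg.norm(x, axis=1, keepdims=True)

# Step 2: sample a vector-valued process in R^3 from three independent scalar processes.
K = scalar_kernel(x, x)
chol = np.linalg.cholesky(K + 1e-6 * np.eye(n))  # small jitter for numerical stability
f_embedded = chol @ rng.normal(size=(n, 3))

# Step 3: project each vector onto the tangent space at its base point.
f_tangent = np.einsum("nab,nb->na", tangent_projector(x), f_embedded)
max_normal_component = np.abs(np.sum(x * f_tangent, axis=1)).max()

# The Gram matrix of the projected kernel should be symmetric positive semi-definite.
gram = projected_kernel(x, x).transpose(0, 2, 1, 3).reshape(3 * n, 3 * n)
min_eigenvalue = np.linalg.eigvalsh(gram).min()
```

<p>In this sketch the projected samples have no component along the normal direction up to floating-point error, and the Gram matrix of the projected kernel has no significantly negative eigenvalues, which is what one would expect of a valid covariance for the tangential field.</p>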
<p>By connecting differential-geometric cross-covariance kernels with Euclidean matrix-valued kernels, we can carry over standard Gaussian process techniques, such as variational approximations, into the differential-geometric setting.
Here, we show how to check such approximations to ensure they are geometrically consistent,<sup id="fnref:fm" role="doc-noteref"><a href="#fn:fm" class="footnote" rel="footnote">4</a></sup> and show in particular that the classical inducing point framework<sup id="fnref:vfe" role="doc-noteref"><a href="#fn:vfe" class="footnote" rel="footnote">5</a></sup> satisfies this.
This allows us to use variational approximations directly out of the box, with almost no modification to the code.</p>
<p>We illustrate this general procedure below.
Here, three scalar processes are combined to create a non-tangential vector-valued process in the embedded space, and projected to obtain a tangential vector field on the manifold.</p>
<div class="row justify-content-center">
<div class="col-10 col-md-4">
<img class="img-fluid" alt="Scalar processes" src="/assets/publications/2021-09-28-Vector-Field-GP/s2_xyz.png" />
<p class="text-center">
(a) Scalar processes
</p>
</div>
<div class="col-10 col-md-4">
<img class="img-fluid" alt="Embedded process" src="/assets/publications/2021-09-28-Vector-Field-GP/s2_ev.png" />
<p class="text-center">
(b) Embedded process
</p>
</div>
<div class="col-10 col-md-4">
<img class="img-fluid" alt="Projected process" src="/assets/publications/2021-09-28-Vector-Field-GP/s2_pr.png" />
<p class="text-center">
(c) Projected process
</p>
</div>
<div class="col-12">
<p class="text-center">
Illustration of Gaussian processes constructed using projected kernels.
</p>
</div>
</div>
<h1 id="example-probabilistic-global-wind-interpolation">Example: probabilistic global wind interpolation</h1>
<p>Here, we demonstrate a simplified example of the developed model on the problem of interpolating the global wind field from satellite observations.
We focus on the benefit of using a geometrically consistent model over a naïve implementation using Euclidean Gaussian processes.
Results are shown below.</p>
<div class="row align-items-center justify-content-center">
<div class="col-12 col-md-5 col-lg-4 text-center">
<img class="img-fluid" alt="Euclidean process on sphere" src="/assets/publications/2021-09-28-Vector-Field-GP/r2_sphere.png" />
</div>
<div class="col-12 col-md-6 col-lg-5 text-center">
<img class="img-fluid" alt="Euclidean process: plane" src="/assets/publications/2021-09-28-Vector-Field-GP/r2_flat.png" />
</div>
<div class="col-12 col-md-5 col-lg-4 text-center">
<img class="img-fluid" alt="Manifold process: sphere" src="/assets/publications/2021-09-28-Vector-Field-GP/s2_sphere.png" />
</div>
<div class="col-12 col-md-6 col-lg-5 text-center">
<img class="img-fluid" alt="Manifold process: plane" src="/assets/publications/2021-09-28-Vector-Field-GP/s2_flat.png" />
</div>
<div class="col-12">
<p class="text-center">
Wind interpolation using a Euclidean process (top) and a Riemannian process (bottom).
</p>
</div>
</div>
<p>We see that the uncertainties in the Euclidean vector-valued GP become unnaturally distorted as the satellite approaches the poles, while the Riemannian case has a uniform band along the observations.
In addition, the Euclidean process gives rise to a spurious discontinuity in the uncertainty along the solid red line, which indicates the latitudinal boundary when projected onto the plane.
Such artifacts are avoided with a geometrically consistent model.</p>
<h1 id="summary">Summary</h1>
<p>We have developed techniques that enable Gaussian processes to model vector fields on Riemannian manifolds by providing a well-defined notion of such processes and then introducing an explicit method to construct them.
In addition, we have seen that most standard Gaussian process training methods, such as variational inference, are compatible with the geometry, and hence can be used safely within our framework.
In an initial demonstration of our technique on wind observation data, we have shown that it can be used successfully to interpolate the global wind field with geometrically consistent uncertainty estimates.
We hope that our work inspires the use of Gaussian processes as an easy and flexible means of modelling vector fields on manifolds in a variety of applications.</p>
<h2 id="references">References</h2>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:rna" role="doc-endnote">
<p>More precisely, a Euclidean Gaussian process is a stochastic process \(f : \Omega \times X \to \mathbb{R}^d\), where \(\Omega\) is the probability space. We omit this from notation for conciseness. Similarly, a Gaussian vector field is a map \(f : \Omega \to \Gamma(TX)\). <a href="#fnref:rna" class="reversefootnote" role="doc-backlink">↩</a> <a href="#fnref:rna:1" class="reversefootnote" role="doc-backlink">↩<sup>2</sup></a> <a href="#fnref:rna:2" class="reversefootnote" role="doc-backlink">↩<sup>3</sup></a></p>
</li>
<li id="fn:ne" role="doc-endnote">
<p>Isometrically embedding a Riemannian manifold into a higher-dimensional Euclidean space is always possible by the <em>Nash embedding theorem</em>. <a href="#fnref:ne" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:gprm" role="doc-endnote">
<p>V. Borovitskiy, P. Mostowsky, A. Terenin, and M. P. Deisenroth. Matérn Gaussian processes on Riemannian Manifolds. NeurIPS, 2020. <a href="#fnref:gprm" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:fm" role="doc-endnote">
<p>To be geometrically consistent, a vector field represented numerically needs to be <em>equivariant</em> under a <em>change of frame</em>. A <em>frame</em> in differential geometry is an object that provides a coordinate system on the tangent spaces which we can use to study vector fields using the language of linear algebra. A truly geometric object such as a vector field should not depend on the choice of a frame. <a href="#fnref:fm" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:vfe" role="doc-endnote">
<p>M. Titsias. Variational Learning of Inducing Variables in Sparse Gaussian Processes. AISTATS, 2009. <a href="#fnref:vfe" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>Michael Hutchinson,* Alexander Terenin,* Viacheslav Borovitskiy,* So Takao,* Yee Whye Teh, and Marc Peter DeisenrothGaussian processes are machine learning models capable of learning unknown functions with uncertainty. Motivated by a desire to deploy Gaussian processes in novel areas of science, we present a new class of Gaussian processes that model random vector fields on Riemannian manifolds that is (1) mathematically sound, (2) constructive enough for use by machine learning practitioners and (3) trainable using standard methods such as inducing points. In this post, we summarize the paper and illustrate the main results and ideas.A Brief Tutorial on Multi-armed Bandits2021-09-24T00:00:00+00:002021-09-24T00:00:00+00:00https://avt.im/talks/2021/09/24/Bandits-Tutorial<p>Multi-armed bandits are a class of sequential decision problems which include uncertainty. One of their defining characteristics is the presence of explore-exploit tradeoffs, which require one to balance taking advantage of information that is known with trying different options to learn more, in order to make optimal decisions. In this tutorial, we introduce the problem setting and basic techniques of analysis. We conclude by discussing how explore-exploit tradeoffs appear in more general settings, and how the ideas presented can aid in understanding of areas like reinforcement learning.</p>Alexander TereninMulti-armed bandits are a class of sequential decision problems which include uncertainty. One of their defining characteristics is the presence of explore-exploit tradeoffs, which require one to balance taking advantage of information that is known with trying different options to learn more, in order to make optimal decisions. In this tutorial, we introduce the problem setting and basic techniques of analysis. 
We conclude by discussing how explore-exploit tradeoffs appear in more general settings, and how the ideas presented can aid in understanding of areas like reinforcement learning.Matérn Gaussian Processes on Graphs2021-07-14T00:00:00+00:002021-07-14T00:00:00+00:00https://avt.im/talks/2021/07/14/Graph-Matern-GP<p>Gaussian processes are a versatile framework for learning unknown functions in a manner that permits one to utilize prior information about their properties. Although many different Gaussian process models are readily available when the input space is Euclidean, the choice is much more limited for Gaussian processes whose input space is an undirected graph. In this work, we leverage the stochastic partial differential equation characterization of Matérn Gaussian processes—a widely-used model class in the Euclidean setting—to study their analog for undirected graphs. We show that the resulting Gaussian processes inherit various attractive properties of their Euclidean and Riemannian analogs and provide techniques that allow them to be trained using standard methods, such as inducing points. This enables graph Matérn Gaussian processes to be employed in mini-batch and non-conjugate settings, thereby making them more accessible to practitioners and easier to deploy within larger learning frameworks.</p>Alexander TereninGaussian processes are a versatile framework for learning unknown functions in a manner that permits one to utilize prior information about their properties. Although many different Gaussian process models are readily available when the input space is Euclidean, the choice is much more limited for Gaussian processes whose input space is an undirected graph. In this work, we leverage the stochastic partial differential equation characterization of Matérn Gaussian processes—a widely-used model class in the Euclidean setting—to study their analog for undirected graphs. 
We show that the resulting Gaussian processes inherit various attractive properties of their Euclidean and Riemannian analogs and provide techniques that allow them to be trained using standard methods, such as inducing points. This enables graph Matérn Gaussian processes to be employed in mini-batch and non-conjugate settings, thereby making them more accessible to practitioners and easier to deploy within larger learning frameworks.Physically Structured Neural Networks for Smooth and Contact Dynamics2021-07-14T00:00:00+00:002021-07-14T00:00:00+00:00https://avt.im/talks/2021/07/14/Physically-Structured-Networks<p>A neural network’s architecture encodes key information and inductive biases that are used to guide its predictions. In this talk, we discuss recent work which leverages the perspective of neural ordinary differential equations to design network architectures that encode the structures of classical mechanics. We examine the cases of both smooth dynamics and non-smooth contact dynamics. The architectures obtained are easy to understand, show excellent performance and data-efficiency on simple benchmark tasks, and are a promising emerging tool for use in robot learning and related areas.</p>Alexander TereninA neural network’s architecture encodes key information and inductive biases that are used to guide its predictions. In this talk, we discuss recent work which leverages the perspective of neural ordinary differential equations to design network architectures that encode the structures of classical mechanics. We examine the cases of both smooth dynamics and non-smooth contact dynamics. 
The architectures obtained are easy to understand, show excellent performance and data-efficiency on simple benchmark tasks, and are a promising emerging tool for use in robot learning and related areas.Gaussian Processes on Riemannian Manifolds for Robotics2021-06-21T00:00:00+00:002021-06-21T00:00:00+00:00https://avt.im/talks/2021/06/21/Riemannian-Matern-GP<p>Gaussian processes are an effective model class for learning unknown functions, particularly in settings where accurately representing predictive uncertainty is of key importance. Motivated by applications in the physical sciences, the widely-used Matérn class of Gaussian processes has recently been generalized to model functions whose domains are Riemannian manifolds, by re-expressing said processes as solutions of stochastic partial differential equations. In this work, we propose techniques for computing the kernels of these processes on compact Riemannian manifolds via spectral theory of the Laplace–Beltrami operator in a fully constructive manner, thereby allowing them to be trained via standard scalable techniques such as inducing point methods. We also extend the generalization from the Matérn to the widely-used squared exponential Gaussian process. By allowing Riemannian Matérn Gaussian processes to be trained using well-understood techniques, our work enables their use in mini-batch, online, and non-conjugate settings, and makes them more accessible to machine learning practitioners.</p>Alexander TereninGaussian processes are an effective model class for learning unknown functions, particularly in settings where accurately representing predictive uncertainty is of key importance. Motivated by applications in the physical sciences, the widely-used Matérn class of Gaussian processes has recently been generalized to model functions whose domains are Riemannian manifolds, by re-expressing said processes as solutions of stochastic partial differential equations. 
In this work, we propose techniques for computing the kernels of these processes on compact Riemannian manifolds via spectral theory of the Laplace–Beltrami operator in a fully constructive manner, thereby allowing them to be trained via standard scalable techniques such as inducing point methods. We also extend the generalization from the Matérn to the widely-used squared exponential Gaussian process. By allowing Riemannian Matérn Gaussian processes to be trained using well-understood techniques, our work enables their use in mini-batch, online, and non-conjugate settings, and makes them more accessible to machine learning practitioners.