Jul 2 – 6, 2018
Žofín Palace
Europe/Prague timezone

P2.1007 Automatic Robust Regression Analysis of Fusion Plasma Experiment Data based on Generative Modelling

Jul 3, 2018, 2:00 PM
2h
Mánes

Speaker

Keisuke Fujii

Description

See the full abstract at http://ocs.ciemat.es/EPS2018ABS/pdf/P2.1007.pdf

K. Fujii¹, C. Suzuki², and M. Hasuo¹
¹ Department of Mechanical Engineering and Science, Graduate School of Engineering, Kyoto University, Kyoto 615-8540, Japan
² National Institute for Fusion Science, Gifu 509-5292, Japan

The first step towards automatic data analysis of fusion plasma experiments is the automatic fitting of routinely measured noisy data. A textbook fitting procedure is the minimization of the squared difference between the measured data and some parameterized function, such as a polynomial. This model implicitly assumes that both the noise distribution and the latent function form are already known; however, this is frequently not the case in real-world data analysis. Using the conventional model in such situations easily results in over- or under-fitting, and therefore some human supervision has usually been necessary.

In this work, we propose to optimize the model itself in order to stabilize the analysis. Based on Bayesian statistics, the goodness of a model M for a particular (k-th) data set y^(k) can be measured by the marginal likelihood,

    p(y^(k) | M) = ∫ p(y^(k) | θ, M) p(θ | M) dθ,    (1)

where p(y^(k) | θ, M) is the likelihood of the data y^(k) for a given fitting parameter θ. The form of the likelihood (the noise distribution and the form of the latent function) is implicitly encoded in the likelihood and in the prior distribution p(θ | M).

The robustness of the model M can be measured by the expectation of this marginal likelihood, E_{p(y)}[log p(y | M)], where p(y) is the true distribution of y that will generate data in the future. We show that the maximization of this expectation is identical to the minimization of the Kullback-Leibler divergence between the true data distribution p(y) and the modeled data distribution p(y | M), and therefore unbiased generative modeling is essential.

The strategy we propose here is to construct a flexible generative model, i.e. the latent function form and the noise distribution, with neural networks, and to optimize their weights so that the generative model fits a large amount of data. We applied this strategy to Thomson scattering data from the Large Helical Device and found that our model outperforms conventional analysis methods that do not take the data distribution into account, especially in terms of robustness.
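As an illustration of Eq. (1), the marginal likelihood of a single data set can be approximated by Monte Carlo integration over the prior. The sketch below is not part of the published analysis; the prior sampler, the likelihood, and the quadratic toy model are hypothetical placeholders chosen only to make the integral concrete.

import numpy as np

def log_marginal_likelihood(y, log_likelihood, sample_prior, n_samples=2000, rng=None):
    """Monte Carlo estimate of Eq. (1): log p(y|M) = log ∫ p(y|θ,M) p(θ|M) dθ.

    Hypothetical helpers (not from the paper):
      sample_prior(n, rng)  -> array of n parameter vectors drawn from p(θ|M)
      log_likelihood(y, θ)  -> log p(y|θ, M) for one parameter vector θ
    """
    rng = np.random.default_rng() if rng is None else rng
    thetas = sample_prior(n_samples, rng)
    log_liks = np.array([log_likelihood(y, th) for th in thetas])
    # log-mean-exp of the sampled likelihoods, computed stably
    m = log_liks.max()
    return m + np.log(np.exp(log_liks - m).mean())

# Toy example: quadratic latent function with Gaussian noise of known width.
x = np.linspace(0.0, 1.0, 50)
y = (1.0 - x**2) + 0.1 * np.random.default_rng(0).normal(size=x.size)

def sample_prior(n, rng):
    # θ = polynomial coefficients, p(θ|M) = standard normal prior
    return rng.normal(size=(n, 3))

def log_likelihood(y, theta, sigma=0.1):
    resid = y - np.polyval(theta, x)
    return -0.5 * np.sum((resid / sigma) ** 2 + np.log(2 * np.pi * sigma**2))

print(log_marginal_likelihood(y, log_likelihood, sample_prior))

Comparing this estimate across candidate models M (different priors, latent function forms, or noise distributions) is the model-comparison step the abstract builds on.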
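The equivalence stated in the abstract between maximizing the expected log marginal likelihood and minimizing the Kullback-Leibler divergence follows from the standard decomposition of the KL divergence; in the abstract's notation,

\begin{aligned}
D_{\mathrm{KL}}\bigl(p(y)\,\|\,p(y|M)\bigr)
  &= \int p(y)\,\log\frac{p(y)}{p(y|M)}\,\mathrm{d}y \\
  &= -H\bigl[p(y)\bigr] - \mathbb{E}_{p(y)}\bigl[\log p(y|M)\bigr],
\end{aligned}

where H[p(y)] = -∫ p(y) log p(y) dy is the entropy of the true data distribution. Since H[p(y)] does not depend on the model M, maximizing E_{p(y)}[log p(y|M)] over M is the same as minimizing the KL divergence.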
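To show the mechanics of fitting a flexible generative model to a batch of profiles, here is a deliberately simplified numpy sketch: a one-hidden-layer network models the latent profile and a single learned log-sigma models the noise width, with all weights fitted by maximum likelihood over synthetic data. This shares one latent function and one Gaussian noise scale across profiles, so it is only a toy stand-in for the richer neural-network generative model and the Thomson scattering data used in the paper; every name and number below is an assumption.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
# Synthetic stand-in for routinely measured profiles (not real LHD data).
profiles = np.stack([
    (1.0 - x**2) * (1.0 + 0.2 * rng.normal()) + 0.05 * rng.normal(size=x.size)
    for _ in range(20)
])

n_hidden = 8

def unpack(w):
    w1 = w[:n_hidden]                      # input-to-hidden weights
    b1 = w[n_hidden:2 * n_hidden]          # hidden biases
    w2 = w[2 * n_hidden:3 * n_hidden]      # hidden-to-output weights
    b2, log_sigma = w[3 * n_hidden], w[3 * n_hidden + 1]
    return w1, b1, w2, b2, log_sigma

def latent(w, x):
    # Flexible latent function f(x; w) realised as a tiny tanh network.
    w1, b1, w2, b2, _ = unpack(w)
    return np.tanh(np.outer(x, w1) + b1) @ w2 + b2

def neg_log_likelihood(w):
    # Gaussian generative model: y = f(x; w) + N(0, sigma^2), sigma learned.
    _, _, _, _, log_sigma = unpack(w)
    sigma = np.exp(log_sigma)
    resid = profiles - latent(w, x)
    return np.sum(0.5 * (resid / sigma) ** 2 + np.log(sigma))

w0 = 0.1 * rng.normal(size=3 * n_hidden + 2)
res = minimize(neg_log_likelihood, w0, method="L-BFGS-B")
print("fitted noise width:", np.exp(unpack(res.x)[-1]))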

Primary author

Presentation materials

There are no materials yet.