<?xml version="1.0" encoding="UTF-8"?>
<article xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="1.1" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id>authorea</journal-id>
      <publisher>
        <publisher-name>Authorea</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.22541/essoar.170365226.61399695/v2</article-id>
      <title-group>
        <article-title>Extracting latent variables from forecast ensembles and advancements in
similarity metric utilizing optimal transport</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes">
          <contrib-id contrib-id-type="orcid">0000-0001-9457-7457</contrib-id>
          <name>
            <surname>Nishizawa</surname>
            <given-names>Seiya</given-names>
          </name>
          <address>
            <institution>RIKEN Center for Computational Science</institution>
          </address>
        </contrib>
      </contrib-group>
      <pub-date date-type="preprint" publication-format="electronic">
        <day>16</day>
        <month>2</month>
        <year>2024</year>
      </pub-date>
      <self-uri xlink:href="https://doi.org/10.22541/essoar.170365226.61399695/v2">This preprint is available at https://doi.org/10.22541/essoar.170365226.61399695/v2</self-uri>
      <abstract abstract-type="abstract">
        <p>This study presents a novel methodology for extracting latent
variables from high-dimensional sparse data, with particular emphasis on
spatial distributions such as precipitation. The approach
utilizes multidimensional scaling with a distance matrix derived from a
new similarity metric, the Unbalanced Optimal Transport Score (UOTS).
UOTS effectively captures discrepancies in spatial distributions while
preserving physical units. It resembles the mean absolute error but
also accounts for location errors, providing a more robust measure that
is crucial for understanding differences between observations,
forecasts, and ensembles. Estimating the probability distribution of
these latent variables enhances analytical utility by quantifying
ensemble characteristics. The adaptability of the method to spatiotemporal data
and its ability to handle errors suggest that it is a promising
tool for diverse research applications.</p>
      </abstract>
      <kwd-group kwd-group-type="author-created">
        <kwd>Dimension Reduction</kwd>
        <kwd>ensemble forecast</kwd>
        <kwd>informatics</kwd>
        <kwd>location errors</kwd>
        <kwd>meteorology</kwd>
        <kwd>optimal transport</kwd>
        <kwd>similarity metric</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
