Upcoming Seminars

December - Winter Break.

January 18, 2022
9 AM Pacific / 12 PM Eastern / 5 PM Dublin

Anirudh Koul
Pinterest, NASA Frontier Development Lab

SpaceML Worldview Search: The NoCode Earth & Natural Disaster Dataset Curator from Unlabeled Petabyte Scale Imagery

AI modeling for Earth events at NASA is often limited by the availability of labeled examples. For example, training classifiers to detect forest fires from satellite imagery requires curating a massive and diverse dataset of example forest fires, a tedious multi-month effort requiring careful review of over 196 million square miles of data per day for 20 years. While such images might exist in abundance within 40 petabytes of unlabeled satellite data, finding these positive examples to include in a training dataset for a machine learning model is extremely time-consuming and requires researchers to "hunt" for positive examples, like finding a needle in a haystack. In this presentation, we showcase a no-code open-source tool built by an international team of citizen scientists whose goal is to minimize the amount of human manual image labeling needed to achieve a state-of-the-art classifier. The pipeline, purpose-built to take advantage of the massive amount of unlabeled images, consists of (1) self-supervision training to convert unlabeled images into meaningful representations, (2) search-by-example to collect a seed set of similar images, (3) human-in-the-loop active learning to iteratively ask for labels on uncertain examples and train on them. In initial experiments, the system has yielded orders of magnitude reduction in time and cost of data labeling efforts and has shown the potential to multiply the efficiency of the researcher's data curation efforts.
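Steps (2) and (3) of the pipeline above can be illustrated with a minimal sketch. It assumes that step (1) has already produced an embedding vector per image and that a classifier has already produced a probability per image. The function names, the cosine-similarity ranking, and the closest-to-0.5 uncertainty rule are illustrative assumptions, not the tool's actual implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search_by_example(query, embeddings, k):
    """Step (2): indices of the k embeddings most similar to the query image."""
    ranked = sorted(
        range(len(embeddings)),
        key=lambda i: cosine_similarity(query, embeddings[i]),
        reverse=True,
    )
    return ranked[:k]


def most_uncertain(probabilities, k):
    """Step (3): indices of the k images whose predicted probability is
    closest to 0.5, i.e. where the classifier is least sure."""
    ranked = sorted(range(len(probabilities)), key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:k]


# Toy data (hypothetical): four 2-D embeddings and four classifier outputs.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
seed_set = search_by_example([1.0, 0.0], embeddings, k=2)
to_label = most_uncertain([0.95, 0.52, 0.10, 0.45], k=2)
```

In a loop of this kind, the images in `to_label` would be shown to a human, their labels added to the training set, and the classifier retrained before the next round.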

Previous Seminars

SEMINAR SERIES 2021 - 2022


September 28, 2021

Special Seminar - Cross-Listed with the OpenPlanetary Seminar Series

Annie Didier
NASA Jet Propulsion Laboratory

Incepting Interplanetary “Google Search” through Machine Learning

Spacecraft can produce a far greater volume of data than can be downlinked. Though interplanetary communication rates have grown by orders of magnitude since early missions, they are far surpassed by the growth in data volume produced by on-board instruments. To bypass the communication bottleneck between spacecraft and ground while optimizing scientific yield, we propose the concept of ‘Interplanetary Google Search,’ a novel approach to spacecraft data retrieval inspired by the Google search engine. We envision a selective downlink capability with on-board indexing and search where scientists can query a spacecraft’s on-board database for specific, relevant information. To realize this on-board data storage and indexing vision, we must first introduce the means to extract features relevant to scientific interest from historic data payloads. The key to our approach is utilizing machine learning to extract features and summarize data. We have demonstrated this capability using image segmentation and image captioning models on MSL RGB imagery. Such methods would, for instance, enable scientists to download a full textual summary of the imagery taken by a rover and use this information to downlink data with specific features for further analysis. This concept has even more potential for a wider range of deep-space missions with more data-intensive instruments (e.g. ground-penetrating radar and hyperspectral imagers), and can be realized with the application of data science.

October 26, 2021

Mayur Bakrania
University College London

Applying unsupervised learning and outlier detection methods to characterise magnetotail electrons

Collisionless space plasma environments are characterised by distinct particle populations that typically do not mix. Although moments of their velocity distributions help in distinguishing different plasma regimes, the distribution functions themselves provide more comprehensive information about the plasma state. Unlike moments, however, distributions are not easily characterised by a small number of parameters, making their classification more difficult. To perform this classification, we distinguish between the different plasma regions by applying dimensionality reduction and clustering methods to electron distributions in pitch angle and energy space. The automated classification of different regions in space plasma environments provides a useful tool to identify the physical processes governing particle populations in near-Earth space. Using outlier detection methods, we can identify anomalous distributions in the magnetotail that are consistent with simulations of the tearing instability.

Jay Laura
U.S. Geological Survey

Planetary Spatial Data Infrastructure and Analysis Ready Data

Spatial data efficacy is a primary concern for any scientist making use of spatial data products. The field of planetary spatial data infrastructure (PSDI) includes work to classify and identify foundational data products with well-communicated spatial accuracies to ensure appropriate data use. Under the auspices of PSDI, the nascent push for analysis ready data (ARD) provides a rich opportunity space where data sets are made available in machine-learning-ready formats and the ML community is actively engaged to improve the discoverability of said data. I will present a brief overview of PSDI and planetary ARD as they relate to potential ML activities.

November 23, 2021

Matthew Cheng
University College London

Automated bow shock and magnetopause boundary detection with Cassini

The Cassini mission spent 13 years in orbit around Saturn collecting a wealth of data about its magnetic field and plasma populations. A catalogue of bow shock and magnetopause boundary crossings is needed to study the structure of Saturn’s magnetosphere and fundamental plasma processes like instabilities. However, manually identifying thousands of crossings is very challenging: it is both time-consuming and prone to human error. This calls for ways to standardize the detection of these boundaries through automation. A range of techniques is explored, from traditional time-series analysis of the magnetic field and plasma moments data to modern machine learning techniques such as evidential deep learning, which quantifies classification uncertainty, applied to electron energy spectrograms.

Alexander Barrett
The Open University

Characterizing the ExoMars landing site at Oxia Planum using deep learning

In this study a Deep Learning Convolutional Neural Network was trained to characterize the landscape of Oxia Planum, Mars. This work was conducted as part of the preparation for the European Space Agency’s upcoming ExoMars Rosalind Franklin Rover mission. The aim was to develop an ontology of terrain classes, which could be applied across the landing site and beyond, to form a component of traversability analysis when combined with engineering information about the rover and with “ground truth” information about how the classes look once Rosalind Franklin lands. Studies combining high resolution images with large spatial extents, such as the full 3-sigma potential landing ellipse, present significant challenges. Mapping or surveying large areas at full resolution becomes massively time-consuming, or requires very large teams to do effectively. One solution is using machine learning systems to classify images by semantic segmentation. This approach can provide a useful tool to augment the workflow of human geomorphologists, “triaging” extremely large, high resolution datasets to identify concentrations of certain textures or landforms. I will discuss the process of developing the Oxia Planum training dataset, and our assessment of the results.

SEMINAR SERIES 2020 - 2021


September 22, 2020

Abigail Azari & Caitriona Jackman
ML4PSP Organizers

Introductions to the ML4PSP Series &
Integrating ML for Planetary Science In the Next Decade [paper]

The ML4PSP organizers will discuss the series and provide introductions before summarizing a recent white paper submitted to the NRC Planetary Science and Astrobiology Decadal Survey on integrating machine learning into planetary science.

Matthew K. James
University of Leicester

3D modelling of Mercury's magnetosphere
using the new MESSENGER FIPS proton moments [paper]

A new MESSENGER FIPS dataset is introduced, where the 𝜅-distribution function is fitted numerically to proton spectra, providing more accurate estimates of density and temperature than previous Maxwellian fits. The quality of the fitted distribution functions is then assessed using modular artificial neural networks in order to remove badly fitted spectra. The new moments are then used to train a deep artificial neural network in order to create a scalable 3D proton model for Mercury's magnetosphere.

October 27, 2020

Kiri Wagstaff
NASA Jet Propulsion Laboratory /
Oregon State University

Machine Learning for Spacecraft at Europa: Enabling In-Situ Discoveries to Maximize Science Return [2019 paper], [2020 paper]

Upcoming missions to remote destinations like Jupiter's moon Europa will operate at extreme distances from the Earth where direct human oversight is impossible. The combination of extreme distance, limited lifetime due to high radiation, and limited data downlink creates an urgent need for reliable autonomous operations. Machine learning can help by analyzing data for features of interest as it is collected. Data with positive detections can be marked for high-priority downlink to Earth for mission planning. For Europa, such features include thermal anomalies, active icy plumes, and unusual surface mineral deposits. This talk describes data analysis and machine learning methods that can operate onboard to increase the rate of exploration and discovery.

Matthew Argall
University of New Hampshire

The MMS SITL Ground Loop: Automating the Burst Selection Process [paper], [book chapter]

Global-scale energy flow throughout Earth's magnetosphere is catalyzed by processes that occur at Earth's magnetopause (MP). Magnetic reconnection is one process responsible for solar wind entry into and global convection within the magnetosphere, and the MP location, orientation, and motion have an impact on the dynamics. Statistical studies that focus on these and other MP phenomena and characteristics inherently require MP identification in their event search criteria, a task that can be automated using machine learning. We introduce a Long Short-Term Memory (LSTM) Recurrent Neural Network model into the operational data stream of the Magnetospheric Multiscale (MMS) mission to reduce mission operation costs, detect MP crossings, and assist studies of energy transfer into the magnetosphere.

November 24, 2020

Kiley Yeakel
The Johns Hopkins University
Applied Physics Laboratory

Machine Learning Algorithms for Automated Detection of Boundary Crossings: A Case Study from Cassini

As increasingly data-intensive sensors are developed for downlink-constrained deep-space missions, scientists face a future in which only a small portion of the science data collected by the spacecraft can be sent back to Earth. There’s a rapidly increasing need to develop “smart” autonomous algorithms capable of rudimentary science analysis on-board the spacecraft so that the downlink bandwidth can be optimized for the most relevant observations. Here, we present one such case study from the Cassini mission where we have utilized machine learning (ML) algorithms to classify whether the spacecraft was in the magnetosphere, magnetosheath or solar wind, utilizing a set of labeled magnetopause and bow shock crossings spanning from 2004 – 2016. We analyze the overall accuracy of various ML algorithms – Recurrent Neural Networks (RNNs) and Gaussian Mixture Models (GMMs) – utilizing combinations of features from the magnetometer (MAG), Charge-Energy-Mass Spectrometer (CHEMS) and Low-Energy Magnetospheric Measurement System (LEMMS).

Mario Morvan
University College London

Using Deep Learning for Precision Photometry in Exoplanetary Science [2020a paper], [2020b preprint]

Disentangling the planetary signal from the stellar and instrumental noise is a major data and modelling challenge with inevitable repercussions for transit detection and characterisation. Here we consider approaches to leverage the power of deep learning and help tackle this challenge. After discussing an LSTM-based method to model the noise in Spitzer light curve observations, we present preliminary studies aiming at developing an end-to-end differentiable pipeline combining the flexibility and scalability of neural networks with the precision and domain knowledge borne by physical models.

January 26, 2021

Special Theme: Model Metrics and Assessment

Michael Liemohn
University of Michigan

One is Not Enough
Thoughts on Choosing Data-Model Comparison Metrics

The magnetospheric physics research community uses a broad array of quantitative data-model comparison methods – metrics – when conducting their research investigations. It is often the case, though, that any particular study will only use one or two metrics. Because metrics are designed to test a specific aspect of the data-model relationship, limiting the comparison to only one or two metrics reduces the physical insights that can be gleaned from the analysis, restricting the possible findings from such studies. Additional physical insights can be obtained when many types of metrics are applied. A few best practices for choosing metrics for space physics studies are presented and discussed.

Sophie Murray
Dublin Institute for Advanced Studies

Finding the Right Metric
Solar Flare Forecast Evaluation [paper], [paper], [flare scoreboard]

One essential component of operational space weather forecasting is the prediction of solar flares. Early flare forecasting work focused on statistical methods based on historical flaring rates, and more complex machine learning based methods have been implemented in recent years. A multitude of flare forecasting methods are now available for operational use, and proper evaluation of these products is crucially important for model developers, forecasters, end-users, and stakeholders because it facilitates an understanding of the strengths and weaknesses of the forecasting process. This talk will outline current collaborative efforts in solar flare forecasting that are driving international standards based on terrestrial weather forecasting practices, such as defining evaluation metrics, climatological benchmarking, and ensemble requirements.

February 23, 2021

Tadhg Garton
University of Southampton /
Alan Turing Institute

Machine Learning identification of signatures in 1D magnetospheric timeseries

The products of magnetic reconnection in Saturn's magnetotail are identified in magnetometer observations primarily through characteristic deviations in the north-south component of the magnetic field. Identification of these features has long been performed by human observers; however, with the advent of sophisticated computational methods, it is time to automate our search for these reconnection signatures. Here, we present a fully automated, supervised learning, feed forward neural network model to identify evidence of reconnection in the Kronian magnetosphere with the three magnetic field components observed by the Cassini spacecraft in Kronocentric radial-theta-phi (KRTP) coordinates as input. Furthermore, we present methods to validate results of machine learning algorithms when they are applied to extended datasets that originate in background environments different from those they were trained, tested and validated against.

Lior Rubanenko
Stanford University

Automatic detection of barchan dunes on Mars employing an instance segmentation neural network

The surface of Mars is riddled with dunes created by accumulating sand particles that are carried by the wind. When the sand supply is limited and the wind is approximately unidirectional, dunes take the form of crescents termed barchan dunes, whose slip faces are oriented in the dominant wind direction. Consequently, analyzing the morphometrics of barchan dunes can help characterize the winds that form them. Previously, local circulation patterns were derived by analyzing individual images of barchan dunes near the North Pole of Mars [1]. However, repeating this analysis on a global scale remains a challenge, as manually mapping dunes is impractical and traditional computer vision algorithms are largely ineffective at identifying the outlines of dunes from images. Here we employ Mask R-CNN [2], an instance segmentation convolutional neural network, to map dunes across the surface of Mars. Training on ~1000 images, our model achieves a mean average precision (mAP) of 80% at IoU = 0.5. In the talk, I will describe the Mask R-CNN neural network and its vast space of hyperparameters, and how those can be employed for object detection and analysis by incorporating traditional computer vision techniques.
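For readers unfamiliar with the IoU = 0.5 criterion quoted above, the following minimal sketch shows how intersection-over-union decides whether a predicted mask counts as a correct detection. Representing masks as sets of (row, column) pixels and the helper names are assumptions made for illustration; this is not the evaluation code used in the study.

```python
def mask_iou(pred_pixels, true_pixels):
    """Intersection over union of two masks given as collections of (row, col) pixels."""
    pred, true = set(pred_pixels), set(true_pixels)
    union = pred | true
    if not union:
        return 0.0  # two empty masks: define IoU as zero
    return len(pred & true) / len(union)


def is_match(pred_pixels, true_pixels, threshold=0.5):
    """A prediction counts as a true positive when its IoU reaches the threshold."""
    return mask_iou(pred_pixels, true_pixels) >= threshold


# Toy masks (hypothetical): one ground-truth dune and two predictions.
truth = {(0, 1), (1, 1), (0, 2), (1, 2)}
loose_prediction = {(0, 0), (0, 1), (1, 0), (1, 1)}  # overlaps 2 of 6 union pixels
tight_prediction = {(0, 1), (1, 1), (0, 2)}          # overlaps 3 of 4 union pixels
```

Average precision is then computed from the ranked list of matches and misses, and mAP averages that quantity over classes or images, depending on the convention used.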

March 23, 2021

Lasse Clausen
University of Oslo

Auroral images: Automatic classification and geomagnetic predictions [paper]

We use a pre-trained deep neural network to automatically extract features from auroral images. Using a manually labelled training dataset, we are then able to automatically classify images into one of the following classes: "clear", "cloudy", "moon", "arc", "diffuse", and "discrete". As a next step, we show some initial results from our attempts to use the extracted features to predict magnetometer observations.

Hannah Kerner
University of Maryland

Novelty-Guided Target Selection for Mars Rovers

Mars rover operations currently consist of pre-scripted commands determined by the rover science and engineering operations teams on a day-to-day (or sol-to-sol) basis. Automated instrument targeting systems, which determine which surface features to target based on automated rather than pre-scripted decision-making, could help increase the science return from current and future exploration missions. To enable automatic follow-up observation of novel targets (i.e., targets that differ substantially from those observed previously in the mission), we propose to use novelty detection algorithms for ranking candidate targets detected in rover images. In this talk, I will present our proposed onboard novelty detection framework and illustrate its utility using diverse scenarios of novel geology found in Mars rover images.
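One simple way to picture novelty-based ranking is a nearest-neighbour score: a candidate is more novel the farther its feature vector lies from everything seen earlier in the mission. The sketch below uses that rule purely as an illustrative assumption; the feature vectors, function names, and the choice of Euclidean distance are hypothetical and are not taken from the framework presented in the talk.

```python
import math


def novelty_score(candidate, history):
    """Distance from a candidate feature vector to its nearest previously seen vector."""
    return min(math.dist(candidate, seen) for seen in history)


def rank_by_novelty(candidates, history):
    """Indices of the candidates ordered from most to least novel."""
    return sorted(
        range(len(candidates)),
        key=lambda i: novelty_score(candidates[i], history),
        reverse=True,
    )


# Toy feature vectors (hypothetical): two past observations, three new candidates.
history = [[0.0, 0.0], [1.0, 0.0]]
candidates = [[0.1, 0.0], [5.0, 5.0], [1.0, 1.0]]
ranking = rank_by_novelty(candidates, history)
```

Here the second candidate lies far from anything observed before, so it would be proposed first for follow-up.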

April 27, 2021

Jacob Bortnik
University of California, Los Angeles

Machine learning reconstruction of the inner magnetosphere

While the volumes of space physics data continue to rise exponentially, our analysis techniques have not kept pace with this rapid growth, and often do not exploit the full potential of the data. In this talk, we discuss how machine learning might provide the solution to this problem, and in particular we will show how a (sparse) time-series of point observations of some quantity can be converted into a 3-dimensional time-varying model of that quantity with the use of neural networks. As an example, we show a three-dimensional dynamic electron density (DEN3D) model in the inner magnetosphere, that can provide full coverage of the inner magnetosphere and in fact is sufficiently accurate that it points the way to new physical discoveries. The talk will be concluded with a few emerging ideas of how machine learning can be applied in the physical sciences.

Téo Bloch
University of Reading

Deep-Ensemble Modelling of Electron Flux at the Radiation Belt’s Outer Boundary With Bayesian Neural Networks [website]

As space-based infrastructure becomes more ubiquitous, modelling the radiation belts is increasingly important. Most radiation belt models require an accurate outer boundary condition, as this helps to drive the simulation. Our work aims to characterise the flux-energy distribution at the outer boundary location using an ensemble of Bayesian neural networks. Each model in the ensemble predicts 11 values of flux, and the associated variance, for each set of inputs. The model performs well, predicting fluxes within a factor of 2.5 for the lower energies and within a factor of 4 for the higher energies (with a correlation between 0.5 and 0.8).
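As a rough illustration of how per-model means and variances can be merged into one ensemble prediction, and of what "within a factor of 2.5" means, consider the sketch below. The equal-weight mixture rule is a common choice for deep ensembles and is assumed here; the abstract does not state how the ensemble outputs were actually combined, and the numbers are invented.

```python
def combine_ensemble(means, variances):
    """Mean and variance of an equal-weight mixture of the members' predictions.

    The mixture variance includes both the members' own variances and the
    spread between the members' means.
    """
    n = len(means)
    mean = sum(means) / n
    second_moment = sum(v + m * m for m, v in zip(means, variances)) / n
    return mean, second_moment - mean * mean


def within_factor(predicted, observed, factor):
    """True when the prediction is neither `factor` times too high nor too low."""
    ratio = predicted / observed
    return 1.0 / factor <= ratio <= factor


# Toy values (hypothetical): two ensemble members predicting one flux channel.
ensemble_mean, ensemble_variance = combine_ensemble([1.0, 3.0], [0.5, 0.5])
```

With these toy values the ensemble variance (1.5) exceeds either member's own variance (0.5) because the members disagree about the mean.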

May 25, 2021

Ryan McGranaghan
Atmospheric and Space Technology Research Associates (ASTRA)

Advancing space physics research through machine learning and information representation: A powerful and demonstrable use case through solar wind-magnetosphere-ionosphere coupling

The connection between the Sun and the Earth is a complex one, involving interactions and variabilities across a dizzying spectrum of scales and systems. The result is a relationship between us and our star that is observable only through a fleet of instruments, methods, and technologies yet creates weather in the near-Earth space environment colloquially known as space weather. Space weather is the impact of solar energy on society and is a powerful use case for demonstrating the ability of data science to converge disciplines. A key to understanding it is the way that regions of space between the Sun and the Earth’s surface are connected, particularly via particles transferred from the solar wind to the magnetosphere to the upper atmosphere, a problem that remains one of the great challenges in space physics. We will first present a new ML model that better captures the dynamics of the particle precipitation from a large volume of data. We will then share a new framework to evaluate and understand these models. We will generalize this progress as suggestive of trends that reverberate across all scientific disciplines and that even tie science to engineering, art, and design. We will raise generative questions about how we currently do AI/ML research in space physics and about the role of information representation in progressing space physics discovery (e.g., knowledge networks), and we aim to provide insight to, and spark discussion within, this cross-disciplinary community around the concepts of convergence and antidisciplinarity.

Giacomo Nodjoumi
Jacobs University

Detecting Cave Entrance Candidates on Mars using Deep Learning Computer Vision

In the framework of geological exploration of terrestrial planets, sinkhole-like landforms (pit craters, pit chains and skylights), as a potential direct access to the subsurface, are one of the most promising environments on which to focus our research. Detecting, mapping, and describing those types of landforms is a challenging process since a set of tedious tasks must be conducted manually by researchers, usually on a small set of available data. These tasks vary from data collection (in which areas with high probability of occurrences are selected and downloaded) to manual analysis that requires viewing the images in detail, mapping all occurrences with GIS, and extracting morphometric parameters. For the Moon and Mars, databases of cave candidates exist (see MGC^3 Cushing, 2012) but all of these databases are focused on small regions, rather than at planetary scale. Thus there are possible missing correlations between the presence of these landforms and the area of detection that can be related to past and present processes. To achieve data analyses at planetary scale, machine learning and deep learning algorithms are extremely valuable techniques, capable of automatically analysing large datasets. The main problem is that it is often necessary to develop specific tools and pipelines for this task. Two Jupyter notebooks have been developed around FOSS Object Detection packages and used with a small dataset of 130 high resolution images acquired by the HiRISE camera from the Mars Reconnaissance Orbiter. The aim of these is to create a user-friendly environment for training and evaluating an object detection model, not only for cave candidates but also for other types of landforms. These results have been compared to available databases with promising preliminary results.

June 29, 2021

Victor Pinto
University of New Hampshire

Reproducibility in Space Sciences
Do we really need to publish our codes?

Reproducibility is without a doubt one of the fundamental pillars of science, and just as science has evolved through time, so has the concept of what is necessary to ensure results are reproducible. Historically, the publication of source code used for analysis, modeling, and even for figures has never been a requirement in Space Sciences and it has rarely been encouraged as a necessary practice. A possible explanation may be the assumption that reproducibility is “baked into” the physics of the systems or the data. However, as Machine Learning becomes an integral part of Space Sciences research, the requirement that the codes developed for research be publicly available, or at least obtainable on demand, grows in popularity. Today, many publishers are encouraging code sharing, or even requiring it now or in the near future. This talk will stray away from the traditional discussions on techniques and applications of machine learning for space and planetary sciences and focus on whether code sharing is the best practice to ensure reproducibility, with the idea of starting a conversation on whether we should push for or against code sharing as a community, and how we should prepare for it if it becomes the norm moving forward.

Stéphane Aicardi
Observatoire de Paris

Deep Learning on Jovian Decametric Emissions

Jupiter's decametric radio emissions have long been a valuable tool for understanding the magnetic properties of Jupiter and its interactions with its major satellites. Decametric emissions are variable on many timescales, but high-resolution observations produce huge data volumes that can't be stored for long. Using convolutional neural networks, we try to detect, classify and locate Jovian emissions in the observations by the Nançay decameter array. This will lead to an automated method to select the high resolution data to be archived. See predictions from this network online.

July 27, 2021

John Biersteker
Massachusetts Institute of Technology

Probing Europa's interior with Bayesian inference and magnetic induction

Exploring subsurface oceans on icy moons is a key goal in the search for habitable environments. Because these moons are embedded in the time-varying magnetosphere of their host planet, their saltwater oceans generate an induced magnetic field detectable by spacecraft. Such measurements provide a tool to detect and characterize these ocean worlds. As part of the design of the upcoming Europa Clipper mission, we have developed a technique for ocean characterization using spacecraft magnetometry and Bayesian inference. This approach allows for the recovery of the ocean parameters with robust uncertainties and enables incorporating multiple datasets into a self-consistent view of Europa's interior. I will discuss the application of this technique to the upcoming Europa Clipper mission, archival data from Galileo, and possible future missions to the ice giants.

Hazel Bain
CIRES University of Colorado Boulder
NOAA Space Weather Prediction Center

How Well Can We Forecast Solar Radiation Storms?

Solar energetic particles (SEPs) are a driver of space weather, the effects of which can impact high-frequency communications systems and satellite systems and pose a radiation hazard for astronauts, as well as flight crew and passengers on polar flight routes. The National Oceanic and Atmospheric Administration's Space Weather Prediction Center (NOAA/SWPC) issues space weather forecasts and products for energetic protons at Earth. I will discuss a recent verification study to assess our ability to forecast these storms. It is hoped that this study will serve as a benchmark for the development and validation of physics-based and machine learning SEP models. Co-authors: R. Steenburgh (NOAA SWPC), T. Onsager (NOAA SWPC), E. M. Stitely (Millersville University)