Samuel Skillman
Descartes Labs; Fellow 2009-2013
As a student and postdoctoral researcher, Samuel Skillman studied distant galaxy clusters. Now the Department of Energy Computational Science Graduate Fellowship (DOE CSGF) alumnus seeks ways to access and analyze information on our home planet.
“The joke was I used to look up toward the cosmos. Now I look the other way around,” says Skillman, a fellow from 2009 to 2013.
Skillman is head of engineering at Descartes Labs, a Santa Fe company with origins in nearby Los Alamos National Laboratory (LANL). He joined in early 2015, a few months after the company launched.
In those early days, Skillman was “basically a firefighter. You’re putting out fires in the system. You’re trying to make things more fireproof. You’re also trying to not start any fires.” Since then, Descartes Labs has established itself as a platform for geospatial data from government and commercial sources – satellite imagery, weather observations and unconventional sensor information like maritime shipping transponders.
These data are spread across servers worldwide, with varying protocols to access and analyze them. Descartes Labs uses cloud computing to gather and preprocess the information and builds application program interfaces (APIs) for users to retrieve and analyze it. “It’s having access to all that data through the same API instead of hundreds of different APIs, latencies and bandwidths,” Skillman says.
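The "one API instead of hundreds" idea can be sketched as a thin dispatch layer that hides each provider's own protocol behind a single entry point. The source names and fetcher functions below are purely illustrative assumptions, not the actual Descartes Labs API.

```python
# Hypothetical sketch of a unified geospatial-data interface: one query()
# call dispatches to per-source fetchers, each of which would hide its
# provider's own protocol, latency and format. All names are illustrative.

def _fetch_satellite(region, window):
    # Placeholder: a real fetcher would call the imagery provider's API.
    return [{"source": "satellite", "region": region, "window": window}]

def _fetch_weather(region, window):
    # Placeholder: a real fetcher would call the weather provider's API.
    return [{"source": "weather", "region": region, "window": window}]

_FETCHERS = {
    "satellite": _fetch_satellite,
    "weather": _fetch_weather,
}

def query(source, region, window):
    """Single entry point covering every registered data source."""
    try:
        return _FETCHERS[source](region, window)
    except KeyError:
        raise ValueError(f"unknown source: {source!r}")
```

Registering a new modality then means adding one fetcher to the table; callers keep using the same `query` signature.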
For example, Skillman was principal investigator for a Geospatial Cloud Analytics (GCA) contract with the Defense Advanced Research Projects Agency. As the name suggests, the program explored how combining commercial cloud computing with public and private remote sensing data can enable new ways to identify global geographic trends. GCA teams used the Descartes Labs platform to access and study geospatial data, addressing such problems as assessing food security and tracking hydraulic fracturing – fracking – for hydrocarbon extraction.
“One of the main enabling technologies is having access to all of the different data sources and modalities,” Skillman says, such as Automatic Identification System (AIS) information. AIS transponders report positions for maritime ships, helping avoid collisions, but “also can be used as ways to track nefarious activities,” such as vessels going dark as they enter protected fishing areas.
“With the right tools, you can start to identify patterns and develop predictive models,” Skillman says. “You might be able to combine that with other data sources, perhaps satellite photos of ship locations.”
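One simple pattern of the kind Skillman describes is detecting a vessel "going dark": AIS transponders normally report every few minutes, so an unusually long silence stands out. The sketch below, with an assumed one-hour threshold and made-up timestamps, flags gaps between consecutive reports; it is a minimal illustration, not a production detector.

```python
from datetime import datetime, timedelta

# Hypothetical sketch: flag stretches where a vessel's AIS reports stop
# for longer than a threshold. Threshold and sample data are assumptions.
GAP_THRESHOLD = timedelta(hours=1)

def dark_periods(report_times, threshold=GAP_THRESHOLD):
    """Return (start, end) pairs where consecutive AIS reports are
    separated by more than `threshold`."""
    times = sorted(report_times)
    gaps = []
    for earlier, later in zip(times, times[1:]):
        if later - earlier > threshold:
            gaps.append((earlier, later))
    return gaps

# Example: four pings with a 4-hour-20-minute silent stretch.
pings = [
    datetime(2020, 5, 1, 8, 0),
    datetime(2020, 5, 1, 8, 10),
    datetime(2020, 5, 1, 12, 30),
    datetime(2020, 5, 1, 12, 40),
]
```

Here `dark_periods(pings)` returns the single gap from 8:10 to 12:30; cross-referencing such gaps with satellite imagery of ship positions is the kind of multi-source combination the quote describes.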
Meanwhile, researchers also are creating frameworks that make working with geospatial data easier for subject-matter experts. “They may not know how to pull in satellite data and work with that to extract useful signals,” Skillman adds, but have knowledge that directs their searches. “Can we build an abstraction that sits on top of our core APIs that lets them access those data more seamlessly and simply?”
The goal: cut the time it takes to develop a hypothesis, access and analyze data, and verify or discard an idea.
Skillman is familiar with the discovery cycle. As a University of Colorado Boulder Ph.D. student and a postdoctoral researcher at Stanford University’s Kavli Institute for Particle Astrophysics and Cosmology, he simulated galaxy cluster formation and interaction on supercomputers. Each simulation produced huge data sets – similar in size to those Descartes Labs handles – that required new tools to analyze and visualize.
At Stanford, Skillman collaborated with LANL scientist Michael Warren, who had mentored him during a Science Undergraduate Laboratory Internship. Warren left the lab to co-found Descartes Labs and invited Skillman to meet the team.
Although Skillman thought he might work in industry someday, he didn’t expect the opportunity to arise so soon. The idea of collaborating with a respected friend and joining a company focused on data and physical phenomena was appealing. “I didn’t see that opportunity coming around any time soon after that, so I decided to go for it.”
His job gives Skillman the chance to hire and work with smart, engaging scientists and software engineers. It also lets him delve into problems that combine large-scale data and computing with visualization and analysis – an intersection he calls his “happy place.”
For example, soon after joining the company he and his colleagues ran a program that used 16 hours on 30,000 cloud-computing cores to take in and preprocess a petabyte of satellite data gathered over a 30-year span. As the number and variety of remote sensing satellites grow, the industry and research community must be ready to process as much as a petabyte every day.
“Thinking about data at that scale is an incredible task,” Skillman says. How to help analyze them is “an interesting and challenging problem. I kind of like those.”
This article also appears in the 2020-21 issue of DEIXIS, The DOE CSGF Annual.