Richie Cotton, resident Data Evangelist at DataCamp, recently interviewed Justin Fletcher, advisor and subject matter expert on the enterprise adoption of deep learning technologies, The Air Force Space Command, Space and Missile Center, Special Projects Directorate, and Captain in the United States Air Force Reserves.
Introducing Justin Fletcher
Richie Cotton: Hi everyone. Welcome to DataFramed. It's World Space Week, so today we're talking about using data in space research. I'm especially excited about today's episode because I've been interested in space since I was a kid. My grandma used to live near the Jodrell Bank Radio Telescope in the UK, and when I went to visit her, going to Jodrell Bank was my favorite day out. Space is, of course, literally a huge topic to cover, so rather than trying to go over all of it, in this episode we're focusing on the work of the Space Systems Command at the US Space Force in using deep learning on telescope data to monitor satellites.
My guest is Justin Fletcher, an artificial intelligence and autonomy subject matter expert at Space Systems Command. He has something of a tricky job, since it requires expertise in physics, computer science, and military applications, on top of the deep learning skills. How he manages to juggle these competing areas is unclear, so let's interview him to find out. Hi there, Justin. Thanks for joining us today. To begin with, I'd just like to find out a little bit about what you do. So you're working for Odyssey Systems. Can you just talk a little bit about what Odyssey Systems does?
Justin Fletcher: Certainly. So Odyssey Systems Consulting is a business. We do advisory and assistance services on behalf of the US government. I work on a contract called SDA, which stands for Space Domain Awareness: SDA Support Systems. We work on the government's behalf to represent their interest in managing it. It's a large portfolio: we manage large-scale acquisition activities all the way through to research and development. We do a lot of study work and a lot of technical management. So we're helping and advising the government as they build out advanced technology portfolios for, in our case, what's called space domain awareness, which is concerned primarily with what's happening in the space environment.
Richie Cotton: And by the US government, which particular government departments are you working with?
Justin Fletcher: For the contract that I referred to before, that's with Space Systems Command, SSC, which is part of the United States Space Force. It's one of the field commands within the United States Space Force.
Richie Cotton: All military stuff. Brilliant. So tell me just a little bit about what you do in your role.
Justin Fletcher: So technically I'm a subject matter expert. That's what my job title is, but my role is to lead a small but focused multidisciplinary team. We have several astronomers, some computer vision specialists, a few data scientists, and many software engineers, to try to advance the state of the art in space domain awareness, primarily through the application of artificial intelligence and autonomy.
Those are our two major technology focus areas, and so we try to move the state of the art forward in space domain awareness. What that looks like is a lot of applied computer vision work for scientific imagery. We also automate telescopes: closed-loop autonomy for telescope control, so that they're reactive to stuff that's happening in space.
That's sort of my overall area. We have a sort of direct team of six, and then we have an extended team, including performance contractors on behalf of the government of about 40. And I'm responsible for that group.
Team Structure for Space Data Scientists
Richie Cotton: All right. Brilliant. So it's a pretty substantial team. And you mentioned that in addition to the data scientists there are actual scientists as well. So how does that relationship work? How is this team structured?
Justin Fletcher: Yeah, so that's been one of the most interesting components of this job over the past few years. We're out here in Maui, and a lot of our team comes from the observatory at the summit, so we have this really interesting interplay between them and people like me. I came in as a computer scientist having absolutely no background in space. At the time when I moved here, I was a military officer in the Air Force; I moved out here as a computer scientist and was introduced to space. One of the things that we find really interesting is the cross-pollination of technical ideas.
This is almost cliche at this point in the maturity of data science as a field, but domain knowledge, we found, is the essential ingredient in applying these techniques to advance the state of the art in new domains. You can't just take a ResNet and apply it to an image classification problem for FITS. It requires a variety of detailed transformations of that data. And that interplay between the technical specialists in that domain (for us, a lot of this is astronomers and optical physicists and things like that) and what are more traditional computer science or data science roles has been very interesting.
What's been really compelling is to watch those people grow towards one another. So what we see is a lot of computer scientists picking up instrumental astronomy skills and a lot of astronomers picking up software engineering and data science skills, building and annotating data sets, things like that. So that's been really interesting.
Top Skills for a Space Data Scientist
Richie Cotton: So I was fascinated by the kind of domain knowledge that you need to do these things. Can you give me some examples of the sort of skills? You mentioned optical astronomy and things like that. So what does that actually mean? What skills do you need there?
Justin Fletcher: Yeah. I mean, so for astronomers, a lot of those people have PhDs in astronomy. This is instrumental astronomy. We have some people who specialized in their PhDs in solar physics. We have a gravitational physicist on our team. He did his entire PhD in gravitational physics and then switched skills. And what's really interesting is that, for the many we have from the astronomy and optical physics community, those skills tend to be very research-oriented and very focused on the production of physical hardware in the real world. That has a really interesting synergy with what we tend to think of as entirely software-oriented skills. When we think about data science and computer science, we tend to think mostly about the production of software to do things that we want to do.
Well, in our domain, we really don't have the luxury of just pretending that the data is just gonna come in from somewhere and then we'll figure out how to clean it and get it ready for processing to train models. It doesn't work that way. We have to think about the physics of the instrument all the way through the trained model. And so we see that a lot of the skills that are needed for that are instrumental astronomy, building the actual instruments. There are analogies in people building advanced camera concepts. Really cool stuff happening in MER optics right now with deep learning. And so it's really about having a broad base of skills that is relevant to the domain you're trying to apply these data science techniques to.
And then of course everybody's gotta know Python. We use primarily TensorFlow, but the team's moving towards PyTorch now as well. You've gotta build those skills as well, so that's the skill base for us.
Richie Cotton: So you've got these hardware challenges, you've got challenges with physics, and you've got the data science and machine learning challenges as well. I can see why a lot of people have PhDs on your team then. So let's go a little bit more into these data science and machine learning skills. You said you're doing a lot of work with TensorFlow and PyTorch. Can you talk a little bit about what you're doing with them? I presume it is a lot of deep learning.
Justin Fletcher: Most of our applied work, especially in computer vision, is of course deep learning. We really run the whole gamut. So we have a variety of classification problems. Everything I'm gonna talk about today is in the public literature. If you go to Google Scholar and look us up, you'll find all this stuff.
No defense secrets today. That's right. Everything here you can find on Google Scholar. So I'll give you a few examples of the kinds of things that we're building. One is the problem of classifying an object, identifying an object. To be really concrete about what I'm talking about: saying that the thing I'm pointing at right now is this specific satellite, from a spectral sample. So you point a big telescope at the thing. You use a spectrograph to split up the light. You get this sort of two-dimensional, blurry image that doesn't really look like anything, and then you apply a convolutional neural network with a classification head to classify that as the object's identity.
So the classes correspond to identities. There are not that many, so it's pretty easy to formulate the problem that way. So it's an example of a traditional classification problem, but it's got a twist, because you have to actually care about the physics of that imaging process to make that model work.
So we've got classification. One of our big, most widely proliferated families of models is called SatNet. There's overlap with that name now; there are other things called SatNet. But we have one called SatNet that does deep space object detection. This is basically rate-track imagery of satellites out in GEO or beyond.
If you point a telescope at them, they have a sort of distinct signature on the focal plane of the telescope, and our objective is to detect those. When we first started, it was Faster R-CNN, then we moved on to YOLOv3 for a little while, and now our most recent state of the art is deformable. So we have a variety of models that we apply to these problems. That one is really interesting too, because we have to host and serve inferences for this thing at scale and also on near-edge devices where connectivity might be really poor. So in addition to the model development work, we also have to do a lot of MLOps stuff: we have to retrain, we have to do domain adaptation. And of course, as with everything in defense, we have the additional twist that sometimes we work with classified data.
I of course can't talk about any of that today, but what I can say is that it presents a deployment challenge that is really unique to our domain. So the kinds of techniques that we use are traditional computer vision. We actually have a really fun paper coming out soon, and there's already some work published on this, using reinforcement learning to basically, think about this as controlling the gain of an emerging kind of neuromorphic sensing modality. So we have some forays into deep reinforcement learning.
Yeah, that pretty much covers our deep learning portfolio. And then on the autonomy side, which we haven't really talked a lot about, we have a whole portfolio dedicated to multi-agent, globally distributed autonomy for, in that case, telescope control. So that's a brief overview of the domain.
Classification Problem Explained
Richie Cotton: Okay. Wow. So there's a lot to go over there, and I'd love to get into some of these in more detail. I hope I'm not butchering this too much, but it sounds like a lot of what you do is: a telescope takes a photo of the sky, and then it's, is that little speck there a satellite or a star or just a speck of dirt on the lens? This is where the classification comes in. Is that sort of the right angle?
Justin Fletcher: No, that's more or less right. We have these, we call them ground-based optical systems, or telescopes. A variety of sizes, a variety of costs. They go from 50K all the way up to several tens of millions of dollars, and they point at the night sky. They do have different instruments, so it doesn't always look like dots and streaks, and you control them in different ways. But yeah, that description is more or less correct. We point these big ground-based optical systems at stuff in the night sky, and then we get the data back, which comes in as scientific imagery.
It's very analogous, actually, to medical imagery. If you've looked at the computer vision literature for medical imagery, they look very similar in terms of how the data is actually presented. And then we train models to do information extraction tasks from that data. Yeah, that's pretty much it.
Goals of a Space Domain Awareness Team
Richie Cotton: So there are a lot of exciting things you're working on. But before we get into the detail of like specific challenges, I'd like to take a step back and just think about what the goals are of your team. So you mentioned that the field is called space domain awareness, and I'd like to know a bit more about what does that mean and what is it you're trying to achieve?
Justin Fletcher: So our objective in space domain awareness is to discover what is present in space and to track those objects across time. That's a really reductive definition, because there are all kinds of additional dimensions to this problem. You might wanna characterize objects and think about whether they're changing across time in some way that you can't directly observe. But the general objective of the field is to keep track of what is happening in space in order to enable activity there.
For example, spacefaring nations have to send people into space. You have to keep those people safe. Commercial entities want to do business in space. You have to ensure that those satellites can operate in that domain, or at least provide the information to the world about what is happening.
And so that all falls under the very broad application domain of space domain awareness. So that's our overarching goal: to provide the US government with knowledge about what is occurring in space.
Planetary Defence: Is that real?
Richie Cotton: I've watched too many Bruce Willis movies, so I imagine that, oh yeah, there's this new speck there and we've got an asteroid hurtling towards the Earth. But it sounds like it's more about checking the integrity of existing satellites, just to make sure that they're still functioning. Is that sort of correct?
Justin Fletcher: Yeah, it's like that. So that Bruce Willis reference was apt. I've actually met some people from NASA Planetary Defense. There is a group of people, and they have the coolest job title, as far as I've determined, in the United States government.
They do planetary defense stuff, you know, tracking asteroids and stuff like that. We actually use asteroids occasionally as reference targets when we're imaging things, to try to understand the science of our data exploitation techniques. That happens occasionally. But yeah, we're mostly focused on what are called manmade resident space objects, sometimes called RSOs; that's the acronym for them.
And these are, you know, satellites, space debris, used rocket bodies that aren't really doing anything up there. So when I was talking about characterization before, that's a little bit of jargon for what you described, like health and status: how's the satellite doing?
Is it still up and running? If we're no longer hearing from it, is it still stable? Is it dangerous? Has it exploded? Things like that. So that's the primary focus of space domain awareness. But we also have to care about debris. Lethal debris can destroy entire satellites. We have debris clouds that can make entire regions of the sky inoperable.
And one dimension we haven't talked about yet is that satellite owner-operators have to have this SDA, this space domain awareness knowledge, in order to decide what to do with their satellite. The most critical thing we do is what's called conjunction analysis. This is where we make sure two satellites don't collide: "Oh, it looks like these two might come into contact with one another.
You two should probably do something to make sure you don't," because nobody wants to lose a satellite. And then debris warning. So those are really important functions that the data we exploit informs.
Richie Cotton: I mean, you sort of think of space as being kind of big, and so the chance of collisions is fairly low, but I guess there's a lot of things in space now. And so maybe the chance of collision is bigger than you imagine.
Justin Fletcher: Yeah, it's a little counterintuitive, because you would assume that there's plenty of space in space and no particular probability that things collide. Well, it turns out that we only know the orbital parameters with a certain level of accuracy, which is actually a function of how well we exploit the very data I was talking about before, because those detections inform our ability to do what's called correlation of the orbits, to get the orbital parameters and predict where the objects are gonna go in the future.
There's a lot of uncertainty in that, especially for small objects that are subject to things like solar radiation pressure effects. And so while it's true that there's not a lot of stuff and there's plenty of space, the problem is that because you don't know exactly where it's going, there's a sort of cone of uncertainty around where an object might be in the future.
You have to take that into consideration and do risk reduction, potentially maneuvering your satellite on the off chance that it will hit you. And so if you think about these objects existing in probability space, they spread out across time, right?
So they're really small and there's a lot of space, except that because we don't know where they're going, they are, in effect, larger, which is a really interesting way to think about the problem. That's why we do computer vision for these problems: if you have better information, then those objects become, in effect, smaller, which is interesting, I think.
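As a rough illustration of the "in effect, larger" idea described above, the sketch below treats an object's keep-out radius as its physical radius plus a few standard deviations of position error, with the error growing over the prediction horizon. The linear growth model, the function name, and every number here are assumptions made for illustration only; real orbit propagation and conjunction analysis are considerably more involved.

```python
def effective_radius_m(physical_radius_m, sigma0_m, growth_m_per_hr,
                       hours_ahead, k=3.0):
    """Keep-out radius: physical size plus k standard deviations of
    position error. Linear error growth is an assumed toy model."""
    sigma_m = sigma0_m + growth_m_per_hr * hours_ahead
    return physical_radius_m + k * sigma_m


# Hypothetical numbers: the same half-metre object, predicted 24 hours
# ahead, under coarse and under better tracking.
coarse = effective_radius_m(0.5, sigma0_m=500.0, growth_m_per_hr=100.0,
                            hours_ahead=24)
better = effective_radius_m(0.5, sigma0_m=50.0, growth_m_per_hr=10.0,
                            hours_ahead=24)

print(f"coarse tracking: {coarse:.1f} m")
print(f"better tracking: {better:.1f} m")
```

Under these made-up inputs the physical size barely matters; almost all of the keep-out radius comes from the uncertainty term, which is the quantity that better detections shrink.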
Richie Cotton: Yeah, absolutely. So maybe let's get back to the data you're working with. You said you work with a lot of image data, so how big are the data sets we're talking about?
Justin Fletcher: Sure. We have a lot of problem domains, so imagine it like this: each problem domain has a different, think about it as a benchmark data set that we build, and it varies from problem to problem and from level of maturity to level of maturity. So we have R&D projects that have 500 frames, where we're just getting started and we've got maybe a few gigabytes of data.
Richie Cotton: So that's kind of small data problems rather than big data problems in that case.
Justin Fletcher: In that case, yeah. And that's R&D, right? So we're just seeing if we can build a model that does the information transformation we care about.
Can we just map from the data shape to the data shape and not worry so much about performance? Let's just see if we can build something that works end to end. That's on the small end. And then on the large end, we have one data set, the one for that detection problem I was talking about before, that now has well over a million annotated images in it, and these are fairly large images. They're 16-bit images, usually 512 by 512 to 1024 by 1024. They're very large format images. And that data, I have to be honest with you, I don't even know how large it is. It's probably several hundred gigabytes at this point, and that's across multiple different cameras. It matters a lot what camera the data comes from in our domain.
The images are not homogeneous like iPhone camera images or natural imagery tend to be. So it matters a lot what the camera is. We have those from a variety of different telescopes all over the world, at different altitudes and with different quality cameras.
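A quick back-of-the-envelope check of the figures quoted above can be sketched as follows. It assumes exactly one million single-channel, uncompressed 16-bit frames with no file headers or annotations, which are simplifications; the function name is made up for this example.

```python
def raw_dataset_gb(num_images, side_px, bits_per_pixel=16):
    """Raw pixel storage in gigabytes (10**9 bytes) for square,
    single-channel, uncompressed frames."""
    bytes_per_image = side_px * side_px * bits_per_pixel // 8
    return num_images * bytes_per_image / 1e9


low = raw_dataset_gb(1_000_000, 512)    # every frame 512 x 512
high = raw_dataset_gb(1_000_000, 1024)  # every frame 1024 x 1024

print(f"all 512 x 512:   about {low:.0f} GB")
print(f"all 1024 x 1024: about {high:.0f} GB")
```

Under those assumptions the raw pixels alone come to roughly half a terabyte if every frame is 512 by 512, and about four times that if every frame is 1024 by 1024, so "several hundred gigabytes" is the right order of magnitude for a set dominated by the smaller format.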
Richie Cotton: It sounds like you've got lots of different teams working on different problems then, and so this sort of strikes me as a slightly academic environment. Is that sort of correct? Is there an academic feel?
Justin Fletcher: Sometimes. One of the things that we really pride ourselves on in how we have constructed this team is that we do full spectrum development. We go all the way from low technology readiness level things, and that is very much an academic environment.
So this is not quite basic research; we're not doing proof-of-principle stuff, but really early applied research. These things tend to terminate in a peer-reviewed publication. That tends to be the target success criterion for those things. And we go all the way through to the fielding of operational solutions.
So think about deploying a model to live ops to inform space domain awareness decision makers about problems in space. We do that whole thing end to end, and the level of academic feel, if you will, varies across the portfolio depending on what you're working on. One researcher might (we've had this happen) take a research project all the way from effectively ideation.
So no basic research, because people already invented convnets and people already invented spectral imagery, but all the way from ideation through a fielded concept operating on an operational telescope for the Space Force. So it depends on which part of the life cycle maturity you're on and what you're working on that day.
A Career in Space Data Research
Richie Cotton: I know a lot of people listening are trying to work out whether they can get a job in data. So first of all, how do you get into your team? And then I'm curious, more generally, how do people go about becoming space data researchers?
Justin Fletcher: Sure. So there are two answers in two different directions. In my case, I really had some incredible opportunities. I was an active officer in the Air Force for many years, and I moved to space domain awareness as my third assignment. So I came out to Maui, not a bad gig. As far as assignments go, there are definitely worse places to be assigned. I came out to Maui and I was the program manager for an autonomy program.
I had some really incredible opportunities with the Air Force Research Lab there. So that's how I got to this location. My background is in computer science. I was not a space person initially, so I have sort of on-the-job trained into that domain. As for the more general answer, you know, what paths can I in practice follow to work in this kind of space?
There's a variety. It depends on where you're physically located, though less so than it used to, since we live in a remote-work world now, and on what stage of your career you're at. A great thing to do is to look at job reqs for defense contractors working in this space. That's a really accessible way to approach working in this area.
Every contractor on my team is hiring. So a great way to get into the field is to go and look for job reqs in defense. They're all over the internet. Look at the major defense primes. I won't endorse any of them in particular, but go and look at the major defense primes and look for their space-related job reqs.
That's a good way to get into the field. If you are a PhD student today, if that's where you are in your journey, you could look at potentially doing a fellowship with the Air Force Research Laboratory. There's a great scholars program; every year we host several scholars out at our group.
These are usually PhD students who are interested in potentially going into this field. We have hired several people out of that program. Some have gone on to be civil servants in the Air Force Research Laboratory. So there are a lot of paths. It really depends on where you are and what your skillset is.
If you, for example, are a budding web developer today, and you are interested in how you can apply your skills to problems in this domain, you could look at one of the many software factories around the Department of Defense. Just Google "defense software factories" and a dozen will come up. If you wanna work in space, there's one in Colorado Springs.
So there's just a variety of ways to get in, and we're hiring. The business looks good for the foreseeable future. There are a lot of problems we have to solve in space. And I gotta tell you, as far as reward for the work goes: a lot of places are hiring right now. You can go get a job in Silicon Valley.
There are places that will hire people. But what I think working in the Department of Defense offers is really compelling problems and huge, wide-open frontiers. These application areas have not been optimized over decades to maximize click-through or something, right?
Like these are new problem domains and you can come and homestead here if you want to. It's a really, really inviting environment. So, my contact information I think will be associated with this. I encourage people to reach out to me.
Data Science Skills Used
Richie Cotton: Wonderful. Yeah, so it sounds like business is booming. You said you came through a computer science background into the military, and then the space stuff came later, but it's also possible to go in other directions.
So maybe you come from space science first and then get the military application later. For people who have a data science background, it sounds like Python and then some deep learning skills are the way forward. Is that true of everyone in your team, or are there other kinds of data science skills that people use?
Justin Fletcher: Yeah. The primary skills that you need are infrastructure as well as Python and at least one deep learning framework. So we're, we're moving towards framework agnosticity. That's not as easy as it sounds in plain English. It takes a lot of work to support multiple deployment frameworks, but for the most part, we're not gonna be picky about what kind of framework you bring to the table.
I personally am an old-school TensorFlow graph mode kind of guy. So if you're TensorFlow graph mode, that's fine. If you are PyTorch, that's fine. Obviously Python is the lingua franca; you've gotta speak Python. So that's the requirement. And then there's what's really expected at this point in technical maturity if you're gonna be working on one of these teams.
Unless you are literally in a pure research role, and that's for people like graduate students coming into the program, you have to speak containerization. You have to know how to construct a container image and produce them at runtime.
You've gotta know at least the basics of Docker and Kubernetes, Helm, that technology. You also need to be generally aware of data-intensive application systems. So you have to know about databases and stuff. No need to be an expert in it, but you've gotta know how to write a SQL query and stuff like that.
Transitioning from TensorFlow to PyTorch
Richie Cotton: So I have to say, I love the fact that you described yourself as being old-school TensorFlow, since it's a fairly recent technology still. But you mentioned how your team is transitioning from TensorFlow to PyTorch. Can you tell me a bit about why you decided to do that?
Justin Fletcher: Well, to be honest, it was a grassroots movement, so it wasn't really something where I decided and told the team, "Yeah, we're gonna start moving to PyTorch." Remember before, I talked about the spectrum of technical maturity? It really started on the low-maturity end of that spectrum. People were doing research projects and wanted to prototype quickly, because it's much more approachable.
The data management is a little bit easier if you don't have to have too rigorous or high-performance data pipelines, so data set generator pipelines and stuff like that. One of our researchers, that same PhD in gravitational physics I was talking about before, was new to the domain. Very competent developer, though, and he just said, "Listen, it's gonna take me a lot of work to do this in TensorFlow.
Let me just do it in PyTorch. Is that all right?" And sure, yeah, go for it. And it worked out fine. Nothing bad happened, right? So it's sort of a grassroots movement. There's another couple of developers who have moved over as well, and we don't really see any need to enforce a particular framework. It doesn't seem to be that important.
The only thing that we have to enforce is the interfaces to inference-time models. So whether you build in PyTorch or you build in TensorFlow, you have to present a standard API at runtime, and it preferably needs to be externally legible. We like those represented with OpenAPI so that we have the ability to interface with those models at runtime.
We find that we are very rarely inference-time performance constrained, so we don't really even care that much about inference-time performance. That's just almost never the problem. Our images take a long time to take. It can take 30 seconds to take an image sometimes, and so that's just not the problem.
The problem is the software engineering dimension of fielding and deploying and sustaining and retraining, you know, a model.
Richie Cotton: So this is when other people try and use your model for predictions. That's when the standardization matters, not when you're actually trying to figure out what should the model contain when you're training it. That sort of thing?
Justin Fletcher: Precisely. Yeah. It doesn't matter that much at training time. It matters a lot when you're running inference models.
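The idea of a standard runtime interface that is indifferent to the training framework can be sketched in Python with a structural type. Everything below is hypothetical and for illustration only: the `InferenceModel` protocol, the `predict` method name, the detection dictionary layout, and the toy `ThresholdDetector` are assumptions, not the team's actual API, and a real deployment would put a documented HTTP interface in front of the model.

```python
from typing import Dict, List, Protocol, Sequence

# A frame is a 2-D grid of 16-bit pixel values (0..65535).
Image = Sequence[Sequence[int]]


class InferenceModel(Protocol):
    """Hypothetical runtime contract: any model, whatever framework it
    was trained in, exposes predict(image) -> list of detections."""

    def predict(self, image: Image) -> List[Dict[str, float]]:
        ...


class ThresholdDetector:
    """Toy stand-in for a trained detector, written without any deep
    learning framework. It flags pixels at or above a brightness level."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold

    def predict(self, image: Image) -> List[Dict[str, float]]:
        detections = []
        for y, row in enumerate(image):
            for x, value in enumerate(row):
                if value >= self.threshold:
                    detections.append(
                        {"x": float(x), "y": float(y), "score": value / 65535}
                    )
        return detections


def serve(model: InferenceModel, image: Image) -> List[Dict[str, float]]:
    """The serving side depends only on the contract, not the framework."""
    return model.predict(image)


frame = [[0, 0, 0], [0, 60000, 0], [0, 0, 0]]
result = serve(ThresholdDetector(threshold=50000), frame)
print(result)
```

A wrapper around a TensorFlow or PyTorch model could satisfy the same contract by implementing `predict`, which is why the choice of framework matters little to whoever consumes the predictions.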
Richie Cotton: All right, so we talked a little bit about the technical skills. Now I imagine since you've got some pretty serious deep learning going on, you've got the physics, you've got all the other kind of space knowledge and things like that, that it must be fairly challenging to try and communicate what you're doing in your results to other teams, particularly to non-technical people. So how do you see like communication with the outside?
Justin Fletcher: Yeah, so I used to spend a lot of my days building stuff. Now I spend most of my days talking about stuff, and so I'm at the sort of pointy end of the spear for this communication problem. And perhaps an unpopular opinion here, certainly in the Department of Defense an unpopular opinion: when I'm talking to non-technical audiences, even general officers, SESs (these are like senior civilian executives), et cetera, I give the same talk that I do to technical audiences. Exactly the same talk. One of my principles of communication is I never talk about artificial intelligence in the abstract because people will always fill in the gaps with what they think that means, and that is an enormous diversity of things.
So I give exactly the same deck of charts, whether I'm talking to a four-star general or I'm talking to a new developer on our team to introduce the work. So for me, communicating about these problems is really about specificity and relevance. So you have to be specific enough that people know what you're talking about.
You have to put it in terms that are physically comprehensible. I am detecting an object; when I detect that object, what I mean is that it is at this location relative to the celestial background, right? Like, you have to be able to communicate those things, and you also have to be able to represent that in a way that is comprehensible to them. I do not believe in talking about these things in the abstract because it becomes very challenging to do that. So I give exactly the same deck with the same papers cited at the bottom, same nerdy papers cited at the bottom, to every audience. And that has worked fine for years.
So far nobody's made me stop. So I guess, you know, whether or not that's a good idea is immaterial; that's what I'm doing. So yeah, it seems to work out.
Use Cases of Specific Vs Abstract
Richie Cotton: That's kind of interesting. I really like that idea of being specific about what you're talking about. Can you give maybe some examples of how that talking in the abstract versus talking specifically is gonna work?
Justin Fletcher: Sure. So if I say, Hey, I've got an artificial intelligence model, which at that point I've already been a little bit, but if I, but if I were to say abstractly, I've got an artificial intelligence model and it's gonna do satellite positive ID for you. So what I've done there is I've used like three different abstractions in that, in that very general sentence, about three different things all at the same time.
And any given listener might fill in those gaps with three different levels of interpretation. Right? And so that's not really a helpful way to formulate the problem, even though in general the advice is to make sure it's relatable to people. But if instead I say, bear with me, I'm gonna take a minute to talk about this, I only need a minute of your time.
If instead I say: we collect images from a large telescope. We take the light, it goes into the telescope, we break it up into its constituent wavelength components, and that produces this blurry image that kind of tells you what the relative contributions of different wavelengths are in the image from this object that we took.
And we know that different materials absorb and reflect different wavelengths differently. So we hypothesize that it should be possible to tell what the object is based on its material composition, which you can roughly infer from the image, but we don't actually know how to write down a set of rules to do that.
So we learn that process via a large parameterized model. If I say that, I've so far found no one who can't understand that. Right? Like, they might not know what a large parameterized model means, but they don't really care that much, right? Because I've given them something concrete enough that they can fit all of that in their brain at the same time.
I think that sometimes, especially for senior leaders, we expect a little bit too little of them. We think, oh, you gotta give 'em, you know, crayon-level diagrams. That's not true for the most part. As long as you're brief but specific, they will understand what you mean and they will walk away from that with a very specific understanding of what it is that we can do and what it is that we can't do.
And where I've seen communication in the abstract about artificial intelligence fail, multiple times in different organizations that don't have anything to do with one another, is when senior leaders make the assumption that a technology is much more mature than it is, and then they go and make investment decisions based on that misunderstanding of maturity.
So that's why I'm almost fanatical about talking about things in concrete terms, making sure people understand what the limitations are before .
Publishing Research in the Army
Richie Cotton: Well, certainly before they start spending any money, that seems very helpful. So just while we're talking about communication, one thing you've mentioned a few times is that your team publishes a lot of scientific papers in peer-reviewed journals, and I guess my perception of the military is that in general it's quite secretive, so it surprised me a little bit that you're actually publishing a lot of your results. Can you tell me a bit about how that came about?
Justin Fletcher: Certainly. So that was a deliberate decision. And by the way, the military does actually publish a lot. If you look at, for example, the Air Force Office of Scientific Research, right? AFOSR, it's called. They fund grants all around the country. Some of the most pioneering work to this day in sequential decision making under uncertainty, an area I study personally: you go to the bottom of those papers, you're almost always gonna find an AFOSR grant number on those things. So it just sort of depends. The public, in general, when they see that acronym at the bottom, they don't necessarily know that that means, oh, that was the military that funded that grant.
But the Department of the Navy, of course, has a similar program, as does the Army. There's actually a lot of defense publications out there. But you're right, it's somewhat unusual to see the level of publication that we do from an organization like where we are, right? So in Space Systems Command at this particular location. This tends to be a lab-oriented thing or an Office of Scientific Research-oriented thing, as a general rule.
It was a deliberate decision. So the first thing that we realized is we're working in emerging technology. We are not doing deep applied technology that constitutes trade secrets. We are taking things in the public literature and we are applying them to publicly available data, or data that we can easily make publicly available, that's not sensitive, right?
What we do is we take those two bases of justification for being able to share this information and we put them together and say: we're using publicly available techniques, ConvNets are not secrets, right? And we are using data that is not intrinsically sensitive. Putting them together, it should be okay to publish this stuff.
The reason that we do that, and, you know, it is somewhat onerous to go through the publication process, and you gotta be careful not to over-publish, it can become all you do. The reason that we make that choice is because we are really trying to incentivize a broad and self-sustaining base of researchers in this applied technology area.
So we want people doing their PhD studies on this work. And the reason for that is in part selfish, right? Our job is to act in the best interest of the government, right? And so it's in the government's best interest if a bunch of PhD students dedicate some of the most productive years of their lives to solving some of the hardest problems on behalf of the nation. That is in the government's interest. And then those students can go on and get jobs in this domain, and it's a virtuous cycle. So that's part of it: trying to, frankly, generate work for free so we don't have to enter into a contractual relationship to make that happen.
That'll just happen naturally. However, there is also the dimension of it is very easy to fool yourself in this domain. It is really easy to fool yourself when you're doing applied data science. When you're training deep learning models, especially in domains where not a lot of people are working, it is very easy to fool yourself.
And so that necessary step of peer review, in particular, cause we don't just publish in artificial intelligence forums, right? That peer review in particular from the astronomy community and then the defense computer vision community, those peer review opportunities that come from publishing in those areas are things we actually can't get inside the Department of Defense.
There's really no one who could do that for us, which is sort of a paradox of being out towards the leading edge. I think it's gonna happen in all kinds of domains all over the world, for different industries and different disciplines: if you're out at the cutting edge, there's not a lot of people who can peer review you.
We were very worried about fooling ourselves in the early part of this process, and so that's what took us down this path. Those were the two reasons we started down the path. And then what we discovered after we did that is it makes it possible to do things like I'm doing right now. So this is kind of hard to do, going on a podcast and talking in public, if you don't have a track record of publication.
But because we do, it makes it possible for us to go and communicate about our problems. And I think especially given that we have a brand new branch of service here in the United States Space Force, and frankly it's difficult for the public to comprehend exactly what the underlying technical problems that Space Force has to solve are, cuz people are busy and not everybody has time to go and study this stuff.
Being able to talk about this in public has its own dimension of value to it. So that has sustained us now that we've got the cycle going, and once you get started, it's easier to keep going than it is to not. So if you're out there and you're working in the Department of Defense in applied technology, my advice is publish.
Examples of Collaborations
Richie Cotton: All right. Brilliant. Yeah, that's a good sales pitch for publishing your work. It sounds like because of this, your team's actually very collaborative with other organizations, certainly around the US, maybe around the world. Are you able to talk about examples of where you've collaborated with other organizations?
Justin Fletcher: So first of all, we fund PhD students sometimes, and that has produced some really valuable collaborations with labs around the country. So that tends to happen through contractual means. But at the end of the day, it ends up being one team. And so we funded some PhD students. We have one at Stanford who did some really amazing work in basically small satellite stuff. I'm not gonna go into the details right now. So that's one way that we've done partnerships. In the past, we've had a variety of academic institutions, US and international academic institutions, raise their hand through these sort of, they're like public space war games, is the way to describe it.
They're like places for everybody to get together and try out their stuff, right? We participated in those very heavily, and through those activities, one of the things we did one year was we just said, hey, could people just send us FITS? That's the image format that the telescopes take. Can you just send us your FITS and see what happens?
And we got like 700,000 images from that. Right? And there's five or six organizations involved there. So we collaborate with them, we'll annotate their data for them. We have a great annotation-as-a-service company that's a sub to one of our prime contractors. They're called Enabled Intelligence.
So they go out, they annotate our data for us. They can do multiple classification levels, and then we give that as a collaboration, right? We give that back to the people who gave us the FITS, so now they have training-quality annotated data in exchange for letting us use it to train our models. That's a collaboration model that works really well.
We are really just beginning to get into the allied partner collaboration game; this is an international community, basically. So coming up in November, there's another one of these things. We've got a partner from Chile that has raised their hand and wants to participate with us. We're gonna be providing autonomy for their telescope, I think.
And then another partner in Australia who has an emerging sensing concept that we're very interested in, and we'd like to participate with. So that's how we tend to do collaboration. But we've also had, you know, I talked before about funded PhD students. We've also had students at universities just see the work and say, oh, this is relevant and interesting to my research. Would you like to collaborate with me?
When those students reach out to us, we'll give them the data. We can give them most of the data. Some of it is sensitive, some of it is even classified. Obviously we can't give them that, but we give them all the data that we can. And if they've got data, we'll annotate it for 'em and give it back.
We'll train models and give them baseline performance. It's a really interesting collaborative community.
Richie Cotton: That seems pretty incredible that you are collaborating with a lot of organizations around the world. I just wanna pick up on something you said, that you provide an annotation service. I have images of, like, cadets just having to go through labeling telescope images with, like, oh, this dot's this particular star, and that's a satellite. Can you just tell me a bit about what annotation involves?
Justin Fletcher: Yeah, sure. So, I mean, there's a fun story here. The first SatNet dataset, the one I was talking about before, it wasn't annotated by a cadet, it was annotated by me. I annotated that dataset many years ago. But no, the annotation, we've matured a lot over the years. We have tight instrumentation on dollar per frame, dollar per second, all that stuff. We have a very mature annotation pipeline now. What it entails: I'll just give you one problem, since every problem's different.
I'll just talk about one. The most mature one is the SatNet problem. We have a company; one of our prime contractors has a sub relationship with them, a contractual relationship with them. And we have built a tool for them called SILT, the Space Image Labeling Tool, which has a nice pun on fertility as well, you know, all the data grows out of it. SILT was built by the prime contractor and has been fielded to them as their annotation tool.
Right. And so what it does, it brings these images up for them, and there's all kinds of domain-specific stuff you gotta do, right? When you look at these 16-bit images, if you just try to display them on a computer screen, they don't look like anything. So all those knobs are tuneable to the annotators.
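To make the 16-bit display point concrete, here is a minimal sketch of one common "knob": a percentile contrast stretch. This is an illustration only, not SILT's actual code; the function name `stretch_to_8bit`, the percentile defaults, and the synthetic frame are all assumptions.

```python
import numpy as np

def stretch_to_8bit(frame, low_pct=1.0, high_pct=99.5):
    """Linearly map the chosen percentile range of a 16-bit frame to 0..255.

    low_pct and high_pct are the hypothetical "knobs" an annotator would tune.
    """
    lo, hi = np.percentile(frame, [low_pct, high_pct])
    if hi <= lo:  # flat frame: avoid dividing by zero
        return np.zeros(frame.shape, dtype=np.uint8)
    scaled = (frame.astype(np.float64) - lo) / (hi - lo)
    return (np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)

# Synthetic 16-bit frame: dim sky background plus one faint point source.
rng = np.random.default_rng(0)
frame = rng.normal(1000, 10, size=(64, 64)).clip(0, 65535).astype(np.uint16)
frame[32, 32] = 1200

# Naive display keeps only the high byte, so the whole frame looks uniform.
naive = (frame // 256).astype(np.uint8)

# The stretch spends all 256 display levels on the range that holds signal.
display = stretch_to_8bit(frame)
```

In the naive version every pixel lands on one of two adjacent gray levels, which is why the raw frame "doesn't look like anything". After the stretch, the point source saturates to white against a visibly textured background.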
These annotators do this professionally, full time. All those knobs are tuneable and they can play back and forth through the images. Extremely important. They're not true videos cuz they might be taking minutes, but the objects don't move a lot, so you can basically track them if you watch across time, and you can even impute the location of objects with the eye. You can impute the location of objects if they're not visible in the frame. Sometimes they get too dim because the sun's not shining on 'em just right. They disappear, but you know they're there. So you can annotate that underlying source of truth. And what it looks like is they're going through and basically connecting these spatiotemporal lines to produce the location of the objects across time.
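The idea of imputing an object's position in frames where it fades out can be sketched with simple linear interpolation over time. This is an assumption-laden toy: the annotators described here do this by eye, and real tracks may need a proper motion model. The function name `impute_track` and the sample values are hypothetical.

```python
import numpy as np

def impute_track(times, xs, ys):
    """Fill NaN positions by linear interpolation over time.

    Returns the filled x and y arrays plus a mask of which frames were imputed.
    Note: np.interp holds the end values constant outside the observed range.
    """
    times = np.asarray(times, dtype=float)
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    seen = ~np.isnan(xs) & ~np.isnan(ys)
    x_filled = np.interp(times, times[seen], xs[seen])
    y_filled = np.interp(times, times[seen], ys[seen])
    return x_filled, y_filled, ~seen

# Five frames, 30 seconds apart; the object is too dim in frames 3 and 4.
times = [0, 30, 60, 90, 120]
xs = [100.0, 104.0, np.nan, np.nan, 116.0]
ys = [50.0, 49.0, np.nan, np.nan, 46.0]

x_f, y_f, imputed = impute_track(times, xs, ys)
```

The `imputed` mask matters in practice: a label that came from interpolation rather than a visible detection is worth flagging, so downstream training or evaluation can treat it differently.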
And then actually we just fielded a brand new feature. We have an excellent new developer who's just joined us recently, and so she has built an extension to SILT that makes it possible for us to also annotate the stars in the image. And this is a bit down in the weeds, but basically, if you know where enough stars are in an image, think about it: you're looking at the night sky, a little piece of it.
There's some satellites in there and a bunch of stars in the background. If you know where enough stars are, you can do what's called an astrometric fit. This is where you can determine where that telescope was pointing based on what it sees in the background of what it's imaging. This is essential for that orbit characterization stuff I was talking about before. You have to do it or the observation is useless.
We can now even go through and dynamically annotate those stars, and it'll run basically an astrometric fit engine in the background. And if it doesn't get one, it'll tell the annotator, nope, that didn't work, we need more from ya. They'll keep annotating stars until it satisfies that requirement, which is really powerful, because you have to both do the object detection and the star detection and fit successfully, with high precision and recall, in order to have a valid observation.
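A rough sketch of the "fit or ask for more stars" loop described above, under strong simplifying assumptions: the annotated stars are already matched to catalog positions, the sky is treated as a flat tangent plane, and the pixel-to-sky map is a plain affine transform solved by least squares. Real astrometric fitting is more involved; `fit_pointing`, the tolerance, and the sample coordinates are all hypothetical.

```python
import numpy as np

def fit_pointing(pixels, sky, tol=1e-3):
    """Least-squares affine map from pixel (x, y) to tangent-plane (xi, eta).

    Returns (coeffs, ok). coeffs is a 3x2 matrix; ok is False when there are
    too few stars or the RMS residual exceeds tol (same units as sky).
    """
    pixels = np.asarray(pixels, dtype=float)
    sky = np.asarray(sky, dtype=float)
    if len(pixels) < 3:  # an affine map needs at least three stars
        return None, False
    design = np.column_stack([pixels, np.ones(len(pixels))])
    coeffs, *_ = np.linalg.lstsq(design, sky, rcond=None)
    rms = np.sqrt(np.mean((design @ coeffs - sky) ** 2))
    return coeffs, bool(rms <= tol)

# Made-up ground-truth transform and five annotated stars.
true = np.array([[1e-3, 2e-4], [-2e-4, 1e-3], [0.5, -0.25]])
stars_px = np.array([[10, 20], [200, 40], [120, 180], [60, 220], [240, 240]],
                    dtype=float)
stars_sky = np.column_stack([stars_px, np.ones(5)]) @ true

coeffs, ok = fit_pointing(stars_px, stars_sky)           # enough good stars
_, ok_few = fit_pointing(stars_px[:2], stars_sky[:2])    # too few stars

bad_sky = stars_sky.copy()
bad_sky[2, 0] += 0.1                                     # one mismatched star
_, ok_bad = fit_pointing(stars_px, bad_sky)
```

Here `ok_few` and `ok_bad` come back `False`, which corresponds to the tool telling the annotator that the fit did not work and more (or better) star annotations are needed. Note that with exactly three stars an affine fit is always exact, so a residual check only becomes meaningful with four or more.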
It's not enough just to say that object is there. You have to say that object is there, and here is the location of the night sky that this telescope was looking at.
Interesting Projects worth Mentioning
Richie Cotton: Ah, okay. That kind of makes sense. Just knowing that there's a satellite somewhere in the sky is less helpful than knowing where it is relative to the rest of space, I guess. So it's really interesting hearing about the problems you've been working on. So can you tell us what's the project that you've been most proud of working on?
Justin Fletcher: So, I mean, it's hard to pick one. It's sort of like choosing your favorite amongst your kids, and whoever I choose, someone else will be hurt. There are a lot of projects that have been really fun. I think it's a tossup between SatNet, the problem I've talked about many times, the detection problem, cuz that one's really close to me because I started it personally. It was my nights and weekends project, right, before I just did management stuff all the time.
That one was really, really close. And also, in the early days of that program, we had a great collaborative environment. Some of my colleagues who were out here at the time, you know, it was very much a startup kind of atmosphere at the time. Nights and weekends on a shoestring, building out this framework for the DoD.
Anyway, that project is a favorite of mine, but it also has grown into a very mature, operationally capable product. But really, I have to say, the one that I've invested the most in, that I really haven't talked about almost at all today, and that I think has the most potential for disruption in the department, and I think is, sort of for that reason, the project of which I'm most proud, is this project called maa.
That's an acronym. I'm not gonna go through the torturous acronym right now. But basically this is a global autonomy program. It is designed to make it possible to get the right data from telescopes. So far, we've only been talking about what you do once you get the data. Well, in a sensor-poor, target-rich environment like space, where there's too many things to look at and not enough sensors, it's not enough just to exploit the data correctly.
You also have to collect the right data, and that itself is a really challenging sequential decision making under uncertainty problem. What does it mean to do a good job? Right? We basically have a program for that called maa, and this is, I think, a novel programming paradigm. It's a very new way to approach the problem of multi-agent autonomy.
It's built entirely on Kubernetes, and we have a really interesting deployment construct for how that system will be fielded and sustained across time, updated and managed. And it has, as part of it, a closed-loop agent, so it has these computer vision approaches inside of it. Effectively, it's a thing that we transition these technologies into.
That project is enormous in scope, and we're in the early stages now. This upcoming exercise I was talking about in November is gonna be a major test for that program; it's gonna be really its first major milestone. If we succeed at it, it could totally change the way we do sensor management in the Space Force. I think those are the two projects I'm most proud of. I wasn't able to pick just one, I guess. Sorry.
Richie Cotton: Your two favorite children. That's not bad. All right, so that last project seemed pretty interesting cuz it sounds like you've got the computer vision aspect, but you've also got this automation stuff, and it sounds like it's getting to the point where it's artificial intelligence that's trying to just take the human out of the loop completely. So all the telescopes are run automatically. Am I understanding that correctly?
Justin Fletcher: Yeah, so they do run automatically already. All these telescopes are already robotic telescopes, so they're automated in the sense that you tell them to point at a location in the sky and they'll basically do that for you.
What that program provides, in addition to that, is what we might call orchestration. Think about this as: we have to use these sensors in a collaborative way, and these sensors are very different. Imagine one has a really wide field of view, so it can see a lot of stuff, but not very dim stuff. Another one has a really narrow field of view, so it can't see very much area, but it can see really, really dim stuff in the little area it can see.
You can put those two sensors together, and that's the hard part, right? But you can make those two sensors interoperate, collaborate with one another, in such a way that the autonomous closed-loop system they constitute, working together, is more than the sum of its parts. And, you know, I can't claim credit for that idea.
That idea goes back to, there was a DARPA program about this in the sixties. The site here, AMOS, the site that we're at out in Maui, has been researching this for years. That was actually the program that I was on when I first came over as active duty. I can't say a lot about it in this forum, but basically it was doing similar work.
That paradigm of extracting more than the sum of their parts in value from a collection of heterogeneous telescopes working together has the potential to be game changing for not just space domain awareness, but sensing in general. And of course, these are not original ideas, right?
These ideas have been studied since the eighties and nineties, and actually, interesting connection back: AFOSR has funded a lot of that stuff. So again, the military funding research and development for those kinds of problems.
Impact of Deep Learning on Research
Richie Cotton: That's really interesting. And so you mentioned some of these programs are going back a really long way, which sort of predates deep learning. So how would you say deep learning has affected what your teams can do in terms of research?
Justin Fletcher: Yeah, so this has been really fun over the past few years. So we're coming into a community, and when I say we, I'm really talking about basically me here. When this first started, it was really just me.
And so I was coming into a community that was venerated, that had been around for generations. People have been doing space domain awareness since decades before I was born. They called it SSA at the time, but since my parents were... right? And so one sort of gets that "what are these kids up to these days?" kind of attitude when you say, yeah, I'm gonna do computer vision for this detection problem you've been doing for 50 years. Right? Why? We've been working on it for 50 years. What do you have to add? Right? It's not necessarily that people are malicious or trying to prevent progress or anything, it's just that they know this domain.
They've been working in this domain an entire career, and they know it, right? And so their expectations are very high. Here's what we discovered when we brought the deep learning revolution to space domain awareness, by accident, cuz that's what I did in my graduate research. That's what I wanted to do in my day job, so I did it as a nights and weekends project.
What we found was when we actually took the time to measure the legacy systems in terms of the way we tend to measure detection problems in particular, in terms of precision and recall, right, at a particular IoU. Really, we actually use centroid distance, cuz our objects are point sources, but that's down in the weeds. When we think about formulating those detection problems in those information retrieval terms, as opposed to detection theory, which is much more common in the astronomy community.
What we found was when we put it in those terms, which are really the terms we need for autonomous systems, and I'll explain why in a moment, the legacy systems were not as performant as we thought they were. This wasn't cuz anybody did anything wrong. It's just because, as anybody who worked in computer vision before the deep learning revolution will tell you, computer vision is really hard.
So yeah, hand-designed computer vision is harder yet. And so there were challenges, and in particular, there were edge cases that caused some truly egregious failure modes, and we uncovered those only because we systematically measured our performance so that we could do deep learning. You can't be serious about doing deep learning if you don't have really carefully controlled and well understood metrics for your problems.
Because otherwise, going back to earlier, it's really easy to fool yourself. And so if you don't do that, you can get in all kinds of trouble. And the trouble that the legacy community was getting into is that they were assuming that ground truth couldn't really be known. And so they were doing their best to calculate their detection performance based on the true case, so when you actually knew an object was there. And that caused all kinds of downstream failure modes with precision. So we were getting really low precision on certain kinds of images, on really important kinds of images that you really don't wanna have a lot of false positives on. And so that begat this whole program.
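To illustrate measuring point-source detections with precision and recall at a centroid-distance threshold (rather than box IoU), here is a small sketch. It is not the team's evaluation code: the greedy nearest-neighbor matching, the 3-pixel threshold, and the function names are assumptions, and a production metric would likely process detections in confidence order or use an optimal assignment.

```python
import math

def match_by_centroid(detections, truths, max_dist=3.0):
    """Greedily pair each detection with the nearest unmatched truth centroid.

    A detection within max_dist pixels of an unmatched truth is a true
    positive; leftover detections are false positives and leftover truths
    are false negatives. Returns (tp, fp, fn).
    """
    unmatched = list(range(len(truths)))
    tp = 0
    for dx, dy in detections:
        best, best_d = None, max_dist
        for i in unmatched:
            d = math.hypot(dx - truths[i][0], dy - truths[i][1])
            if d <= best_d:
                best, best_d = i, d
        if best is not None:
            unmatched.remove(best)
            tp += 1
    return tp, len(detections) - tp, len(unmatched)

def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Two real satellites; the detector finds both but also reports two phantoms.
truths = [(10.0, 10.0), (50.0, 50.0)]
detections = [(10.5, 9.5), (51.0, 50.0), (80.0, 80.0), (20.0, 70.0)]

tp, fp, fn = match_by_centroid(detections, truths)
precision, recall = precision_recall(tp, fp, fn)
```

This toy detector scores perfect recall (1.0) with only 0.5 precision. A system evaluated only on frames where an object is known to be present would see the recall figure and miss the false positives entirely, which is the failure mode described above.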
Richie Cotton: So does that mean a false positive is when you think that there's a satellite there, but there isn't?
Justin Fletcher: Precisely. Yeah. So a false positive in this case would be hallucinating the existence of a satellite. And so that can happen if the model is trained to be really high recall, or if the legacy system has been hand tuned to be really high recall, which means if an object is there, I'll definitely say it's there.
Right? And so if you only care about the true case, you don't care about the false case. And this transcends domains. Everybody can have this problem, right? It's not just space domain awareness. If you only measure one dimension of the problem, you can get truly aberrant behavior in other dimensions. And that did happen.
We have demonstrated that numerically, experimentally, I should say, on sky, in the real world, many times over the past few years. But it was slow going initially cuz we were coming into a very well established community.
Richie Cotton: Sounds like there really has been a, a deep learning revolution then, or at least there, there's a deep learning revolution ongoing then throughout the field?
Justin Fletcher: I would describe it as ongoing. So just to be clear, some of these techniques, these non-learned techniques, work exceptionally well. To this day, I talked about astrometric fits earlier, telling where you are in the night sky. We don't have a learned solution for that.
The best solution is the one that was hand designed. So the revolution might not conquer every city. You know, some technical disciplines might actually be done best in a hand-designed way that's very explainable and has really well instrumented performance. But there is an ongoing revolution. This week, the very week that we're recording this, is the AMOS conference out here in Maui.
It is the conference in space domain awareness, and I don't think anybody would be offended to hear me say that. I think it is the biggest conference. It's gonna be like 1200 people here this year. The whole community will be here. And I've been coming to this conference since there were no deep learning talks.
Then it was like one or two deep learning talks across the whole conference, to now, I haven't quite counted this year, but last year there were dozens applying deep learning to these problems, and not all from our group. Many of them, almost all of them, were independent of us. So it really does feel like we're getting that revolutionary wave that the rest of the world had in '13, '14, '15, now in '19, '20, '21, '22 in our application area.
Biggest Challenge in Space Research
Richie Cotton: Okay. So it sounds like it's gradually becoming a lot more popular, but not quite there throughout all the teams yet, perhaps. All right, so you've talked a little bit about your wins. Can you tell me a bit about what's the biggest sort of challenge that you are facing at the moment?
Justin Fletcher: There are a variety of dimensions of our application area that make it difficult. Some of them really aren't exciting. Our biggest challenges are often, frankly, institutional and bureaucratic. "No, you can't deploy that software to that host" is a very common problem, and getting to yes on questions like that consumes easily 60% of our program, which I think is probably a fairly numerically accurate estimate. So big problems in the Department of Defense relate to digital infrastructure and the ability to field these things. Our solution to that is to build our own and certify it and actually go through the process. That takes time and is laborious, but we're doing it now, motivated by the larger scale of deployments that is being demanded by the modest successes that we have had so far in computer vision.
And so, for example, the existing cybersecurity process, even with a continuous authority to operate, what's called an ATO, even under those circumstances, if I'm retraining a model every hour or so because I'm trying to keep up with data as it's flowing in at a very fast rate, there's no viable path to do that deployment action today. That is a huge challenge. Context is a huge challenge. Where do I put this model to run inference? How do I deploy this to this classified system? Those are always our primary roadblocks.
We also have, you know, rearing its head again, sensor sparsity, right? So we have a high-demand, low-density sensor network. We don't have a lot of these things. They're very expensive, especially the large-aperture telescopes with these exquisite instruments on them. And they're expensive to maintain and to operate. We have access to all of them, which is an incredible benefit for our research. But the problem is that they can sometimes be very difficult to work with.
You don't get they, they get pulled away for higher party missions necessary, but it does happen and it does disrupt research and development. And so we have a problem with data sparsity, especially in our more. Domains that set net problem. We've got such a large number of collaborators now. We really don't have a data sparsity problem there, but in the rest of the domains, we very often find ourselves bootstrapping with simulated data, and you can never fully eliminate the sim to real gap.
Even with domain adaptation, it's never the same as having the same number of images from the real domain. And so that proves a persistent challenge for us. So there's infrastructure, and then there's access to data-collecting devices, which have a very different flavor than the kinds of problems that I think most people see when they think about doing computer vision in industry, for example. Most people are processing natural imagery to do information extraction tasks. They're not thinking about, what if you only got 27 of those images a week, right? So how would you approach that problem? Of course, people are working on data efficiency, but at that scale, in this case small scale, that can basically be a performance-limiting challenge. So those are the two I think of off the top of my head.
Richie Cotton: Okay. All right, thank you. So before we wrap up, I have a slightly silly question to ask. In the past, I've worked with biologists working in the lab, and I've worked with engineers who get to blow stuff up, and it's always been me there as a data scientist, just sat on my laptop, analyzing the data.
So I have this kind of hardware envy. Now, I know you're based in Maui in Hawaii, and so you have these giant telescopes nearby. So do you get to play with them yourself? Do you get to play on the telescopes?
Justin Fletcher: This is one of the cool things. I joked before about, oh, you can maximize click-through. That's something you could do with your life. Or you could operate one of the world's largest telescopes at the summit of Haleakalā in Maui, right? So I think we have a compelling case for people who have hardware envy. Unfortunately, these days I've moved into a role that is a lot of technical management, so I'm doing a lot of communicating and a lot of architecture-level work.
So if you're asking, do I personally? Not recently. Have I in the past? Yeah, and it was a lot of fun. On our direct team, so you don't have to go more than one step removed from me, there are people who work primarily at the summit. We have world-class astronomers, my colleagues Ryan Swindle and Zach Gazak, both PhDs.
They work a lot at the summit. I mean, Zach restored this sensor on the trunnion port of this three-and-a-half-meter telescope to do spectral imagery. We published about that. And Ryan is really one of the most masterful operators of that world-class optical instrument alive today.
So they are operating these enormous, room-sized systems, some of the most advanced technology the species has to offer, to generate data. Then they come back to sea level or, if they live in Kula, halfway up the mountain, they go back to their home, and they train models on that.
Right? So they're truly cross-disciplinary. It is a really motivating part of our work. It is a non-trivial component of the reason that we all work here. It's not really the hardware itself. That's cool, but that wears off, right? What doesn't wear off is the fact that we are working out towards the edge of the species' technological capabilities and occasionally, just occasionally, bumping it forward a little bit. So anyway, yeah, that's hardware.
Richie Cotton: All right. So yeah, playing with giant telescopes on a mountain and then deep learning on a tropical island. I've gotta say I'm jealous. That does sound pretty amazing. So thank you, Justin, for your time. It's been really fascinating just hearing what you do with your team and your work. Just some really important, exciting stuff. So thank you once again.
Justin Fletcher: Richie, it was my sincere pleasure. Thanks so much for your time today.