Webinar replay
Harnessing Historical AIS data for Machine Learning

Watch the Spire Maritime and Freightflows webinar on using Historical AIS data to feed ML models and AI, and learn more about:

  • The comprehensive nature of Spire's Historical AIS data and which fields are available
  • What industries and use cases can use Historical AIS to feed Machine Learning models
  • How Freightflows implements AIS data for training Machine Learning models

Webinar transcript

Olga Kadeshnikova

Hello everybody, thank you so much for joining us. Welcome to our webinar on Harnessing Historical AIS Data for Machine Learning. So we're very glad to have you here. We have a very exciting agenda for you that we can't wait to share. We also have some special guests, as you can see, both from Spire Maritime and from Freightflows as well.

So before I jump into the introductions, I would just like to draw your attention, whilst everybody's still joining, that on the right-hand side, there's a panel where you can ask us questions. So please go ahead and submit any questions you have there for us, as we will have a Q&A right at the end, where we will be able to answer them. And of course, if we don't get to answering them, we'll just reply to you by email afterwards.

So all right then, so maybe we can start off with a round of introductions for everybody before kicking off with the agenda. I can start with myself. So hi, everyone. My name is Olga. I'm Customer Success Leader at Spire Maritime, and my job is making sure that our customers get the most out of their subscription while working with us. And for those of you joining that are existing customers, I'm sure that you know me already. With me from Spire Maritime, I also have Shanu Suman. So I will hand it off to Shanu to introduce herself.

Shanu Suman

Hello, everyone. This is Shanu. I'm Senior Manager, Sales Engineering for Spire Maritime.

Olga Kadeshnikova

Thank you, Shanu. So now over to the FreightFlows team, who we've been working with for almost over four years now. So I think we can start off with Matt Morgan. Please go ahead.

Matt Morgan

Hi. Thank you all. My name is Matt Morgan, the Founder and CEO of FreightFlows. And yeah, we've been a partner with Spire for a long time, going back even longer than four years for previous companies as well. And so we're thrilled to be here and diving into this topic.

Olga Kadeshnikova

Brilliant. Thanks so much, Matt.

Greg Katz

Thanks, Matt. Great to be with everyone here today. My name is Greg Katz, Director of Business Development here at FreightFlows. I work closely with the Spire team as well as our team here at FreightFlows and our customer base in deploying our solutions to offer more predictive insights into maritime trade activity. And looking forward to speaking upon that further in today's webinar. I appreciate everyone joining in.

Sergey Crane

Hey, guys. I'm Sergey Crane. I'm the Head of Analytics at FreightFlows. And I'm actually using Spire data as our main source, feeding it into the machine learning models that we have in order to produce analytics with it. So I think I'm very excited to share some of the ways we're using the Spire data for it.

Olga Kadeshnikova

Great. Thanks so much, team. So I will start off with a general introduction about Spire and Spire Maritime before handing off to Shanu. So Shanu, maybe if we just go to the next slide, please. So Spire Maritime, well, rather, let's begin with Spire because we are part of Spire and Spire Maritime is a business unit. So our slogan is We Hear You Earth. What we aim to do is solve the Earth's greatest challenges. So we actually build our own satellites. We launched our first one, first nanosatellite in 2013. And we are listening to the Earth. We're picking up data with those satellites.

So of course, we are from the Spire Maritime team. So we're focused on the AIS data that we're collecting. In 2021, we became a public company on the New York Stock Exchange. And a couple of months later, we acquired the company exactEarth, which is also a major satellite AIS provider. And overall, we have a very strong workforce, strong corporate values. We really focus on efficiency and serving our customers in the best way possible.

So yes, I think with that, that is the general introduction for Spire Maritime. And now I will hand off to Shanu, who will be giving more of an introduction into the actual machine learning side of things.

Shanu Suman

Thank you, Olga. So yes, hello again, everyone. These are the topics that I will be discussing today. I will briefly introduce machine learning, discuss the typical components of a machine learning model, what the historical data requirements for ML are, some machine learning use cases for AIS data, and finally, what the AIS data requirements are for it to be used in machine learning models and how Spire fits into it. So let's begin with machine learning.

What is machine learning, really? Machine learning is a branch of artificial intelligence and computer science that uses data sets and algorithms to imitate how humans learn, gradually improving its accuracy. So it's basically computer algorithms analyzing data sets and identifying some kind of patterns within them. In a typical machine learning model, we have three components. We have a decision process, we have an error function, and we have a model optimization process. In general, machine learning is used to make either a classification or a prediction. And that's the job of the decision process. It's basically used to identify some kind of pattern within the data set.

And then the error function comes into the picture to check the accuracy of the decision process. It calculates the deviation between the output of the decision process and the actual data. And the idea is to minimize this error or deviation as much as possible.

And that's what the model optimization process is used for. It provides feedback to the decision process so that we can minimize the error generated and make the decision process more accurate. And this analyzing of the data set and giving feedback to the decision process is actually a continuous process.
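
Illustration (not part of the webinar): the three components described above can be sketched with a one-parameter model. This is a minimal sketch under stated assumptions. The function names, the learning rate, and the sample numbers are all invented for the example, and gradient descent is only one common way to implement the optimization step.

```python
# Illustrative only: a one-parameter linear model fitted by gradient descent.
# All names and numbers here are made up for the example.

def predict(weight, x):
    """Decision process: turn an input into a prediction."""
    return weight * x


def mean_squared_error(weight, xs, ys):
    """Error function: average squared deviation from the actual data."""
    return sum((predict(weight, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def optimize(xs, ys, learning_rate=0.01, steps=500):
    """Optimization process: repeatedly adjust the weight to reduce the error."""
    weight = 0.0
    for _ in range(steps):
        # Gradient of the mean squared error with respect to the weight.
        gradient = sum(2 * (predict(weight, x) - y) * x for x, y in zip(xs, ys)) / len(xs)
        weight -= learning_rate * gradient
    return weight


# Hypothetical data: hours under way vs. nautical miles covered at about 12 knots.
hours = [1.0, 2.0, 3.0, 4.0]
miles = [12.0, 24.0, 36.0, 48.0]
fitted = optimize(hours, miles)
print(round(fitted, 3), mean_squared_error(fitted, hours, miles))
```

Each pass through the loop is one round of the feedback described above: predict, measure the error, adjust.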

Now, why do we need any input data for machine learning? Well, actually, historical data is basically the building block of any machine learning model. It's the input that you provide to these machine learning models. Ideally, historical data is split into two parts. One part is used to train the model itself, and the other part is used to test the model. And it works on the concept of garbage in, garbage out. The more data you feed in and the better its quality when training your machine learning model, the better the accuracy and the better the outcome of the models.
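
Illustration (not part of the webinar): one simple way to split historical data into a training part and a testing part is by date, so the model is tested on records newer than anything it was trained on. Splitting by time is an assumption made for this sketch, as the talk only says the data is split in two. The record layout and the cutoff date are hypothetical.

```python
from datetime import datetime


def split_by_time(records, cutoff):
    """Return (train, test): records before the cutoff train the model, the rest test it."""
    train = [r for r in records if r["timestamp"] < cutoff]
    test = [r for r in records if r["timestamp"] >= cutoff]
    return train, test


# Hypothetical historical records (the field names are invented for the example).
records = [
    {"timestamp": datetime(2023, 1, 5), "mmsi": 111000001, "speed": 11.8},
    {"timestamp": datetime(2023, 2, 9), "mmsi": 111000001, "speed": 12.1},
    {"timestamp": datetime(2023, 3, 2), "mmsi": 111000002, "speed": 9.4},
    {"timestamp": datetime(2023, 4, 20), "mmsi": 111000002, "speed": 10.2},
]

train, test = split_by_time(records, cutoff=datetime(2023, 4, 1))
print(len(train), "training records,", len(test), "test records")
```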

Now how can you use AIS data in machine learning models? Before that, let's discuss what AIS data is. AIS, or Automatic Identification System, uses transceivers on board vessels that allow ships to see the traffic in their area as well as to be seen by that traffic. So vessels transmit AIS messages on a periodic basis, telling others of their position and other information that I will also discuss later. Spire collects AIS data not only from its own satellite constellation, but also from a lot of third-party providers to improve the coverage.

So let's discuss some machine learning use cases for AIS data and how it can be used in different industries.

Shipping companies can use it for predictive maintenance of vessels. Maintenance is really critical for vessels to operate smoothly, and we can learn when a vessel needs maintenance before it's too late. So it helps minimize the downtime of the vessel and reduce maintenance costs. We can use historical AIS data to predict when maintenance is required.

We can use it for optimal route planning, to understand what is the most efficient speed at which a vessel should run or what is the most efficient route a vessel should take on a certain voyage, hence optimizing the whole route, decreasing the voyage time as well as reducing the fuel consumption.

It can be used for emissions tracking. For environmental regulation, we can use historical AIS data to identify vessels that are using more fuel, hence generating much more carbon emissions, than they should. So we can target such vessels for early warnings or inspect them. It can be used for carbon footprint reduction, because we can study what is the most optimal route that a vessel should take, hence optimizing the fuel consumption of vessels, which in turn will reduce emissions.

For commodity trading, commodity traders need to predict the future demand for commodities. They need to anticipate market changes. This can be done by studying patterns in the historical AIS data. They need to react fast to market changes, and using this historical AIS data, they can make more informed trading decisions, hence reducing the commercial risk on their own end.

Supply chains can be optimized by studying historical AIS data to identify bottlenecks and inefficiencies in the transportation network.

And finally, in the marine insurance industry, we see a lot of use cases for machine learning. It can be used for risk assessment, for fraud detection, for risk mitigation. Using historical AIS data, we can identify vessels that have a higher risk of accidents. We can study past incidents to predict the probability of future accidents or incidents. We can evaluate the risk of insuring a particular vessel or a particular fleet by understanding what kind of incidents that vessel has gone through in the past. This can help in determining the appropriate premium or coverage level required to insure a particular vessel or a particular fleet. We can do fraud detection. We can identify patterns that may indicate fraudulent claims by studying vessels that are repeatedly reporting incidents from the same place or at the same time. We can use the same data sets to understand what kind of vessels are generating the most claims, which might point to a high-risk route, and maybe avoid such routes in future to reduce the risk. Now, these were some machine learning use cases for historical AIS data.

Now, what should a historical AIS data set have so that it can be used in a machine learning model, and how does Spire fit into it? Spire has more than 10 years of historical AIS data going back to July 2010, and we have been improving our volume coverage throughout the years. As you can see, now we are able to detect close to 270,000 vessels on average on a daily basis. We have grown our daily message detection as well through the years. We have not only improved our satellite coverage, but we have also partnered with more and more data providers to improve our coverage and provide uplift in different zones. As you can see in this 24-hour snapshot, we actually cover the whole earth and we do not have any gap areas.

We have extensive positional and static data information, and let me show you a quick demo to demonstrate the different fields that we provide as part of historical AIS data. So, of course, you will have the vessel identification in the form of the MMSI number, the IMO number, the vessel name. You will also have the call sign and the flag that the vessel is transmitting, the ship type of the vessel, the broad AIS ship type, the positional information, the latitude and longitude of the vessel, and then the information that is changing continuously as the vessel is moving will be there. So the speed, the course, the rate of turn, the heading, the navigational status, some dimensional information of the vessel such as length and width, and then some voyage-related data will be available as well. So the destination, the ETA, the draft that the crew was transmitting at any given point of time.
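
Illustration (not part of the webinar): the fields listed above could be held in a record type like the one below. The attribute names, types, and units are assumptions made for this sketch and are not Spire's actual column names or schema. The timestamp field is also an assumption, implied by "at any given point of time".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class AisRecord:
    """One historical AIS observation. Field names are hypothetical."""
    timestamp: datetime                 # assumed: when the message was received
    mmsi: int                           # vessel identification
    imo: Optional[int] = None
    name: Optional[str] = None
    call_sign: Optional[str] = None
    flag: Optional[str] = None
    ship_type: Optional[str] = None
    latitude: Optional[float] = None    # positional information
    longitude: Optional[float] = None
    speed: Optional[float] = None       # changes continuously as the vessel moves
    course: Optional[float] = None
    rate_of_turn: Optional[float] = None
    heading: Optional[float] = None
    nav_status: Optional[str] = None
    length: Optional[float] = None      # dimensional information
    width: Optional[float] = None
    destination: Optional[str] = None   # voyage-related, entered by the crew
    eta: Optional[datetime] = None
    draft: Optional[float] = None


# A made-up record; fields that were not received simply stay None.
record = AisRecord(
    timestamp=datetime(2023, 6, 1, 12, 0),
    mmsi=111000001,
    latitude=35.1,
    longitude=129.04,
    speed=0.2,
    nav_status="at anchor",
)
print(record.mmsi, record.nav_status, record.draft)
```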

Going back to the presentation, we have highly granular historical data available in different formats. We have the raw NMEA format, which is the actual encoded format in which the vessels transmit the AIS data. It's not an easily understandable format, so we also have the decoded format, in which you get more sense of the data and the different fields are visible, the ones that I showed you just now. But each message is still either a positional AIS message or a static AIS message, for example. And finally, there is the combined post-processed format, where we combine the position and static AIS messages so you have a complete view of the vessel at any given point of time in history. We have flexible data delivery options. You can choose to get a one-time delivery as CSV files on a platform of your own choice, or there are some API solutions available as well if you prefer to extract the historical data more flexibly. We have dedicated support teams as well. We have a pre-sales engineering team to support you in selecting the optimal solution as per your requirements. And not only that, we have a dedicated customer support team to help you set up the deliveries or the API subscriptions, if you choose those. We also have a lot of documentation on our website that will support you throughout the process.
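
Illustration (not part of the webinar): the idea behind a combined format can be sketched by attaching to each position message the most recent static message from the same vessel. This is a sketch of the general idea only and not a description of Spire's post-processing. The message layout, the integer timestamps, and the function name are hypothetical.

```python
from bisect import bisect_right


def combine(position_msgs, static_msgs):
    """Attach to each position message the latest static message
    sent at or before it by the same vessel (matched on MMSI)."""
    static_by_vessel = {}
    for msg in sorted(static_msgs, key=lambda m: m["timestamp"]):
        static_by_vessel.setdefault(msg["mmsi"], []).append(msg)

    combined = []
    for pos in position_msgs:
        history = static_by_vessel.get(pos["mmsi"], [])
        times = [m["timestamp"] for m in history]
        idx = bisect_right(times, pos["timestamp"])
        static = history[idx - 1] if idx > 0 else {}
        # Position fields win on conflict, so the row keeps the position timestamp.
        combined.append({**static, **pos})
    return combined


# Made-up messages; timestamps are plain integers to keep the example short.
positions = [
    {"mmsi": 1, "timestamp": 100, "lat": 35.0, "lon": 129.0},
    {"mmsi": 1, "timestamp": 300, "lat": 35.1, "lon": 129.1},
    {"mmsi": 2, "timestamp": 150, "lat": 40.6, "lon": -74.0},
]
statics = [
    {"mmsi": 1, "timestamp": 50, "destination": "BUSAN", "draft": 9.5},
    {"mmsi": 1, "timestamp": 200, "destination": "ULSAN", "draft": 9.5},
]

rows = combine(positions, statics)
for row in rows:
    print(row)
```

A vessel with no static message yet (MMSI 2 here) keeps only its position fields.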

And that was it from me. You can always reach out to us with questions at cx@spire.com. Thank you. Off to you, Olga.

Olga Kadeshnikova

Great. Thank you so much, Shanu. So everybody, you're going to see a poll in front of you if you wouldn't mind submitting that whilst we're moving on to the next section. So for the next part, I will hand it off to Matt Morgan to present about Freightflows. So Matt, over to you.

Matt Morgan

Thank you so much. I'll just pull my presentation up here. One moment. Okay. Just one moment. Just one more moment here, pulling it over.

Olga Kadeshnikova

So whilst we're waiting, everybody, please submit your questions. Don't forget, we will be here to answer them at the end.

Matt Morgan

So can everybody see my screen now?

Shanu Suman

Yep. Looking good.

Matt Morgan

Great. Well, thank you so much, Shanu. Really appreciate it. And it's a pleasure to be here and talk with you a little bit about machine learning, data science, and how we're leveraging Spire data to build out a product line for Freightflows to help our customers. So just to get into it a little bit. So Freightflows, we're a predictive analytics company for global trade. And why AIS data is important to us is it's really foundational for a lot of what we build on the product side. And as you can see here, with some of these sample charts, we deliver to our customers trade-specific, commercial-specific information that gives a lot more context about what is happening on the water, the types of trade that's happening, and what has happened, what is happening, and what will happen. So to discuss a little bit about how we build that up from the ground up, I want to talk a little bit about methodology here and talk about this, what we call a bottom-up approach to data. I think that an easier path for analyzing what's happening in commercial markets and specific trades is to look from the top down. So government reports, customs information, reports that you can use to try to make some predictions about what's happening in the market moving forward. We feel that that leaves a lot on the table. And in order for us to gain a lot more accuracy or visibility into what's happening on the commercial sides of the market, we've chosen to build that data set up from the ground level. And that ground-level data starts with AIS data. So AIS data is really foundational for us because it has a significant amount of coverage and it has great point-in-time information about what's happening at the ship level.

Greg Katz

Matt, sorry to jump in. I don't believe the audience is seeing the full-screen version of the presentation. I believe we're just looking at the PowerPoint.

Matt Morgan

Okay, well, I can try to switch this around here for you. Thank you for letting me know. Try that. There. How's that looking?

Greg Katz

Awesome.

Matt Morgan

Okay. So, at the foundational level, let me just jump back here. It really starts with using AIS data to build a data set on the geospatial side. And so I have some sample data here that's, or some sample visualizations here for how we're using AIS data in our calculation for building dynamic polygons. So I'm going to share a different screen here. And there we go. How is this showing up? Looks great. So here we're looking at an example here in South Korea. And how we're able to leverage AIS data that's different from how I think we've been able to work with commercial data in the past is that it really starts with taking sample data from the very lowest level, grouping and organizing and enriching that information into something that gives us visibility into the behavior of ships and commercial trade. So what we start with here is AIS data. And as you can see here, this AIS data for South Korea here in this sample is extensive, and it shows us a couple of really interesting things here. By color coding the AIS data by status, you can see that there are certain areas that are reserved for anchorages. As you can see here, these clamshell, what we call clamshells here are also indicators for ships swinging in the tides. And we can use that to begin to cluster information about anchorage areas. And the same thing can be done on the berth terminal and port polygon side as well. So we start with this data, but it's the problem with each of these individual dots is it doesn't provide us enough context yet. So in the past, how we might identify port calls is with these large polygon squares, circles around a port. And so in the past, you might've been working with port call data where a ship shows up at a port, you receive an alert because it's entered into this polygon or geofence, and you're able to know that that ship has arrived at the port. But what it's leaving on the table is what that ship has actually been doing at a much lower level. 
Did it arrive at the anchorage? Did it arrive at the port? Did it actually visit a berth or terminal? So these old definitions are very crude representations of what's happening commercially. So if we depart from that for a moment and we take a look at these movements of AIS instead, where we're connecting the AIS messages in time by vessel, you can begin to see that there's a lot of movement and activity that's happening at the individual vessel level. Of course, this dense level information is still not getting us quite what we wanted here is to look at more general behavior or organization of this data at the port level. So if we move away from these movements and instead focus on a couple other aspects of AIS. So as I mentioned, initially we had status. So grouping them by status and by frequency and status, you can see that there are these areas that are much more heavily trafficked, anchorage areas, and then terminal and berth areas as well. And these statuses are indicating that there are absolutely areas that we can begin to cluster and group to create the polygons that we use internally. You can also look at something similar with speed. So as you can tell, the main channels for which the ships are moving in and out of the ports are going to have higher speeds than the areas in which they're going to be anchoring or visiting the berths terminals here. Now moving on to what we did as an initial pass to build these clusters, our polygons here. So we were able to create a dynamic polygon that grouped all of the information for ship movements based not only on their positions and their movements and statuses and other parts of AIS information, but it does give you a pretty decent approximation of what's happening at the berth level as well. And as you can see here, these clusters, we were able to group them adequately to represent these berths at each of these different areas. But there still is more that we could do here to improve. 
As you can see, there's still quite a bit of noise as with any radio broadcast, which AIS is, is you have to account for the fact that it is a radio broadcast and how do we adapt to that information and group them more accurately for what we're trying to track, which is berth calls, tonnage movements, commercial activities at the berth level. So moving away from our polygons, we took a look again at the movements. So if we take a look at all of these vessels as they visited this anchorage and made their way to different parts, I'm going to just adjust this speed here just a bit. See if I can get that one. And you can see the movement of these ships as they're moving from anchorage areas to berths or from berths out to different areas. Now this movement tells us a little something about how vessels move within a port. The benefit of watching these movements and grouping them in this way is that we can begin to create an average of how ships move from area to area within a port. So this is something we call transfers. This average of movements from within a port gives us visibility into how ships are moving to specific areas within a port to conduct business. And as you can see here, we're able to begin to identify very bespoke differences between all of these berths within the port. So really it all culminates with our ability to cluster these and with a lot of different buffering and different activities to group them at the right level, we've built out highly dynamic and reproducible port polygons and berth polygons that give us the ability to conduct much more extensive analytics for ships that are interacting with ports and berths. And as you can see here, this clustering gives us the ability to easily differentiate with these different areas of a port with high degree of accuracy to account for some GPS noise but also to represent behaviorally what's happening at the vessel level. So just popping back over here to our slideshow once more. 
Are we looking at the main slide?
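
Illustration (not part of the webinar): the idea of grouping slow or stationary AIS positions into anchorage and berth areas can be sketched with a simple grid-density approach. This is a deliberately simplified stand-in and not FreightFlows' method, which is not described in detail in the talk. The cell size, speed threshold, and minimum point count are arbitrary assumptions.

```python
from collections import Counter


def dense_clusters(points, cell_deg=0.01, max_speed=0.5, min_points=3):
    """Group slow-moving positions into clusters of adjacent dense grid cells.

    points: iterable of (latitude, longitude, speed_in_knots) tuples.
    Returns a list of sets of (row, col) grid cells, one set per cluster.
    """
    # Count slow positions per grid cell.
    counts = Counter(
        (int(lat // cell_deg), int(lon // cell_deg))
        for lat, lon, speed in points
        if speed <= max_speed
    )
    dense = {cell for cell, n in counts.items() if n >= min_points}

    # Merge dense cells that touch each other (including diagonally).
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        stack, members = [start], set()
        seen.add(start)
        while stack:
            row, col = stack.pop()
            members.add((row, col))
            for d_row in (-1, 0, 1):
                for d_col in (-1, 0, 1):
                    neighbour = (row + d_row, col + d_col)
                    if neighbour in dense and neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
        clusters.append(members)
    return clusters


# Made-up positions: two adjacent slow areas, one separate slow area,
# some fast traffic in a channel, and two stray slow points.
points = [
    (35.004, 129.004, 0.1), (35.005, 129.005, 0.0), (35.006, 129.006, 0.2),
    (35.014, 129.004, 0.1), (35.015, 129.005, 0.3), (35.016, 129.006, 0.0),
    (35.204, 129.304, 0.1), (35.205, 129.305, 0.2), (35.206, 129.306, 0.0),
    (35.105, 129.105, 12.0), (35.105, 129.105, 11.5), (35.105, 129.105, 12.3),
    (35.405, 129.405, 0.1), (35.405, 129.405, 0.1),
]
clusters = dense_clusters(points)
print(len(clusters), "clusters found")
```

Because the clusters are recomputed from whatever positions are supplied, rerunning on a newer window of data lets the areas grow or shrink, which is the sense in which such polygons can be dynamic.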

Greg Katz

Yes.

Matt Morgan

So then taking that foundational AIS data and creating this dynamic geospatial dataset allows us to create more informed analytics. And so one more visualization to show here is how we are using that data now to conduct more analytics on the behaviors of vessels themselves. So now we're looking at New York. We've built out our dataset globally to represent every anchorage area, port, terminal, and berth in the world dynamically, using the full Spire dataset to do so. So here, looking at New York, we were able to accomplish a couple of things once again. So if we take a look at some representation of berth-level data down here, let's take a look at our clusters once more. Now we have our clusters. So we're able to identify these areas in the channel here, our anchorage areas, because of the AIS activity that we see here. Another anchorage area here outside of the channel here in New York, and other anchorage areas that represent a transitional stage for ships that are moving in and out of different ports. And why it's important that this is dynamic is that, as was the case on the US West Coast, where we had significant lineups at the Port of LA and Long Beach, anchorage areas that are represented in polygon form and are not dynamic miss a lot of data. So we had to grow and shrink the anchorage areas based on observations that we had from Spire's AIS data.

So how does this inform us about activities at the ship level? So one of the things that we're able to do now is look at the movements between these polygons that we've created. So these movements here, which we've represented in these arcs, show that there are specific legs from polygon to polygon that we're able to identify. And how this helps us is that we're able to use that data set to begin to inform a historical data set of very granular movements between different areas in the world. So we're going to take a look at another short animation here where I will once again change the speed.
See if I can get that right. And now we're going to take a look at ships as they're moving from a berth going back in time to what their previous activity was. So in this case, as you can see, these are our berth polygons that are down here at all these different areas within the port. What was the previous activity that these ships were able to do? Let's see if we can skip this ahead a little bit. And speed this up just a bit. It's a bit fast. So it looks like I'm having a little difficulty getting this at the right speed. But if I move it just a little bit slower, you can see that as the ships move from their berths here out to their previous activity or previous polygon interaction, they would move out of the channel and return out to these major thoroughfares or channels as they were moving back in time. And aligning these in time shows us that there are points where they diverge. Some move off to show that they came from other ports and moved directly to the berths. And in these cases, these ships moved back to this anchorage polygon, which gives us information about how we think that these ships may move in the future. And if we take one more zoom out and take a look at our AIS data here, you can see that that grouping has been able to accurately identify the movement areas that these ships take as they're moving in and out of ports and conducting business. So I will stop here and begin to talk a little bit more about how we're able to use that data in constructing our analytics. So I'm going to pass this over to Greg here to continue talking about these. So Greg.
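
Illustration (not part of the webinar): once each vessel's AIS positions have been assigned to polygons, the polygon-to-polygon legs and the "previous activity" before a berth visit can be derived from the sequence of visits. This is a minimal sketch under stated assumptions. The visit layout, polygon names, and function name are hypothetical and do not reflect FreightFlows' data model.

```python
from collections import Counter


def legs(visits):
    """Turn one vessel's polygon visits into time-ordered (from, to) legs."""
    ordered = sorted(visits, key=lambda v: v["arrived"])
    return [
        (prev["polygon"], nxt["polygon"])
        for prev, nxt in zip(ordered, ordered[1:])
        if prev["polygon"] != nxt["polygon"]
    ]


# Made-up visit histories; "arrived" is a plain integer time for brevity.
visits_by_vessel = {
    "vessel_a": [
        {"polygon": "anchorage_1", "arrived": 1},
        {"polygon": "berth_7", "arrived": 5},
    ],
    "vessel_b": [
        {"polygon": "channel", "arrived": 2},
        {"polygon": "berth_7", "arrived": 3},
    ],
    "vessel_c": [
        {"polygon": "berth_7", "arrived": 9},
        {"polygon": "anchorage_1", "arrived": 4},
        {"polygon": "channel", "arrived": 12},
    ],
}

# Looking back in time: where did ships come from before calling at berth_7?
origins = Counter(
    src
    for visits in visits_by_vessel.values()
    for src, dst in legs(visits)
    if dst == "berth_7"
)
print(origins.most_common())
```

Counts like these, built over a long history, are one simple way to estimate how likely a ship at a given polygon is to move to each possible next polygon.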

Greg Katz

Thanks, Matt. Appreciate it. I'd love to now just spend a few minutes discussing really how critical this data foundation that Matt has outlined here is to our solution set and what it actually means for our customer base to have access to not only such a granular view of vessel movement activity, which is enabled by all this AIS data and the enrichment process that we undergo, but also what it means for the output of what our models are producing, being trained on such a unique and granular data set. And so as we take this base layer of AIS information and overlay it onto what are our proprietary port and geospatial foundations, we're ultimately able to then produce what is truly an actionable view of understanding trade activity and true commercial context as it pertains to any particular region or port or berth or terminal on the globe. As you can see, the data is down to that level of detail and ultimately demonstrates what vessels are doing from a commercial perspective as opposed to just a location perspective, or in addition to just a location perspective as well. And so then as we begin to include other alternative data, a variety of both public and private or confidential data that we have access to from a variety of different sources and our customer base, across customs information, fixtures and statement of facts data, terminal owner and operator data, and knowledge about what cargoes are loaded and unloaded at any particular berth or terminal, we can start to piece together and connect the dots to create truly enriched market insights that provide that additional layer of knowledge into our model set to produce our forecast. And so we're able to then display and produce across the entire global fleet.
We've generated our models and algorithms to be adaptable to all fleet types, trades and cargo routes, trade routes and cargo types, rather than just specializing in one particular area where we might naturally reduce or introduce rather a lot of bias into our models and data set. We're making sure that our models are utilizing samples from all competing fleets and trades to make sure that we're building in a perspective and therefore modeling all markets and their associated vessel movement pathways, allowing our complete predictive data set to perform at much higher, much higher accuracy as well. And so then as we build up to kind of this last step here, which is really focused on now that we have a really core and granular foundation of rich history, a rich perspective of what's happening on the water today, and layering on all this additional commercial context to apply to all the regions and pathways that these ships are actually moving on, we're able to then start producing a unique and predictive data set that takes all of this information about the current environment and stretches it out into the future. And so we here at Freightflows are really focused on near-term predictions, as we think that the near-term period in terms of predicting fleet or down to the individual vessel level of activity is really the most important for creating actionability from commercial decisions. So rather than being focused on long-term time horizon forecasts in which we're analyzing and modeling impacts based on particular macro level events or geopolitical effects, instead we are most highly focused on observing and having our models observe and react to recent and current vessel movement activity and understanding and predicting what is likely to happen in the near term based on these associated shifts. 
So we're looking at what is happening today and understanding what shifts are likely to come out of that current observed market in the coming weeks that you and your business can act on now. As so many decisions in the industry are based on having to have a view into the future, we believe that having a forward-looking dataset is really helpful and supportive of making forward-looking decisions. And so as we kind of complete this perspective of our bottom-up approach, it really allows us to then make granular predictions either at the individual vessel level, because we are making observations, as Matt mentioned, across every commercial vessel on the water today. We're making predictions about next ports of call for those vessels and ETAs for not only current voyages in terms of where a vessel might be heading right now, but also making predictions about understanding all these dynamics that Matt demonstrated at the individual port level, understanding how much time is that vessel likely to spend at its arrival port, what commercial activity is it likely to undergo there, what types of cargos are most likely to be loaded based on its berth or terminal that's being visited within that port, understanding the particular ship parameters, perhaps the owner-operator of that vessel. We can make predictions about that ship's commercial activity through its current voyage out four to six weeks in the future as well, predicting voyages that have not yet begun, because we're observing across the whole fleet and making a simulated forecast about what is most likely to happen in the near term based on all those observations. 
And so as we aggregate this, again, either at the vessel view or up to the market view, we're providing a really detailed perspective of predicted commercial activity ultimately, not just predicted vessel movements in terms of a vessel going from one latitude longitude to another, which is helpful, of course, for understanding where a ship is going to be at a particular time. But when we're layering on the actual predicted commercial activity that is going to be undergone at any particular time now into the future, we're able to provide a lot more unique and actionable insights regarding forecasting near-term vessel supply or availability out four to six weeks into the future, forecasted trade activity in terms of what commodities are likely to be loaded and discharged along those different pathways, and insights regarding port congestion in terms of perhaps there's an increase in a particular fleet that's heading to a region to load a cargo over a particular season, having a perspective on what the predicted impacts to that will be in a particular port or region as associated with those commercial activities is really helpful to then start using the forecasted information to drive current business decisions today and tomorrow. So as we kind of flip to the next page here, as we're looking then and thinking about our customer base, our goal is ultimately to provide and deliver our customers the opportunity to benefit from a predictive data set and ultimately market transparency. And so at Freightflows, we're really doing that through providing a more complete and granular and explorable view of real-time trade activity than what is likely previously available or taking a much data-driven, a more data-driven perspective to an analytical approach. 
Of course, layering on these proprietary and predictive models that are so key and core based in this rich AIS data that we're receiving from our partners at Spire that allow us to have the best visibility into vessel movements down at the port, berth, and terminal level and allows us to generate those unique polygons and analytics that Matt demonstrated as well. And we take and cover all market segments, which again, provide us a lot more complete visibility into trades across all regions of the globe. So naturally we're able to work with and do work with customers across not only a number of trade segments and regions, but across different customer types as well and players in the market. So we work with ship owners and operators, physical and financial traders, other maritime service providers as well, such as brokers and agents, ultimately to support a wide variety of key and common strategic use cases that come out of having a predictive view of trade activity. And so whether that be understanding near term shifts in vessel supply in a particular region out six weeks for any specific fleet, we have owners and operators using that data to help support their fleet management strategy and ultimately, cargo owners who are needing to fix cargoes out three or four weeks in the future, or an owner who wants to evaluate loading a cargo in a region that they haven't really explored a trade route into, our data set can flex and meet our customers where they need us the most because of this bottom up approach. And it allows our customers to use that data at the level that makes the most sense for their business. 
You know, extending past that, because of the berth- and polygon-level data that we have, which we're constantly adding to and further enriching based on new market data coming in from the variety of sources I mentioned, we are also able to produce a really unique perspective on commodity flows, as we have a great sense of what products are loaded and discharged at any berth and terminal across the globe. As we're observing vessel movements, we're also taking in AIS-reported information from the data we receive from Spire, such as draft changes and reported ETAs and destinations. That allows us to analyze and understand commodity flow movements and produce a view of supply and demand trends for our customers, again down to the berth and terminal level, to support a wide variety of strategic market activities. As we actually package and deliver this data to our customers, there are many different formats we work with, one of which is shown as a quick visualization here on the page, but we're really focused on taking a tailor-made approach to meeting our customers and partners where they need us the most to drive strategic and predictive decisions and make processes more efficient. We hear so much in the industry about having to look across multiple data sources or offline reports and spend time on emails and phone calls. We want to create a centralized and fully customizable tool for each of our customers to query our large analytics output. The display here shows the current and future predicted global distribution of any particular fleet, activity at a particular port, whether from historical or predicted arrivals, as well as cargo flow analysis, which I've touched on as well.
Ultimately, as we are providing this data or as we have data scientists likely in the audience here today, being able to take this alongside AIS data and use it as a way to enrich a perspective or layer on a set of models, whether you're working with predictive data today or have a predictive model that you're using today. We work with our customers and feeding into that with some of our predictive analytics as they're so uniquely specialized in understanding these granular movements, as well as providing a comprehensive market intelligence platform for our customers to log into at the beginning or throughout their day and having a complete view or granular view of understanding what is trending in the market to help drive those strategic decisions as well. So I'd love to kick things back over to the group here to open up for some Q&A. Would be happy to have anyone in the audience today get in touch with any of the team members at Freightflows individually as our contact information is up on the page. Or if you have a general inquiry, feel free to get in touch and we'd love to see how we can be supportive of using our enhanced and predictive data set to support some of your business decisions. So thank you again. I appreciate all your time and we'll pass it along back to the group at Spire and Freightflows to open up for Q&A. Thank you.

Olga Kadeshnikova

Brilliant. Thank you so much, team. That was really great to see. It's always wonderful to see the data come to life and to see exactly how you get the most out of it. So thanks so much for going through that with us. Again, there's a quick poll on the screen. So please take a moment to answer that one for us. We have had quite a lot of questions. So I think we can go straight to those. There are questions both for Freightflows and for Spire. So I'm just going to glance over here at the questions that we have. So the first question we have is going to be for Freightflows. So team, if you could please answer, what kind of models can take advantage of historical AIS or Freightflows datasets?

Sergey Crane

So I think I can take that one. So we forecast quite a few things at Freightflows, you know, destination, ETA, cargos, volumes. And we use many different types of machine learning models to do that. But I'll use destination forecasting as an example. One type of model that works very well for this is recurrent neural networks, specifically LSTMs. LSTM stands for long short-term memory. And these models essentially have a memory. So they remember recent activity when making a forecast of what the vessel will do next. So if you're feeding them AIS messages, they are not limited to only looking at current location and origin port, say. Instead, they can use maybe several days of location data. And this can help detect things like a vessel making a mid-journey destination change, and other more complex behaviors, such as a change in direction as soon as it happens, that a model looking only at where the vessel is currently located would not be able to detect. The models can also use the enriched data that Freightflows creates. They keep memory of recent trade activity, of journeys that the vessel has recently completed, and of cargos. And that also helps inform the destination forecast. So that's just one example of a machine learning model that can take advantage of all the data that is provided by Spire and also the enriched data that Freightflows creates that Matt and Greg were describing.
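
As a rough illustration of the memory idea described above, a sequence model is usually fed a rolling window of recent positions rather than a single point. The sketch below only shows how such input windows could be cut from one vessel's track. The names `AisFix` and `make_windows`, the simplified schema, and the window length are all assumptions made for illustration; this is not Freightflows' pipeline, and no model is trained here.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class AisFix:
    """One position report for a single vessel (hypothetical, simplified schema)."""
    timestamp: int  # seconds since epoch
    lat: float
    lon: float


def make_windows(track: Sequence[AisFix], window: int) -> List[Tuple[AisFix, ...]]:
    """Return every run of `window` consecutive fixes, oldest fix first."""
    if window <= 0:
        raise ValueError("window must be positive")
    ordered = sorted(track, key=lambda fix: fix.timestamp)
    # A track shorter than the window yields no windows at all.
    return [tuple(ordered[i:i + window]) for i in range(len(ordered) - window + 1)]


# Made-up hourly fixes for one vessel heading north.
track = [AisFix(t * 3600, 10.0 + 0.1 * t, 20.0) for t in range(5)]
windows = make_windows(track, window=3)
```

Each window would then become one input sequence for a recurrent model, so the model sees the recent path and not only the latest position.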

Olga Kadeshnikova

All right. Great. Thank you so much, Sergey. The next question that we have, also for Freightflows, how does your model incorporate low-probability high-impact events? Do you add realistic disaster scenarios to the learning dataset?

Sergey Crane

I can take that as well. So we do have a lot of probabilistic models included in how we make forecasts. So one of the main things we use is Monte Carlo-based models, and they do simulate a realistic set of scenarios that can represent a wide range of things that could happen, including some low-probability events. Obviously, you have to run enough simulations in order to capture things that happen quite rarely, but that's one way to do that. You pretty much have to, because as the questioner pointed out, there are kind of black swan type of rare events that can have a large impact. It's very important to sort of capture all the possible scenarios in modeling this type of behavior.
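
The point about running enough simulations can be made concrete with a little arithmetic. If a rare event has probability p in any single simulated scenario, the chance of seeing it at least once in n independent runs is 1 - (1 - p)^n, so reaching a confidence level c requires n >= ln(1 - c) / ln(1 - p). The sketch below is a generic illustration with hypothetical function names and a made-up event probability; it is not the Freightflows model.

```python
import math
import random


def simulations_needed(p_event: float, confidence: float = 0.95) -> int:
    """Smallest number of independent runs that shows the event at least once
    with the requested confidence."""
    if not 0.0 < p_event < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("p_event and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_event))


def count_rare_events(p_event: float, runs: int, seed: int = 0) -> int:
    """Count how many of `runs` simulated scenarios contain the rare event."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return sum(1 for _ in range(runs) if rng.random() < p_event)


# Assumed probability of a disruptive event per scenario (illustrative only).
P_RARE = 0.001
needed = simulations_needed(P_RARE)
hits_small = count_rare_events(P_RARE, runs=100)
hits_large = count_rare_events(P_RARE, runs=100_000)
```

With an event this rare, a run of only 100 scenarios will usually contain no occurrences at all, while a run of 100,000 contains enough of them to estimate their impact.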

Olga Kadeshnikova

Fantastic. Thank you. So the next question is specifically to do with the polygons and how they're created. So how do you distinguish an anchorage polygon from a berth polygon in building the model? For example, by defining the speed of the vessel, et cetera?

Matt Morgan

I'll jump in here. So I mean, there's a lot of different factors at play here. Speed is one. In fact, speed is used a lot for stop detection, right, if the ship's position hasn't moved very much. And you'll have differences based on AIS-reported speed or calculated speed, which is speed calculated between two points. So that certainly matters a lot. But there's a lot of things that go into differentiating between those. One of them is how close these stop events are to land, as anchorage areas are primarily not adjacent to land. Status information from AIS messages helps too. Of course, this is just on the weighting side. We can't take just about anything that we take in as an input for gospel. We always have to weight inputs appropriately. And so status is also helpful to begin to do that. And over time, you get enough of a sample size to be able to group them. And there are other things that we might have as well. We use a lot of proprietary data or commercial data from some sources. As independent third parties, we and Spire are not actively participating in these trades, which gives us the ability to sit in between them and collect data from both the owner-operators and the cargo owners. That gives us a perspective into each of their commercial activities. Our customers find that sharing as much commercial information as they have creates a much better experience for them, because our models and our outputs are informed by their business. We take a very focused approach to keeping the confidential data confidential, but use all the commercial data to inform things like polygon creation, predictions, cargo identification, and ship behaviors, which improves the model and the output for all customers. So it's a combination of a lot of public data, private data, and new behavior detection as inputs to create them. And just to throw one quick thing in as well. When we were looking, I remember showing the AIS spots.
You could see this pattern that Matt described of vessels going in a circle around the anchorage, around the anchor. And that is another type of behavior that you could detect that's typical of anchorage stops and very different from a vessel that's at a berth.
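
The "calculated speed" mentioned above, meaning speed derived from two consecutive positions rather than taken from the AIS message, can be sketched with the haversine great-circle distance. The function names and the 0.5 knot stop threshold are assumptions chosen for illustration, not values given in the webinar, and a real stop detector would weigh this alongside the other inputs described.

```python
import math
from typing import Tuple

EARTH_RADIUS_NM = 3440.065  # mean Earth radius (6371 km) in nautical miles

LatLon = Tuple[float, float]


def haversine_nm(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two (lat, lon) points in nautical miles."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(h))


def calculated_speed_knots(a: LatLon, t_a: float, b: LatLon, t_b: float) -> float:
    """Average speed in knots between two fixes; timestamps are in seconds."""
    hours = (t_b - t_a) / 3600.0
    if hours <= 0:
        raise ValueError("second fix must be later than the first")
    return haversine_nm(a, b) / hours


def looks_stopped(speed_knots: float, threshold_knots: float = 0.5) -> bool:
    """Flag a candidate stop; the threshold is an assumed, tunable value."""
    return speed_knots < threshold_knots


# One degree of longitude along the equator covered in one hour.
speed = calculated_speed_knots((0.0, 0.0), 0, (0.0, 1.0), 3600)
```

Comparing this value with the AIS-reported speed is one way to spot the differences between the two that are mentioned above.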

Olga Kadeshnikova

Awesome. Thank you so much, guys. So we have a practical question next. So the question says, I have seven days of AIS data from Mombasa port in Kenya. Is this data enough in terms of volume slash size to train a ship collision avoidance model based on machine learning algorithms?

Matt Morgan

Well, I mean, we're so data hungry. We would probably say no. But it's all shades of gray here. How accurately do you want this type of model to represent things? Machine learning models are hungry for more samples, more data, more features. And as that data set grows, whether by going back further and further in history, by growing the scope of the data, or by having adjacent data sets, all of those things can make a substantial difference in the performance of the model. I think that really a model can be made with any amount of data, but you may not receive the results that you expect or want out of it. And just as Sergey was saying earlier on the Monte Carlo side, not doing enough simulations doesn't give you enough alternative pathways for what might happen. So more simulations or more data are always going to have an impact, but it's going to be a long tail for contribution to the accuracy of the model. Most of the impact will happen at the beginning. But as the tail goes longer and longer, you can approach higher and higher accuracy numbers. Sergey, do you have any other points on that?

Sergey Crane

Yeah, it's an interesting question. I mean, it's not a type of model we thought about very much, any type of collision stuff. But one potential problem with it is that you wouldn't have sort of a lot of history of that happening in order to be able to sort of like figure out exactly the type of behavior that would lead to it. Unless I'm wrong about that, but it's possible. But I would say that that's probably not an obvious use case.

Olga Kadeshnikova

Okay, thank you so much. So now some questions for Spire. So I'll be looking over to Shanu to help with these. So there are a few questions which I think are along similar lines. So I'll read them out to you, Shanu, and then we can see if you want to address them separately or kind of combine them together. So the first one, when looking at post-processed data, how far is this processing? Is it only the merge of static and dynamic messages or is Spire also addressing spoofing and ship conflicting issues? So that's one. And then the other one, do you only rely on MMSI slot contained info in order to identify ships in a passive way, or do you target search and retrieve ID per vessel from your own database? Or finally, do you use a retrofit database in order to duly notify vessel space positions temporarily? So I think here, what we can kind of get at is talk about our deduplication algorithm that we have in Maritime 2.0 and how this is helping customers essentially to have more cleansed, more processed data. So when they save that data into their database, it's an extra level of processing. So Shanu, maybe if you can review how that looks on our side.

Shanu Suman

Yeah. So first of all, I'll talk about the combined post-processed question and how far it goes back. So I think the combined post-processed format goes back to even 2011, but that is standard AIS data where we join based on MMSI number and there is no special processing done there. Last year, we launched a new generation of our APIs to track vessels on a live basis. That's called the Maritime 2.0 GraphQL API, where we also did some kind of deduplication, where we try to identify vessels not just based on MMSI number, but also on the past positions they reported. So we do not report jumps in position. And that's the kind of thing that we do in the data, which you can track for live vessels. This data only goes back to mid-2022. Historically, we have not done this deduplication yet. And maybe there was a third part of the question, Olga, that I do not remember. Yes. If you can repeat that.

Olga Kadeshnikova

It was talking more about the vessel IDs specifically. So this is where I wanted to bring up the Maritime 2.0 deduplication algorithm and how we're doing that. So it was about, do we identify the vessels per ID from our own database?

Shanu Suman

Yes. So when we are doing the deduplication, we have also given a special Spire ID to all these vessels. So we now have a fixed ID for each vessel. That's definitely there. But it doesn't go back historically, as I said, very far. It only started last year.

Olga Kadeshnikova

All right. Thank you very much. So we have had a lot more questions coming in. We have seven minutes. So for everybody that doesn't get their question answered, we will come back to you by email. And of course, you have the emails of the team here. Feel free to get in touch. And then for Spire, just contact cx@Spire.com. We will come back to you with questions if you have some for us too. So moving back to Freightflows. So team, we have a question here. Hi. Do I need to include Anchorage Polygon together with the Berthing Polygon for calculating port efficiency? Or would only the Berthing Polygon be sufficient? Also what data will help address this problem, like from the data types available in your initial slides, i.e. raw, processed, combined?

Matt Morgan

I mean, I can take part of that, which is that it depends on kind of what you mean by port efficiency. I mean, for some measurements around port efficiency, you would want to know how much time is spent at the Anchorage versus how much time it takes to get from the Anchorage to the Berth. For anything like that, you would need to have both Anchorage and Berth polygons. So that's sort of the high level answer.

Sergey Crane

Yeah, I'll jump in there because I think that hits on a key point. What is port efficiency in your definition? I mean, there are a couple of different ones that I'm thinking about right now. One might be the throughput at the Berth level. Another one might be the lineups at the Anchorage. Another might be the maneuvering time between those different spots. And some of the queuing that happens, especially within channels as well. So I think that there's a lot of different definitions of efficiency here. So I think that you really need to hone in on what you're specifically trying to measure. But certainly having more visibility into the specific actions taken from within the port at the highest granularity possible is going to give you a lot more options in being able to produce efficiency results. And those efficiency results can be surprising, especially if you measure them in different ways using multiple sets of polygons or movement information.
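
To make the different definitions concrete, here is a minimal sketch of three per-call measures that become available once a vessel's entry and exit times for both an anchorage polygon and a berth polygon are known. The `PortCall` record, its field names, and the sample timestamps are hypothetical; the polygon matching that would produce these timestamps is not shown.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict


@dataclass(frozen=True)
class PortCall:
    """Polygon entry and exit times for one port call (hypothetical schema)."""
    anchorage_in: datetime
    anchorage_out: datetime
    berth_in: datetime
    berth_out: datetime


def hours_between(start: datetime, end: datetime) -> float:
    """Elapsed time from start to end in hours."""
    return (end - start).total_seconds() / 3600.0


def port_call_metrics(call: PortCall) -> Dict[str, float]:
    """Split one port call into waiting, maneuvering, and berth time."""
    return {
        "anchorage_wait_h": hours_between(call.anchorage_in, call.anchorage_out),
        "anchorage_to_berth_h": hours_between(call.anchorage_out, call.berth_in),
        "berth_time_h": hours_between(call.berth_in, call.berth_out),
    }


# Made-up example call.
call = PortCall(
    anchorage_in=datetime(2023, 1, 1, 0, 0),
    anchorage_out=datetime(2023, 1, 2, 6, 0),
    berth_in=datetime(2023, 1, 2, 8, 0),
    berth_out=datetime(2023, 1, 3, 8, 0),
)
metrics = port_call_metrics(call)
```

With only a berth polygon, just the last of the three measures could be computed, which is why the choice of definition determines which polygons are needed.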

Olga Kadeshnikova

Great. Thank you, Matt. So next question, also for you guys. Do you use a gray box model?

Matt Morgan

Sorry?

Olga Kadeshnikova

A gray box model, so this was in the question. I'm not technical enough to know all the different models. I mean, if not, we can always take it offline.

Matt Morgan

I'm not familiar with the gray box model. Perhaps Sergey is. I'd say that we've tried and tested out a lot of different kinds of models, including models that are not historically used within maritime. We've taken methods that we've seen from outside of maritime, places in retail or in other land-based logistics. We've tried a lot, even health care as well. We've tried models from a lot of different non-obvious sources. And each of them produce really interesting results.

Sergey Crane

So, yeah, sorry. I just wanted to jump in. I can say, so we do use black, white, and gray, so all kinds of models. The question is mostly about visibility into how the model is coming up with its answers. So for example, a regression model where you know all the coefficients that the model uses would be a white box model. So you'd have full visibility as to what it's doing. An LSTM, which was the example I brought up before, would be a black box model because you have very little ability to figure out why the model is forecasting what it's forecasting. But it still has plenty of uses. It's just, you know, you have more risk that maybe it has some assumptions that you're not aware of. But it definitely has its uses. And then the gray ones are attempts to understand what neural networks are doing. So you try to view these things and figure it out. So we do use all three to some extent. And I believe all three have their uses. With maritime analytics, you know, you don't have as much need for the gray box models because there isn't social bias or the type of problems that some neural networks could have; it doesn't really apply to maritime. You do have to make sure you have a representative sample, but that is much easier to do than, say, facial recognition or something like that. But there are some things where we really do want to know what it is that our models are doing. So some gray box models are being used for this.

Olga Kadeshnikova

OK, great. Thank you so much. So now back over to Spire Maritime to Shanu. What is the real time frequency in terms of time interval of AIS input data feed? As I will be using Spire for my trajectory forecasting. I think what we're talking about is probably the latency and the refresh of the vessels.

Shanu Suman

Exactly. So we have a very low latency data available as well. You can get less than one minute of latency in receiving Spire AIS data. So you can definitely subscribe to that. The refresh one changes globally depending on where the vessel is. So it would be best if we can calculate it specifically for your vessels of interest or for your area of interest.

Matt Morgan

I'll even contribute to that. AIS itself will produce messages at 20-second intervals when vessels are near land or maneuvering. But when they're at speed or further from land, you're going to get a lot fewer messages. This is where I think Spire really excels in their coverage. Being in low Earth orbit, they're able to capture more messages and have faster refreshes when vessels are away from land, to try to fill in those kind of black boxes that you may have had with other, more traditional satellite AIS coverage companies. But AIS messaging itself is tricky like that. You'll have a significant amount of data that over-represents some areas, and less data where you may want to get data updated. So one of the approaches for refreshes or latency that Spire uses, which we appreciate, is keeping the last known message up to that point in time. And we've found the refresh rate that Spire has been providing to be more than sufficient in our modeling.

Shanu Suman

That's very true. Terrestrial AIS gives very low refresh rates in coastal regions and satellite helps us in open sea areas. And then we have the third flavor, dynamic AIS, which gives us a great uplift in high traffic zones. If I combine the three sources, a standard median refresh rate is actually just six minutes. Ideally, vessels transmit position messages very frequently, as often as every two to three seconds, and static messages are transmitted every six minutes. But those are ideal cases.
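
For anyone wanting to measure the refresh for their own vessels or area of interest, as suggested above, the median gap between consecutive received messages is simple to compute. This is a generic sketch with hypothetical function names and made-up timestamps; it does not use Spire's API and does not reproduce the six-minute figure from real data.

```python
from statistics import median
from typing import List, Sequence


def refresh_intervals_minutes(timestamps: Sequence[float]) -> List[float]:
    """Gaps in minutes between consecutive message timestamps (in seconds)."""
    ordered = sorted(timestamps)
    return [(later - earlier) / 60.0 for earlier, later in zip(ordered, ordered[1:])]


def median_refresh_minutes(timestamps: Sequence[float]) -> float:
    """Median gap between messages; needs at least two timestamps."""
    gaps = refresh_intervals_minutes(timestamps)
    if not gaps:
        raise ValueError("need at least two timestamps")
    return median(gaps)


# Made-up receive times, in seconds, for one vessel.
sample_timestamps = [0, 120, 480, 1200, 1560]
median_gap = median_refresh_minutes(sample_timestamps)
```

The median is less sensitive than the mean to the occasional long gap in open sea, which is one reason it is a common way to summarize refresh.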

Olga Kadeshnikova

Great. So thank you everybody for participating. We are out of time. All the questions that didn't get answered, we will come back to you by email. So please bear with us. And yeah, thanks to all of the panelists as well for your great contributions. Really fantastic to see. And hopefully we will speak to everybody again very soon. All right. Thanks everyone.

Panelists

  • Olga Kadeshnikova
    Customer Success Leader
    Spire Maritime
  • Shanu Suman
    Sr Mgr, Sales Engineering
    Spire Maritime
  • Matt Morgan
    Chief Executive Officer
    Freightflows
  • Greg Katz
    Director of Business Dev.
    Freightflows
  • Sergey Crane
    Head of Analytics
    Freightflows