A Citywide Mesh Network – Science Fiction or Future Fact?

I recently finished Neal Stephenson’s excellent “Seveneves”. The plot is that the moon blows up due to an unknown force. Initially people marvel at the now fragmented moon, but thanks to the analysis of one of the protagonists it becomes clear that these fragments will keep fragmenting and eventually rain down on earth. The lunar debris turns into meteors that make the earth a less than pleasant and very hot place to live. In order to survive, the human race decides to build a space station composed of a number of individual pods (designed by the architects!). This design is chosen so that the pods can evade incoming debris the way a shoal of fish evades a shark.

Naturally there is no Internet in space, but the natural drive towards having a social network (called Spacebook) forces the ever inventive human race to find another way to implement the Internet. The resulting solution is a mesh network.

By definition, a mesh network:

“is a local network topology in which the infrastructure nodes (i.e. bridges, switches and other infrastructure devices) connect directly, dynamically and non-hierarchically to as many other nodes as possible and cooperate with one another to efficiently route data from/to clients”

The good thing about mesh networks is that every node can serve as a router, so even if one or a few nodes fail (as they might in an orbit filled with lunar debris) the network would still work. Contrast this with a network topology where one or even a few pods had central routers. Our present-day Internet has a similar weak spot in the hierarchical Domain Name System, where name lookups ultimately depend on a small set of root servers. If these were all taken out, finding anything by name would stop working. With a wireless mesh network, the network would continue to work as long as there are nodes that can reach each other. But enough of the science fiction; let’s get back to the real world.
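To make the contrast concrete, here is a minimal Python sketch. The pod names, the link layout, and the helper can_reach are all made up for illustration; it only checks whether a path still exists after a node fails, not how a real mesh protocol would pick a route.

```python
from collections import deque

def can_reach(links, start, goal, failed=frozenset()):
    """Breadth-first search: is there a path from start to goal that avoids failed nodes?"""
    if start in failed or goal in failed:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in links.get(node, ()):
            if neighbor not in failed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False

# Hypothetical mesh: five pods, each linked to at least two others.
mesh = {
    "A": {"B", "E"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C", "E"},
    "E": {"A", "D"},
}

# Hypothetical hub layout: every pod talks only to one central router.
hub = {
    "HUB": {"A", "B", "C", "D", "E"},
    "A": {"HUB"}, "B": {"HUB"}, "C": {"HUB"}, "D": {"HUB"}, "E": {"HUB"},
}

mesh_ok = can_reach(mesh, "A", "C", failed={"B"})   # A -> E -> D -> C still works
hub_ok = can_reach(hub, "A", "C", failed={"HUB"})   # nothing works without the hub
print(mesh_ok, hub_ok)
```

Losing pod B costs the mesh one route but not the connection, while losing the hub cuts every pod off from every other.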

The City Wide Mesh Network

New York City, where I work, has had its own share of calamities. Not quite the scale of the moon blowing up, but September 11, 2001, was still a significant disaster. One effect was that the cell network broke down due to overload. This greatly reduced first responders’ ability to communicate. In order for this not to happen again, NYC built its own wireless network, which we call NYCWiN. For years this network has served the City well, but the cost of maintaining a dedicated citywide wireless network is high compared to the price and quality of modern commercial cell networks.

However, the cellular network is also patchy in some parts of the city, as most New Yorkers have noticed. It is also expensive if we want to supply each IoT device in the City with its own cellular subscription. Typically a cellular connection will have a lot more bandwidth than most devices will ever use anyway. So, might it be possible to rethink the whole network structure and gain some additional benefits in the process? What if we created a citywide mesh network instead? It could function in the following way:

A number of routers would be set up around the city. Each would be close enough to reach at least one other router. When one router fails there are others nearby to take over the network traffic. These routers would form the fabric of the citywide mesh network.

Some of these primary routers would be connected to Internet routers through either cables or cellular connections. These special routers would serve as gateways to the Internet. In this way the network would effectively be connected to the Internet and we would have a mesh Internet. This is actually not something new; in fact it already exists! It has been implemented by a private group called NYC Mesh. They have created their own proprietary routers for this, but wouldn’t it be cool if the City scaled a similar solution for use by all New Yorkers and visitors? Free of charge, like the LinkNYC stands. Oh, and could those stands not be the Internet gateways we thought of above? Think about it: what if wifi was just pervasive in the air of the City for everyone to tap into?

Better than LTE

The beauty of this is that this network may even be better than the cellular network, since it can more easily be extended to parts of the city that have patchy coverage from cell towers. We would just have to set up routers in those areas and make sure there was a line of connection to nodes in the existing network or to an Internet gateway. It would even be possible to extend the network indoors, even to the subway.

With thousands of IoT devices going online in the coming years, costs for Smart City solutions will increase significantly. Today it is not cheap to have a device connected to the Internet through a cellular carrier. Since it is essentially a cell phone connection, it typically also costs about the same. This may make economic sense, compared to the alternatives, for the number of devices connected today. But when scaling towards millions of devices, this approach is untenable in the long run. The City Wide Mesh Network could be a scalable low cost alternative for all of the City’s IoT devices to connect to the Internet.

Building and maintaining the network

It is quite an effort to implement this network and maintain it, but there is also a way to get around that. Today it is possible for commercial carriers to put up cellular antennas on City property if permission is granted. What if we made all permissions contingent on setting up a number of mesh routers for the citywide mesh network? Then, every time a cellular or other antenna was set up, the citywide mesh network would be strengthened.

It could simply be made an obligation of the carriers that are granted use of City property for commercial purposes that they maintain their part of a free citywide mesh network. The good thing about a mesh network is that there is no central control, and making it operational would just entail following some standards and adding and replacing network nodes. The City would have to decide on the standards to put in place: what equipment, what protocols etc. Not an easy task perhaps, but also not impossible.

In order to maintain the health and operation of the network, monitoring would have to be in place. We could see in real time which nodes were failing and replace them. It would also be possible to elastically provision nodes when traffic patterns and utilization make it necessary.
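One simple way such monitoring could work is a heartbeat check: every node reports in regularly, and a node that has been silent for too long is flagged. The Python sketch below assumes a 60 second timeout and made-up router names and timestamps; a real deployment would pick its own thresholds and transport.

```python
import time

HEARTBEAT_TIMEOUT_SECONDS = 60  # assumed threshold, not a recommendation

def failing_nodes(last_seen, now=None, timeout=HEARTBEAT_TIMEOUT_SECONDS):
    """Return the sorted ids of nodes whose last heartbeat is older than the timeout."""
    if now is None:
        now = time.time()
    return sorted(
        node for node, seen_at in last_seen.items() if now - seen_at > timeout
    )

# Hypothetical last-heartbeat timestamps in seconds, checked at now = 1000.
last_seen = {"router-1": 990, "router-2": 900, "router-3": 955}
stale = failing_nodes(last_seen, now=1000)
print(stale)  # ['router-2']
```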

World Wide Standard

Now here is where it could get interesting, because the issue today in mesh networking, as in most of IoT, is that there are no common standards. Vendors have their own proprietary standards and no interest in making them compatible. History has shown us that the only way to impose standards on any industry is through governmental mandate. New York could of course not mandate a standard, but what if the City forced all vendors who wanted to sell to the New York City Wide Mesh Network to comply with a given standard? The industry would have to develop their products to this common standard. Since New York has the size to create a critical mass, this could possibly be the start of a new mesh network standard.

New York works together with a lot of other cities that often take inspiration from us on issues of technology. An example is open data, which originated in New York but has now spread to virtually every city of notable size. The same could be the case for the City Wide Mesh Network design and the standards used. That way, cities could have a blueprint for bringing pervasive low cost wifi to all citizens and visitors.

Fiction or Fact?

If a catastrophe similar to 9/11 were ever to happen again, the mesh network would adapt and still be able to send data around through healthy nodes, possibly slower, but it would not fail. Only the particular nodes that were hit would be out; the rest of the network would stay intact. It is, of course, possible that islands without connectivity would appear, but that is to be expected. As long as the rest of the network keeps working, that is ok.
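The idea of islands can be sketched in a few lines of Python. The six-router layout, the single gateway "G", and the helper islands are hypothetical; the sketch groups the surviving routers into connected clusters and checks which clusters still contain a gateway.

```python
def islands(links, failed=frozenset()):
    """Group the surviving nodes into connected clusters ("islands")."""
    remaining = set(links) - set(failed)
    groups = []
    while remaining:
        start = remaining.pop()
        group = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in links[node]:
                if neighbor in remaining:
                    remaining.discard(neighbor)
                    group.add(neighbor)
                    stack.append(neighbor)
        groups.append(group)
    return groups

# Hypothetical six-router mesh; "G" is the only Internet gateway.
links = {
    "G": {"R1", "R2"},
    "R1": {"G", "R2", "R3"},
    "R2": {"G", "R1", "R3"},
    "R3": {"R1", "R2", "R4"},
    "R4": {"R3", "R5"},
    "R5": {"R4"},
}
gateways = {"G"}

after_r1 = islands(links, failed={"R1"})  # one cluster: traffic reroutes via R2
after_r3 = islands(links, failed={"R3"})  # two clusters: R4 and R5 are cut off
online = [sorted(g) for g in after_r3 if g & gateways]
cut_off = [sorted(g) for g in after_r3 if not g & gateways]
print(len(after_r1), online, cut_off)
```

In this layout R1 has a redundant neighbor, so losing it changes nothing for the others, while R3 is the only bridge to R4 and R5. Where islands appear depends entirely on how many overlapping links the routers have.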

It is actually possible to create a robust, low cost, citywide network with better coverage than cell phones that would be developed and maintained by third parties, all the while helping the world by forcing the industry to implement standards that would improve interoperability for IoT devices. This is not necessarily science fiction: everything is within the realm of possibility.

The Data Deluge, Birds and the Beginning of Memory

One of my heroes is the avant-garde artist Laurie Anderson. She is probably best known for the unlikely hit “O Superman” in the eighties and for being married to Lou Reed, but I think she is an artist of comparable or even greater magnitude. One of her later albums includes a typical Laurie Anderson song called “The Beginning of Memory”. Being a data guy, this naturally piqued my interest. It was sort of a win-win scenario. The song is an account of a myth from an Ancient Greek play by Aristophanes, “The Birds”. Here are the lyrics to the song:

There’s a story in an ancient play about birds called The Birds
And it’s a short story from before the world began
From a time when there was no earth, no land
Only air and birds everywhere

But the thing was there was no place to land
Because there was no land
So they just circled around and around
Because this was before the world began

And the sound was deafening. Songbirds were everywhere
Billions and billions and billions of birds

And one of these birds was a lark and one day her father died
And this was a really big problem because what should they do with the body?
There was no place to put the body because there was no earth

And finally the lark had a solution
She decided to bury her father in the back of her own head
And this was the beginning of memory
Because before this no one could remember a thing
They were just constantly flying in circles
Constantly flying in huge circles

While myths are believed to be literal truth by very few people, they usually point to some more abstract and deeper truth. It is rarely clear exactly how, or what it means. But I think I see a deeper point here that may actually teach us something valuable. Bear with me for a second.

The Data Deluge and The Beginning of Memory

The feeling I got from the song was eerily similar to the feeling I get from working with the Internet of Things. Our phones constantly track our movements; our cars record data on the engine and performance. Sensors that monitor us every minute of our lives are silently invading our world. When we go through the streets of Manhattan we are monitored by the NYPD’s system of surveillance cameras, Alexa is listening in on our conversations and Nest thermostats sense when we are home.

This is what is frequently referred to as the Internet of Things. The analogy to the story about the birds is that until now we have just been flying about in circles with no real sense of direction or persistence to our movement. What is often overlooked is that being able to measure the movement and status of things only amplifies the cacophony of the deafening sound of billions and billions of birds, sorry, devices.

This is where the birth of memory comes in. Because not until the beginning of memory do we gain firm ground under our feet. It is only with memory that we provide some persistence to our throngs of devices and their song. We capture signals and persist them in one form of memory or another.

The majority of interest in IoT is currently dedicated to exactly this process: how do we capture the data? What protocols do we use? Is MQTT better or does AMQP provide a better mechanism? What is the velocity and volume of the data? Do we capture it as a stream or as micro batches?

We also spend a great deal of time figuring out whether it is better to store in HDFS, MongoDB, or HBase. Should we use Azure SQL Data Warehouse or Redshift or something else? We read studies about performance benchmarks and guidelines for making these choices (I do at least).

These are all worthwhile and interesting problems that also capture a large part of my time, but they also completely miss the point! If we refer back to the ancient myth, the lark did not want to remember and persist everything. She merely wanted to persist the death of her father. She only wanted to persist something because it was something that mattered!

What Actually Matters?

And this is where we go wrong. We frequently just persist the same incessant bird song without pausing to think about what actually matters. We should heed the advice of the ancient myth and reflect on what is important to persist. I know this goes against most received wisdom in BI and Big Data, where the mantra has been “persist as much as possible, you never know when you are going to need it”.

But actually the tides are turning on that view due to a number of new limiting factors such as storage, processing and connectivity. Granted, storage is still getting cheaper and cheaper and network bandwidth more and more ample. Even processing is getting cheaper. However, if you look closely at the fine print of the cloud vendors, services that process data and move data are not all that cheap. And you do need to move the data and process it in order to do anything with it. Amazon will allow you to store anything at next to no cost in S3, but if you want to process it with Glue or query with Athena it is not so cheap.

Another emerging constraining factor is connectivity. Many devices today still connect to the Internet through the cellular network. Now, cellular networks are operated by carriers that pay good money for the frequencies used. This money is passed on to the users. On average a device is not different from a cell phone, so naturally you have to pay something close to the price of a cell phone connection, around $30 to $40. I do get the enthusiasm around billions of devices, but if the majority of these are connecting to the internet through the cellular radio spectrum, then the price is also billions of dollars.
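The arithmetic behind that last sentence is short enough to spell out. In this Python sketch the figure of one billion devices is an assumption picked to match "billions of devices", and the $30 to $40 is taken as the price of one subscription per device as quoted above.

```python
def connectivity_cost(devices, price_per_device):
    """Total cost if every device needs its own cellular subscription."""
    return devices * price_per_device

devices = 1_000_000_000  # assumed: one billion devices, all on cellular
low = connectivity_cost(devices, 30)
high = connectivity_cost(devices, 40)
print(f"${low / 1e9:.0f} to ${high / 1e9:.0f} billion")
```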

Suddenly, the bird song is not so pleasant to most ears and our ornithological enthusiasm is significantly curbed. These trends are sufficient to warrant us starting to think about persisting only what actually matters. That can be a lot if you really have a feasible use case for storing, for example, all your engine data (which you might). It could also be that the 120 data points per second from your connected toothbrush turn out not to matter that much.

And I haven’t even started to touch on how you would ever find sense in all the data that you persisted to memory. Most solutions do not employ adequate metadata management or data catalogs or other solutions that would tell anyone what a piece of data actually “means”. If we don’t know or have any way of knowing what a piece of data means there is absolutely no reason to store it. If you have a data feed with 20 variables but you don’t know what they are, how is it ever going to help you?

Store what matters

This can actually be turned into a rule of thumb about data storage in general: data should be stored only to the extent that someone feels it matters enough to describe what it actually is. If no one can be bothered to pin down a description of a variable and no one can be bothered to store that description anywhere, it is because it doesn’t matter.
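As a minimal sketch of how that rule of thumb could be enforced, assume a data catalog that maps each variable name to a human-written description. The catalog contents, the variable names and the function fields_worth_storing below are all invented for illustration; the point is only that a field without a description never reaches storage.

```python
# Hypothetical data catalog: variable name -> human-written description.
catalog = {
    "engine_temp_c": "Engine coolant temperature in degrees Celsius",
    "rpm": "Engine revolutions per minute",
    "x17": "",  # listed, but nobody has described it
}

def fields_worth_storing(record, catalog):
    """Keep only the fields that someone has bothered to describe."""
    return {
        name: value
        for name, value in record.items()
        if catalog.get(name, "").strip()
    }

# Hypothetical incoming reading with two described and two undescribed fields.
reading = {"engine_temp_c": 91.5, "rpm": 2400, "x17": 0.3, "brush_angle": 12}
stored = fields_worth_storing(reading, catalog)
print(stored)  # {'engine_temp_c': 91.5, 'rpm': 2400}
```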