Our work applies some very relevant practical constraints. It works off the premise of operating with very few sensors, which would have to be the case in reality. The question that has to be answered is where best to put those few sensors.
Pedro Pinto
Researchers at the École Polytechnique Fédérale de Lausanne (EPFL) published a paper last week in the journal
Physical Review Letters which describes the development of an algorithm for finding sources within large interconnected networks.
With hierarchical structures, it can be relatively simple to trace the origin of a rumour, a virus or contamination, for example, but with vast and complex entities such as social networks or transport networks the task is made much more difficult. The chances of monitoring every node and every connection are non-existent.
As such,
Pedro Pinto from the Audiovisual Communications Laboratory, along with Professors Martin Vetterli and Patrick Thiran, also of the EPFL, have tackled the problem with an algorithmic solution which could have many potential real-world applications in future. I spoke to Pinto to find out more about how this approach works and whether it has any foreseeable limitations. I began by asking why modern communication networks make it more difficult to detect the source of a message or rumour and why past efforts have been fruitless.
"This is the first work that I’m aware of which has attempted to establish a mechanism for detecting the source of a rumour," Pinto said. "In the past it has been assumed that we can observe all the nodes of a communication network. This would mean putting a sensor in every single node of the internet, for example, which is simply not possible because there are billions of devices."
For Pinto and his colleagues it was crucial to accept from the outset that resources and funds will always be limited.
"Our work applies some very relevant practical constraints," he explained. "It works off the premise of operating with very few sensors, which would have to be the case in reality. The question that has to be answered is where best to put those few sensors."
One of the examples cited by the team is a scenario where a rumour is started on Facebook with a message being sent to 500 of your friends and friends of friends. The algorithm the researchers developed could make it possible to find the origin of the rumour by looking at messages to as few as 15 recipients. Pinto explained the mechanism whereby it is possible to root out an individual source by monitoring so few points in a large network.
"In the way that this algorithm works, we were inspired by what happens in mobile networks," he said. "A technique called triangulation is able to pinpoint the location of your cell phone by measuring the distance from it to three or more cell towers. With three distances you can draw three big circles which will intersect at your location. We were inspired by those ideas and we applied them to other networks. What we’re doing is exactly the same; we’re triangulating the source with very little information."
Although the study is only indicative of a preliminary level of development and only considered three or four case studies, Pinto is hopeful that there are many other directions to be explored. The team carried out practical studies with real networks to validate their method, including modelling based on the sarin gas attack on the Tokyo subway system in 1995.
"We looked at detecting sources of contamination in infrastructure networks. For example if a terrorist released a contaminant at one subway stop, we were able to determine where that contamination originated on the subway network.
"We also looked at detecting the source of cholera in an epidemic that occurred ten years ago. We took real data on human transport networks, water networks and river networks and determined which village was the source of the contamination, having looked at only a fraction of those villages affected."
As well as the types of situation mentioned above, the method is also potentially useful for detecting the origin of spam messages and computer viruses, and for the development of more effective marketing strategies.
One of the weaknesses of this approach is that, for the algorithm to work, it is imperative to know exactly how the nodes are related to each other. In the case of social networks, for example, there are very particular constraints on how individual users are connected which limits the possible sources to relatively few nodes.
"To continue with the Facebook example, it is necessary to know exactly who is friends with whom," Pinto told me. "This information is not available very often. If you are looking at a terrorist network you don’t know who is connected to whom. You don’t know who talks to whom. That’s one of the difficulties, but I think it can be overcome with more research."
Pinto also explained that whether the approach can be used in a preventative capacity will largely depend on the specific nature of the target application.
"In the subway contamination example, I think it could be used for prevention," he offered. "By prevention I mean that the authorities would be able to contain the contamination as quickly as possible; I wouldn’t go as far as to say they could avoid it completely, but once it had happened they would be able to quickly pinpoint the source and then contain it."
Now that the algorithm to locate the source has been constructed, the next step for Pinto and his co-authors is to consider methods of determining the points at which networks should be monitored to obtain the best coverage with minimum resources.
"We’re looking at how and where to physically place the sensors in order to minimise cost," he concluded. "We need to ask, for example, which nodes of a social network should be monitored in order to detect rumours."