Machine versus manhole

Computer scientists take on a classic New York hazard

Every so often in New York City, a disk of cast iron weighing up to 300 pounds will burst out of the street and fly as high as several stories before clattering back to the blacktop. Flames, smoke or both may issue from the breach, as if somebody had pulled hell’s own pop-top.

LOOK OUT: This manhole (yellow arrow) in Manhattan’s Chelsea neighborhood was ranked 20th most likely to explode, smolder or catch fire out of the borough’s more than 50,000 manholes and utility service boxes. Its oldest cable was installed in the 1920s; several others were replaced after a burnout event in 2004. Credit: C. Rudin et al./Machine Learning 2010

Manhole explosions aren’t just spectacular; they’re dangerous. As one firefighter observed after a manhole exploded near Times Square in May: “It’s not Disneyland, people. Get the hell out of the way.”

Ever since Thomas Edison fired up the city’s commercial electric grid in 1882, New Yorkers have had to contend with the random hazards of smoking, flaming and exploding manholes. Many of the blasts result from decrepit wiring, which can lead to sparks. Throw in a bit of gas and a confined space and, like a combustion engine, the blast can move metal. Until recently, there was no way of knowing where or when the next outburst would occur; repairs commenced only after a manhole had growled.

But in 2004 Con Edison began a proactive inspection program, with the goal of finding the places in New York’s snaking network of electrical cable where trouble is most likely to strike.

The company also called upon a team of Columbia University researchers for help in predicting which of New York City’s manholes might be the next to blow. Led by Cynthia Rudin, now at MIT, the scientists developed an algorithm that directs a computer to identify subterranean trouble spots. Now a report in the July issue of Machine Learning suggests the researchers are winning the battle of machine versus manhole.

“To us it was like solving an ancient puzzle, but one that we weren’t sure we were going to crack, and one that nobody had solved before,” Rudin says.

Rudin and her team tackled Manhattan first. Beneath the borough’s streets and avenues lie 21,000 miles of cable, enough to girdle more than three-quarters of the Earth.

The researchers set out to rank the manholes of Manhattan by vulnerability to serious events, such as fires and explosions. They had piles of historical data: Con Edison has records on its miles of cable dating back to the 1880s. The team also had 10 years’ worth of “trouble tickets,” more than 61,000 reports typed by dispatchers as they directed crews in the field.

Some tickets recorded relevant past events such as fires, explosions, smoking manholes or flickering lights. There was also a huge amount of irrelevant information, says Rudin: “Parking information for the Con Ed vehicle, or the fact that there is a customer that has a language problem, or other things like that.” Order had to be created from confusion, she says.

Knowing the past doesn’t necessarily mean you can predict the future, and Rudin wasn’t sure prediction was possible. Serious manhole events are rare: only a few hundred occur each year, even though there are 51,000-odd manholes and service boxes in Manhattan.

“Finding a pattern when something is very rare is very hard,” says computer scientist Gary Weiss of Fordham University in New York City. “If you only have a few examples, there are so many patterns that can fit those few examples … you can’t really tell the difference between a pattern that is meaningful and one that is coincidental.”

The algorithm’s job was to “learn” from the past records and find meaningful patterns. Then it could predict the likelihood that a particular manhole with particular characteristics would have a future flare-up.

The researchers realized they had to take the long view. “We were not getting anywhere by trying to predict events in the short term,” says Rudin. They developed what they call a hot-spot theory. The team discovered that manholes with larger cables, and so a larger amount of insulation subject to decay and thus to sparking, were more vulnerable to serious events.

Con Edison blind-tested the team’s model by withholding information on a recent set of fires and explosions. The top 2 percent of manholes ranked as vulnerable by the algorithm included 11 percent of the manholes that had recently had a fire or explosion, Rudin notes.
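To put those percentages in perspective: a ranking no better than chance would place about 2 percent of the recently troubled manholes in its top 2 percent, so capturing 11 percent is roughly five and a half times what random guessing would manage. The short Python sketch below shows one way such a check could be computed. It is an illustration only, not the team's code; the function names (`capture_rate`, `lift`) and the 100-manhole toy data are made up for the example.

```python
def capture_rate(scores, had_event, top_fraction):
    """Share of all event manholes that land in the top slice of the ranking.

    scores       -- one vulnerability score per manhole (higher = riskier)
    had_event    -- 1 if the manhole had a withheld fire or explosion, else 0
    top_fraction -- size of the slice to inspect, e.g. 0.02 for the top 2 percent
    """
    n = len(scores)
    k = max(1, round(n * top_fraction))  # number of manholes in the slice
    # Indices of manholes, ordered from highest score to lowest.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    total_events = sum(had_event)
    if total_events == 0:
        raise ValueError("no events to evaluate against")
    hits = sum(had_event[i] for i in ranked[:k])
    return hits / total_events


def lift(capture, top_fraction):
    """How many times better than a random ranking the capture rate is."""
    return capture / top_fraction


# Figures reported in the article: top 2 percent holds 11 percent of events.
print(f"reported lift: about {lift(0.11, 0.02):.1f}x chance")

# Hypothetical toy data: 100 manholes, manhole 0 scored riskiest.
toy_scores = [100 - i for i in range(100)]
toy_events = [0] * 100
for idx in (1, 10, 40, 70):  # four made-up withheld events
    toy_events[idx] = 1

toy_capture = capture_rate(toy_scores, toy_events, 0.02)
print(f"toy capture rate in top 2 percent: {toy_capture:.2f}")
```

In the toy data, one of the four made-up events sits in the top two manholes, so the capture rate is 0.25. The same two functions would apply unchanged to a real ranking and a real list of withheld events.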

Tweaking the model and adding more data have improved it further, says Rudin, and Con Edison is now using it to help prioritize inspections and repairs on the grid. The team has just completed rankings for manholes in Brooklyn and the Bronx. And Rudin has plans to return to Manhattan’s grid, armed with the most recent inspection and repair data.

“I never really felt like a New Yorker, even though I lived there for several years,” says Rudin. “But contributing to the basic infrastructure of the city really helped somehow.”