By John Oram in California @ Thursday, August 21, 2008 9:48 AM

The problems started on Tuesday with intermittent outages on AT&T's Broadband Advanced Virtual Private Network (AVPN) infrastructure, which carries multicast data. On Wednesday, it escalated into a full-scale system shutdown. The problems were a “Class 7” outage, which means the PhD-level staff at Bell Labs were put onto this one.
A VPN is a private network that uses a public network (usually the Internet) to connect remote sites or users together. Instead of using a dedicated, real-world connection such as a leased line, a VPN uses 'virtual' connections routed through the Internet from the company's private network to the remote site or employee. Many companies are creating their own VPNs to accommodate the needs of remote employees and distant offices. There is VPN software for most platforms, including smartphones, mobile Internet devices (MIDs), notebooks, desktops, and servers.
Multicast is a popular feature used mainly on enterprise IP networks. It allows a source device to transmit a single stream of information, regardless of how many receivers are active for that source. The routers automatically replicate the stream to each interface where multicast receivers can be reached. Multicast therefore significantly reduces the amount of traffic required to distribute information to many interested parties.
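To put rough numbers on that saving, here is a minimal sketch of the arithmetic at the source device. The function names, the 64 kbps stream rate, and the count of 36 receivers are illustrative assumptions, not figures from AT&T or from the affected customer; the model also ignores protocol overhead and the replication work done by the routers.

```python
def source_streams(receivers: int, multicast: bool) -> int:
    """Copies of the stream that the source itself must transmit.

    Hypothetical helper for illustration only.
    """
    if receivers < 0:
        raise ValueError("receivers must be non-negative")
    # With multicast the source sends one stream and the routers replicate it.
    # With unicast the source sends a separate copy to every receiver.
    return 1 if multicast else receivers


def source_bandwidth_kbps(stream_kbps: float, receivers: int, multicast: bool) -> float:
    """Bandwidth the source needs, in kbps, under this simplified model."""
    return stream_kbps * source_streams(receivers, multicast)


if __name__ == "__main__":
    stream_kbps = 64.0  # assumed rate of one stream
    receivers = 36      # assumed number of receivers
    unicast = source_bandwidth_kbps(stream_kbps, receivers, multicast=False)
    multicast = source_bandwidth_kbps(stream_kbps, receivers, multicast=True)
    print(f"unicast:   {unicast:.0f} kbps at the source")
    print(f"multicast: {multicast:.0f} kbps at the source")
```

Under these assumptions the unicast source load grows linearly with the number of receivers, while the multicast load stays at one stream. That is why losing multicast hurts customers who fan out the same data to many endpoints.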
We received a phone call today about the AT&T system outage from a friend who is a private radio system operator. He uses about three dozen circuits to support communications for a fleet of trucks. Though his company pays dearly for a ‘multicast-aware’ network, everything came to a halt when routers were dumbed down to survive and keep most traffic flowing. So much for a premium service. For obvious reasons, they request we not use their company name.
AT&T's AVPN, where the problems began, is a flagship service offering from AT&T that carries MPLS (Multiprotocol Label Switching) traffic with a vast array of features, including multicast.
Something changed in the network on Tuesday, whether an operating system change or something in the configuration, and the multicast traffic threatened the entire infrastructure. To save most traffic, all multicast was stopped.
Countless hours were spent troubleshooting the problem. It was only when AT&T was called that they admitted they had a problem. The issue had such an impact that the president of AT&T was made aware of the matter. Certainly, credits for the outage will amount to a significant sum, along with unmeasured ill will.
As usual, AT&T could not pick up the phone and call their customers, but the customers were not shy about phoning them. By Wednesday, with customer calls arriving like an alpine avalanche, the customer support staff at AT&T started telling customers that the boss was on top of this one.
We wonder if AT&T's publicity people will ever bother to admit they had a problem. We also wonder if AT&T will really refund any of the money spent by our friends and their clients for a service that still isn't working as advertised.