Inbound calls
Incident Report for YourCloudTelco
Postmortem

Background

In Australia, we maintain two transit providers, Vocus and AAPT. To cover the remainder, we use Megaport's IX peering exchange as a kind of catch-all that gets us closer to customers who do not connect through Vocus or AAPT.

Yesterday, at around 10:40 am, a significant public Australian corporation within the Megaport IX triggered what was effectively an attack inside the Megaport network: an ARP storm, which sent a tidal wave of ARP requests to every address in our published netmask. The damage was not limited to our services. Our upstream carriers also went offline as a consequence of the storm. Usually, the cause of an ARP storm disappears as quickly as it arrives, while the carnage it creates lasts for some time afterwards. Fortunately, because the Australian and New Zealand network operators mostly know each other, we were able to identify the initial cause quickly.
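To give a rough sense of the mechanism described above, the following is a minimal Python sketch, not the tooling we run. It shows how many addresses a sweep of a prefix would query, and how a simple sliding-window counter could flag an abnormal ARP request rate. The prefix (a reserved documentation range), the class name `ArpRateMonitor` and the thresholds are all assumptions chosen for illustration.

```python
import ipaddress
from collections import deque


def sweep_targets(prefix: str) -> int:
    """Return how many addresses an ARP sweep of `prefix` would query."""
    return ipaddress.ip_network(prefix).num_addresses


class ArpRateMonitor:
    """Hypothetical monitor: flags a storm when more than `limit`
    ARP requests are seen within `window` seconds."""

    def __init__(self, limit: int, window: float = 1.0):
        self.limit = limit
        self.window = window
        self.arrivals = deque()  # timestamps of recent ARP requests

    def observe(self, timestamp: float) -> bool:
        """Record one ARP request; return True if the rate exceeds the limit."""
        self.arrivals.append(timestamp)
        # Drop arrivals that have aged out of the sliding window.
        while self.arrivals and timestamp - self.arrivals[0] > self.window:
            self.arrivals.popleft()
        return len(self.arrivals) > self.limit


if __name__ == "__main__":
    # 203.0.113.0/24 is a documentation prefix, not our real address space.
    print(sweep_targets("203.0.113.0/24"))
    monitor = ArpRateMonitor(limit=3, window=1.0)
    for t in (0.0, 0.1, 0.2, 0.3):
        print(t, monitor.observe(t))
```

In practice this kind of limit is normally enforced on the router or switch itself, for example through control-plane policing. The sketch only illustrates the counting logic.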

Steps to mitigate

I have never before seen an ARP storm of this kind first hand, perhaps because of the unique nature of peering services like Megaport. While the probability of recurrence is low, we have taken steps to minimise the impact on our core internet routers should this happen again.

More worrying for me personally is the growing trend to obscure or withhold public acknowledgement of network or operational issues. The internet is not perfect. Road works accidentally sever network cables; power outages occur; people make errors, myself included. My plea to my colleagues and to CTOs is this: please acknowledge fault, even if you don't immediately know the cause, and then publish. In my experience, customers may occasionally complain, but they are mostly sympathetic during those crucial moments while the cause is being identified. This trend of delay and denial is maddening.

Posted Sep 23, 2020 - 17:06 AEST

Resolved
This incident has been resolved.
Posted Sep 22, 2020 - 17:38 AEST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Sep 22, 2020 - 12:22 AEST
Investigating
We are back to investigating the issue and will keep you posted once we know more.
Posted Sep 22, 2020 - 11:51 AEST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Sep 22, 2020 - 11:29 AEST
Investigating
We are currently investigating issues with inbound calling.
Posted Sep 22, 2020 - 11:07 AEST
This incident affected: YourCloudTelco Calling Platform (Network).