Ntirety - Notice history

All systems operational

Notice history

Mar 2025

Newark, DE - Network Event
  • Resolved

    We have completed the maintenance that restores management access for our operations teams and allows our provisioning engines to function within the portal again. We observed no network anomalies during the maintenance. The teams will continue to review the incident and will provide a root cause analysis of the event, which will be shared with all customers via support tickets once available.

  • Update

    As part of our recovery from the Newark (NWK01) networking event, we are continuing our maintenance to restore management access to all switches. This effort will begin at 6pm Eastern Time. We will perform the work in waves and will monitor the network before continuing. We will update this status once the maintenance is complete.

  • Update

    We are pausing any additional switch changes intended to restore management access so that we can review and monitor for other causes. During this pause, our support teams' ability to make firewall changes may be impacted until we submit a change request, which will occur after 6pm Eastern Time today (March 6th, 2025).

  • Monitoring

    We have implemented a fix for the downstream devices that were causing a broadcast storm. However, during initial troubleshooting we removed our management network from the NWK01 zone. To restore full management capabilities, we are adding the management network back to the switches and monitoring traffic stability.

  • Update

    After removing the uplinks associated with the downstream devices and removing a problematic virtual chassis, we have seen significant improvement in network stability. We continue to work with customers on individual issues resulting from the network event.

  • Update

    We have traced a broadcast storm to a single virtual chassis. We are currently removing uplinks from the downstream devices we have identified and will monitor for stability.

  • Update

    We resolved the issue seen on the distribution router. Unfortunately, stability has not returned to the rest of the customer environment. We are continuing our investigation into the network loss and instability.

  • Update

    We have identified additional errors within the distribution router and are currently testing a resolution. We will provide an update with those findings shortly.

  • Update

    We currently see stability for a subset of customers; however, we still see saturation within the network and are working to identify the cause.

  • Update

    We have removed problematic VLANs from routing, which has restored customer connectivity/stability.

  • Update

    We are performing an intrusive test against the aggregate switches associated with NWK01. This test may result in further instability in NWK01 before traffic is restored.

  • Update

    We have further isolated the issue to our aggregate switches and are in contact with our hardware support vendors to continue diagnosing and resolving the errors. We will have another update in 20 minutes. The changes we make on the current switches do not impact the secondary environment in Newark, which remains stable.

  • Identified

    We are continuing to investigate a network looping event that is causing sporadic network connectivity in a subsection of our Newark environment. We will have an in-depth update in 30 minutes.

  • Update

    The scope of the issue has been narrowed down to a single switching domain in the Newark DC.

  • Investigating

    We are currently investigating a problem with network connectivity.


Jan 2025

Newark, DE - Network Event
  • Resolved

    Ntirety has finalized its root cause analysis and will send it to all customers via ticket and email within the hour. If you do not receive the document, please contact support via phone (866.918.4678), chat, or self-service ticket, and a technician will provide the RCA to you.

  • Update

    We determined that a Layer 2 broadcast storm within the storage area network caused the service degradation event starting yesterday at approximately 3:00 pm ET. Network switches experienced repeated failures during the incident due to excessive packets per second. Once the affected network segment was isolated, we restored stability to the customer environments. We are conducting a root cause analysis and will provide our formalized review by the end of the business day on February 3rd.

  • Monitoring

    The platform has shown stability as of 18:30 ET. We continue to monitor the platform for any signs of instability and are working with customers individually on any known issues with their virtual machines. Our root cause review is still in progress, and we will share the RCA with all customers when it is available. Once the root cause is identified, we will move forward with a maintenance to bring back redundancy.

  • Identified

    All operational teams continue to restore virtual machines that lost connectivity to their guest OS. In parallel, we continue to investigate the root cause of the identified switch issue. We will not move forward with another emergency change until the cause is identified; until then, connectivity will be single-homed.

  • Monitoring

    We have initiated a swap of the identified switch member and are currently monitoring the environment for issues and any further stability concerns. A known subset of virtual machines needs manual intervention to regain access to their OS disk.

  • Identified

    We moved forward with a reboot of a host and, in addition, a member switch. The reboot allowed us to correlate the issue to a member switch. We are currently replacing the switch and will provide an update once it has been replaced and failed over.

  • Update

    We have isolated an issue to a specific host in an unresponsive state. We are initiating an emergency change to address the host and will update the status.

  • Update

    We continue to investigate the issue and isolate potential causes. The next update will occur at 15:40 ET.

  • Investigating

    We are currently investigating a problem with network connectivity.
