(RESOLVED) Network connectivity problems

21:05 – We have stemmed the flow of the attack on our network; however, we will continue to monitor this over the coming hours.

20:51 – We are currently experiencing network connectivity problems. Initial investigations show that an attack is being directed at our network.

We appreciate your patience whilst we work on resolving this issue.

(RESOLVED) Valhalla reboot

Planned Outage 21:00hrs

Due to essential maintenance, we are planning to reboot valhalla.krystal.co.uk this evening at around 21:00hrs UTC. We expect the server to be down for around 5 minutes. We apologise for this outage, and expect the disruption to be minimal.

[Resolved] Zeus – Degraded Storage

[18th November 2014 – 19:15] Zeus is currently running on a degraded RAID array: one of the hard drives is reporting as “Failed”. However, this is the 3rd such failure in the last 12 months, and every drive we have taken out has been perfectly fine, so we are taking backups of the server and its data in case there are any problems. In the next few days we are going to take the machine offline for a few hours, do a full firmware update on the server, and hopefully bring it back up with a new drive.

We are expecting this to happen on Wednesday or Thursday night, but we will update this status closer to the time.

[28th November 2014 – 15:25] Zeus is scheduled for emergency maintenance tonight. We will attempt the firmware upgrades from around 10pm UK time, and expect around 2 hours of downtime if everything goes to plan.

[28th November 2014 – 21:39] Preparing the server for upgrades now.

[28th November 2014 – 22:59] Zeus is back online. All hardware firmware and software have been upgraded to the latest versions, and the storage array is in a valid state. We will continue monitoring the system over the coming days.

(Resolved) EMAIL CONNECTION ISSUES OVER SSL

We have been seeing connection issues for email clients connecting over the secure SSL ports. Our edge firewall was recently updated with a patch to harden our network against ‘POODLE’ exploits, but the patch has also been intercepting some legitimate traffic from email clients. We have been working on the problem and things should be returning to normal.

We are sorry for any inconvenience this has caused.

Connectivity Issues – Virgin Media

We’re currently seeing reports of users on Virgin Media having connectivity problems with some of our services. It seems the problem is with Virgin and their peering at LINX. We have contacted our upstream providers to alert them to the problem.

[Resolved] Athena Mail Blacklist

[Wed 08:00] We are currently investigating a problem with Athena being blacklisted on a mail block list.

[Wed 10:45] We’ve removed Athena from the blacklist and taken steps to combat any future problems.

Please be advised that it may take a few hours for the block to be removed across the internet.

We will update this post with more information as we find it.

[Complete] Poseidon Maintenance – 17th September 2014

We are planning around 10 minutes of downtime for Poseidon tonight between 10pm and 11pm due to a physical server move.

This work is now complete.

[Resolved] Olympus Storage Degraded

Around 11am this morning we were alerted to a broken HDD in Olympus on the /home partition.

This is a RAID-10 array. We had engineers replace the HDD with a spare we had on site.

Unfortunately the on-site spare isn’t functioning correctly, and we’ve ordered a new drive from Dell.

This will not arrive until Monday the 4th of August. Until then, the drive array remains degraded, with a full failure possible if the other drive in the same mirrored pair as the failed drive also dies.
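To illustrate the risk while the array is degraded, here is a minimal Python sketch. It assumes the standard RAID-10 layout (data striped across mirrored pairs) and a hypothetical 4-drive array; the drive numbers and pairings are illustrative only and are not the actual layout of Olympus. Under that assumption, the array only fails completely when both drives of one mirrored pair are lost.

```python
def raid10_survives(failed, mirror_pairs):
    """Return True if every mirrored pair still has at least one working drive."""
    failed = set(failed)
    return all(not set(pair) <= failed for pair in mirror_pairs)


def fatal_second_failures(first_failed, mirror_pairs):
    """List the drives whose loss, after first_failed, would take the array down."""
    drives = [d for pair in mirror_pairs for d in pair]
    return [
        d
        for d in drives
        if d != first_failed
        and not raid10_survives({first_failed, d}, mirror_pairs)
    ]


# Hypothetical layout: drives 0 and 1 mirror each other, as do drives 2 and 3.
MIRROR_PAIRS = [(0, 1), (2, 3)]

# With drive 0 already failed, only the loss of its mirror partner is fatal.
print(fatal_second_failures(0, MIRROR_PAIRS))
```

In this sketch, losing a drive from a different mirrored pair leaves the array degraded but still online, which is why the exposure is limited to one specific drive until the replacement is fitted and rebuilt.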

Monday August 4th – 12:15 – New HDD is installed and the RAID is currently rebuilding.
Tuesday August 5th – 09:45 – Storage array is now functioning correctly.

StriKe user interface failure

10:00 A problem has been reported with the web interface to the StriKe email filtering system. We have not been able to fix the problem immediately, so the issue has been escalated to the developers. The routing and delivery of inbound messages has NOT been affected. However, we do apologise that users are unable to access the StriKe control panel, and would like to assure you that we are working on obtaining a fix as soon as possible.

14:30 The issue has been fixed, and users are now able to log into the StriKe filtering service again. We apologise for the inconvenience this caused. The problem was due to a timezone conflict in an update schedule, which was preventing essential updates from taking place. During the time that the StriKe user interface was unavailable, mail delivery was unaffected.

Network Latency

21:36 – We are currently investigating what appears to be latency on our network. At the moment it appears that only some shared servers are affected; we will provide more information as soon as our network engineers provide us with an update.

21:42 – Problem identified with a network switch; a reboot of the device is scheduled for 21:46.

21:48 – Switch restarted and networking fabric restored. All access issues should now be resolved; however, should you continue to have access problems, please raise a support request at https://support.krystal.co.uk
