[Resolved] Email Connection Issues over SSL

We have been seeing connection issues for email clients connecting over the secure SSL ports. Our edge firewall was recently updated with a patch to harden our network against POODLE exploits, but the patch has also blocked some legitimate connections from email clients. We have been working on the problem and things should be returning to normal.

We are sorry for any inconvenience this has caused.
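If you would like to check that a mail client library negotiates a modern protocol rather than SSL 3.0 (the protocol POODLE targets), the following is a minimal Python sketch. The TLS 1.2 floor is our assumption for illustration, not a statement of what the firewall enforces, and `mail.example.com` is a placeholder hostname.

```python
import imaplib
import ssl


def build_tls_context() -> ssl.SSLContext:
    """Build a client context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname checks.
    ctx = ssl.create_default_context()
    # Assumption: TLS 1.2 as the minimum. SSL 3.0 is far below this floor.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def check_imap_tls(host: str, port: int = 993) -> str:
    """Connect to an IMAP server over TLS and return the negotiated version."""
    with imaplib.IMAP4_SSL(host, port, ssl_context=build_tls_context()) as conn:
        # conn.sock is the wrapped SSL socket; version() reports the protocol.
        return conn.sock.version()


if __name__ == "__main__":
    # Placeholder host; replace with your own mail server name.
    print(check_imap_tls("mail.example.com"))
```

If the connection fails with a handshake error, the client or an intermediate device may be restricted to an older protocol.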

Connectivity Issues – Virgin Media

We’re currently seeing reports of users on Virgin Media having connectivity problems with some of our services. The problem appears to be with Virgin Media's peering at LINX. We have contacted our upstream providers to alert them to the problem.

[Resolved] Athena Mail Blacklist

[Wed 08:00] We are currently investigating a problem with Athena being blacklisted on a mail block list.

[Wed 10:45] We’ve removed Athena from the blacklist and taken steps to combat any future problems.

Please be advised that it may take a few hours for the block to be removed across the internet.

We will update this post with more information as we find it.

[Complete] Poseidon Maintenance – 17th September 2014

We are planning around 10 minutes of downtime for Poseidon tonight between 10pm and 11pm due to a physical server move.

This work is now complete.

[Resolved] Olympus Storage Degraded

Around 11am this morning we were alerted to a failed HDD in Olympus, in the array that holds the /home partition.

This is a RAID-10 array. We had engineers replace the HDD with a spare we had on site.

Unfortunately the on-site spare isn’t functioning correctly, and we’ve ordered a new drive from Dell.

This will not arrive until Monday the 4th of August. Until then the drive array is degraded, and it could fail completely if the mirror partner of the failed drive also dies.
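To illustrate the risk, here is a minimal Python sketch of how a RAID-10 array tolerates failures, assuming the usual arrangement of mirrored pairs with data striped across them. The four-drive layout and the drive names are hypothetical; the actual layout of the Olympus array is not described here.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical layout: two mirrored pairs, with data striped across the pairs.
MIRROR_PAIRS: List[Tuple[str, str]] = [("disk0", "disk1"), ("disk2", "disk3")]


def array_state(failed: Iterable[str],
                pairs: List[Tuple[str, str]] = MIRROR_PAIRS) -> str:
    """Return 'optimal', 'degraded' or 'failed' for a set of failed drives."""
    failed_set = set(failed)
    # Losing both halves of any one mirror loses part of every stripe.
    if any(a in failed_set and b in failed_set for a, b in pairs):
        return "failed"
    if any(a in failed_set or b in failed_set for a, b in pairs):
        return "degraded"
    return "optimal"


def critical_drives(failed: Iterable[str],
                    pairs: List[Tuple[str, str]] = MIRROR_PAIRS) -> Set[str]:
    """Return the surviving drives whose loss would now fail the array."""
    failed_set = set(failed)
    critical: Set[str] = set()
    for a, b in pairs:
        if a in failed_set and b not in failed_set:
            critical.add(b)
        elif b in failed_set and a not in failed_set:
            critical.add(a)
    return critical


if __name__ == "__main__":
    print(array_state(["disk0"]))      # one drive lost
    print(critical_drives(["disk0"]))  # the drive we now depend on
```

In this model, a second failure in a different mirrored pair leaves the array degraded but still running, while a second failure in the same pair takes it down.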

 

Monday August 4th – 12:15 – New HDD is installed and the RAID array is currently rebuilding.
Tuesday August 5th – 09:45 – Storage array is now fully functional.

StriKe user interface failure

10:00 A problem has been reported with the web interface to the StriKe email filtering system. We have not been able to fix the problem immediately, so the issue has been escalated to the developers. The routing and delivery of inbound messages has NOT been affected. However, we do apologise for users’ inability to access the StriKe control panel, and would like to assure you that we are working on obtaining a fix as soon as possible.

14:30 The issue has been fixed, and users are now able to log into the StriKe filtering service again. We apologise for the inconvenience this caused. The problem was due to a timezone conflict in an update schedule, which was preventing essential updates from taking place. During the time that the StriKe user interface was unavailable, mail delivery was unaffected.

Network Latency

21:36 – We are currently investigating what appears to be latency on our network. At the moment it appears that only some shared servers are affected. We will provide more information as soon as our network engineers give us an update.

21:42 – Problem identified with network switch, scheduled reboot of device at 21:46.

21:48 – Switch restarted and networking fabric restored. All access issues should now be resolved. Should you continue to have access problems, please raise a support request at https://support.krystal.co.uk

Scheduled Poseidon Move – 18/07/2014

Today is the scheduled migration of Poseidon to a new server.

We have already provisioned the new server and prepared it to become the new Poseidon.

10:00 – Data transfer has started between the servers to prepare for account transfer.
12:00 – Data transfer has progressed much more quickly than anticipated, so we may move the servers earlier than planned
13:00 – We’ve moved over to the new server now, much earlier than anticipated – we still need to reboot the server and move it physically
14:39 – The server move is complete – please let us know if you have any problems

Network Maintenance – Wednesday 16th July 06:00 (Completed)

Dear All,

The network upgrade that was postponed from Friday has been rescheduled for this coming Wednesday morning, starting at 06:00.

We have made use of the extra time to further improve our preparedness and have taken on board feedback about the timing of maintenance windows. We hope that an early mid-week window will affect fewer people in the event of disruption.

We will update this blog post as the work takes place.

Thank you for your patience during this essential network maintenance.

 

06:00 We are starting this work now.

06:30 This work has been completed successfully. Thank you for your patience.

Network Maintenance 11th July 18:00 (Postponed)

Dear All,

We will be undertaking critical core network maintenance tomorrow evening commencing after 18:00.

This has the potential to affect all servers and services. While we hope that there will be no noticeable impact, there could be a loss of connectivity of up to 2 minutes.

We thank you for your patience and understanding during this maintenance window.

20:55 We have started this work

21:36 We’re experiencing intermittent issues that we believe are caused by a software bug in the switch stack. We’re rebooting it now.

21:50 Network configuration has been reverted to the previous setup for the moment. All services should now be fully restored.

-----

23:08 – There appears to be some network latency/packet loss, which is unrelated to the earlier network operations. We are currently investigating the cause.

23:27 – The network has returned to optimal performance. We apologise again for any inconvenience that this may have caused.
