Real world: IR Story: No Turkey for you!!

Horror stories always begin with “it was a dark and stormy night,” with scary imagery meant to spark the imagination. Unfortunately for the client calling us on the day before Thanksgiving, this is not a tale written by Bram Stoker or the Brothers Grimm. It was November 27th, 2019, and we were all getting ready to go home for the long weekend. Our service team received a call from a company that was not an existing client, and the caller sounded quite desperate. They were dealing with a malware infection, and as “luck” would have it, their systems admin was on vacation out of state for the next two weeks. Launching an attack on or right before a holiday weekend is a tactic that hacking groups have used for years, and it is frequently successful. IT staff, just like other employees, like to enjoy their time off and spend the holidays with their families. The company’s workstations and servers had been infected by Trickbot, a nasty piece of malware that is usually followed by some form of crypto (ransomware) attack.

Our team of senior engineers, myself included, was dispatched to get an idea of the situation. After a briefing from the remaining onsite IT staff, we began to review what logs existed. As is all too often the case, security had not been one of the highest priorities for this company. We found their network was largely flat, meaning they did not employ VLANs or firewalls separating parts of the network. The firewall they did have was outdated and did not have security services running. The workstations were mostly Windows 7, with many servers running older versions of Windows as well. Because of this, Trickbot propagated through their network very quickly. Our investigation found log files generated by the attackers which contained port scan results and IP addresses to attack. It was becoming more and more obvious that this wasn’t a run-of-the-mill malware infection, and we were fairly positive there were active bad actors in the network.

As we were investigating, the crypto attack began. Servers started failing and we started noticing the traditional ransomware .txt files with instructions appearing in directories. If you’ve ever had that sinking feeling while driving way too fast on the highway, then looking in your mirror to see the state trooper right behind you, tasting metal on your tongue while feeling the adrenaline start to pump and hoping you’re only getting a speeding ticket, you might imagine the feeling of the responding team and IT personnel that night. We found the crypto variant was called Ryuk, named after the death god (shinigami) character from the anime “Death Note.”

The decision was made to disconnect all internet connections and shut down all servers and workstations to stop the spread. Since we didn’t know the state of the customer’s backups or disaster recovery plan, we hoped that by shutting everything down immediately, we could limit the damage to the infrastructure. While our plan did succeed in limiting how much of their infrastructure was encrypted, the damage had been done, and there was no way to rule out reinfection after cleaning the systems we knew had been compromised. The decision was made to “burn it to the ground,” meaning that we were going to rebuild every server, workstation, HVAC controller, VoIP system and anything else connected to the network from the ground up, and in the process rebuild it with proper security design in mind.

We started by inventorying the functions of all their servers and separating them into VLANs. This included things like domain controllers, mail servers, database servers and application servers. By separating servers by function, and by employing firewall rules that only allow the traffic between VLANs that is needed for those functions, we could limit which services might be attacked. We did the same thing with workstations, only grouped by department. If a single workstation is infected, this helps limit the exposure to that workstation’s department and the servers that department has access to.
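To make the idea concrete, here is a minimal sketch of the "default deny, allow only what each function needs" policy described above, modeled in Python. It is not the customer's actual rule set: the VLAN names, the ports and the helper names (`Flow`, `is_allowed`, `reachable_from`) are all assumptions made up for illustration, and a real domain environment would need more flows (DNS, for example) than the handful listed here.

```python
from dataclasses import dataclass

# Hypothetical VLAN names, invented for this illustration.
VLANS = {
    "domain_controllers", "mail", "database", "application",
    "ws_accounting", "ws_engineering",
}


@dataclass(frozen=True)
class Flow:
    """One permitted inter-VLAN flow: source VLAN to destination VLAN on a port."""
    src: str
    dst: str
    port: int
    proto: str = "tcp"


# Assumed allow-list. Anything not listed here is denied.
ALLOWED_FLOWS = {
    Flow("ws_accounting", "domain_controllers", 88),    # Kerberos
    Flow("ws_accounting", "domain_controllers", 389),   # LDAP
    Flow("ws_accounting", "application", 443),          # HTTPS
    Flow("ws_engineering", "domain_controllers", 88),
    Flow("ws_engineering", "domain_controllers", 389),
    Flow("ws_engineering", "application", 443),
    Flow("application", "database", 1433),              # SQL Server
    Flow("mail", "domain_controllers", 389),
}


def is_allowed(src: str, dst: str, port: int, proto: str = "tcp") -> bool:
    """Default deny: an inter-VLAN flow passes only if it is explicitly listed.

    This models traffic crossing the firewall between VLANs only; traffic
    inside a single VLAN never reaches these rules.
    """
    for vlan in (src, dst):
        if vlan not in VLANS:
            raise ValueError(f"unknown VLAN: {vlan}")
    return Flow(src, dst, port, proto) in ALLOWED_FLOWS


def reachable_from(src: str) -> set[tuple[str, int]]:
    """List the (VLAN, port) pairs a compromised host in `src` could still reach."""
    return {(f.dst, f.port) for f in ALLOWED_FLOWS if f.src == src}


if __name__ == "__main__":
    # A workstation in accounting can reach the application servers over HTTPS...
    print(is_allowed("ws_accounting", "application", 443))
    # ...but not the database servers directly, and not another department over SMB.
    print(is_allowed("ws_accounting", "database", 1433))
    print(is_allowed("ws_accounting", "ws_engineering", 445))
    print(sorted(reachable_from("ws_accounting")))
```

The point of `reachable_from` is the same one made above: on a flat network an infected workstation can talk to everything, whereas with an allow-list the exposure shrinks to the short set of services its department actually needs.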

As I worked on the network design and built out the VLAN plan, I had another part of our team start building a reference workstation to capture an image from. This customer had over 300 workstations across their four buildings, and we weren’t about to have staff going from one computer to another installing Windows 10. We ordered a volume license from Microsoft that allows for imaging. We segmented off a part of the network on which to start building new servers and built a Windows Deployment Services (WDS) server. WDS is Microsoft’s imaging server role that allows you to install an operating system on a completely formatted computer over the network using PXE (Preboot Execution Environment). Once we had the base images built and driver packages ready, our team started on the 300 workstations. This would take us a good long time.

As their network switches and firewalls were all legacy equipment and the customer had been planning to replace them, they elected to purchase new equipment as part of this engagement. Since it was the holiday weekend, shipping services were very busy and lead times from our usual vendors were up to a week, so we called on a refurb vendor we’d had a very good experience with in the past. Because they were in the same state as us, we were able to arrange same-day delivery via a courier service.

The rest of the story is much the same: rebuilding servers, reinstalling software, configuring switches and firewalls. While this went on for about three months, we were able to restore many of the critical functions in the first week and continued to bring services back online during the rebuild. This allowed the customer to keep their employees working. One of our highest priorities was to get the customer back in business and allow them to continue serving their own customers. Most of us were running on about three hours of sleep each night and staying in local hotels so we could nap, shower and come back to work. After this project, the customer in question became a managed services customer that we still support to this day.

If you’re reading this, please take it as a cautionary tale of what may happen if security is not taken seriously, updates are left until it is convenient, and backups are only run every now and then. The expense of downtime, hardware purchases, and hiring staff or outsourced talent trained and experienced in proper network and system security is greatly outweighed by the expense of remediation when a breach occurs. End user training is equally important, if not more so, as so many breaches begin because someone opens an Excel spreadsheet containing a macro or clicks an HTML link that infects their computer with malware, a keylogger or a reverse shell.

Jeremy Jackson – Network Engineer and Service Delivery Team Lead, Artemis IT