Often during my discussions with other analysts or people interested in the field, I'm asked about web servers and how to investigate incidents involving them. Yes, I know, I know: I have casual conversations about web server compromises. What can I say, I enjoy DFIR! However, during these discussions, it becomes clear that a certain mystery seems to shroud web servers. How do they become compromised? When they're compromised, what artifacts and logs exist? The good news is that if you're somewhat familiar with investigating compromised systems or servers, analyzing a web server is not much different. These servers still run on an underlying OS, such as Windows or Linux, so many of the same artifacts exist. On top of those, we'll talk about a really useful source known as "web server logs". The term is purposely generic because, depending on the web server software, the logs may be named something different. So the next time you're alerted to suspicious activity from a web server, take a look at this blog to identify some artifacts and logs you can use to understand the incident and, hopefully, determine root cause!
Investigating a compromised web server is similar to analyzing a compromised system or server
Collect a triage image of the server. Artifacts such as Windows Event Logs, the file system, the registry, etc. will still exist
For Linux servers, many artifacts within the file system and /var/log will still be present and ready for analysis!
Web applications run on web server software, which handles and interprets web requests. Common web server software includes IIS, Apache, and NGINX
Web applications provide functionality of the site. For example, databases, features, logins, etc.
Web servers will often contain Content Management Systems (CMS), which help simplify web application and website features and management
CMSs can use plugins, extensions, and a large number of other features
Some examples are WordPress, Drupal, Wix, Magento, Joomla, and Kentico, though there are many others
Web servers are often exploited through vulnerable, misconfigured, or outdated plugins and extensions. This is most commonly observed with outdated CMS plugins
OWASP Top 10 is a great resource to understand common vulnerabilities leveraged by Threat Actors (TA) regarding web applications
Web servers, such as IIS and Apache generate web server logs. These contain observed HTTP requests to the server. THESE ARE KEY FOR INVESTIGATING ROOT CAUSE AND INTEL
Webshells are often observed in compromised web servers. These are essentially files that can be interpreted/compiled by web server software and can provide remote access to a TA
Web server logs often do not capture the full detail of observed traffic. They typically contain summarized information about each request, such as the time, source/destination IP address, URL/URI, user agent, etc. The payload and encapsulated data are often not included; a PCAP or enhanced logging is typically required for that
Let's jump right in! As mentioned previously, the web server logs are going to be key to the investigation. Now, when it comes to DFIR, we're typically brought in once an incident occurs, right? This is more of a reactive role, so there has to be some sort of "event" that requires further investigation. That event is also very important for analyzing web servers. Web server logs, such as IIS or Apache logs, can be extremely noisy, and it is very difficult to jump right into them and start identifying malicious activity. As analysts, we need to work with our knowns. What do we know about the incident? Did an alert trigger from your endpoint or network tools? Did it flag a specific IP address or filename? With the alert, we have a timeframe, right? Let's use this information to pivot our investigation! If you have a timeframe, head to the web server logs and see what happened at or around that time. If you have a suspect IP address, plug that IP into your logs and see what comes up! From a DFIR perspective, this is known as "pivoting", and when an investigation first begins, it's important to stay focused and work with the known facts regarding the incident. Focus on the confirmed malicious and/or suspicious activity.
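To make the idea of pivoting a little more concrete, here is a minimal Python sketch of the two pivots described above: one on a known timeframe and one on a suspect IP. The log lines, IP addresses, and the 21:00 alert time are all made up for illustration, and the 15-minute window is an arbitrary assumption you would tune to your own case.

```python
from datetime import datetime, timedelta

# Hypothetical access-log lines; every IP, path, and timestamp is invented.
SAMPLE_LINES = [
    '198.51.100.7 - - [11/Oct/2023:20:15:02 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.50 - - [11/Oct/2023:20:58:41 +0000] "POST /contact.php HTTP/1.1" 200 312 "-" "curl/8.1.2"',
    '203.0.113.50 - - [11/Oct/2023:21:01:10 +0000] "GET /admin/ HTTP/1.1" 403 199 "-" "curl/8.1.2"',
    '198.51.100.7 - - [12/Oct/2023:09:30:00 +0000] "GET /about.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

# Timestamp layout used between the square brackets.
# Note: %b expects English month abbreviations (the default C/English locale).
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def line_timestamp(line):
    """Return the bracketed timestamp of a log line as an aware datetime."""
    raw = line[line.index("[") + 1 : line.index("]")]
    return datetime.strptime(raw, TIME_FORMAT)


def pivot_on_time(lines, known_time, minutes=15):
    """Keep only lines logged within +/- `minutes` of a known event time."""
    window = timedelta(minutes=minutes)
    return [line for line in lines if abs(line_timestamp(line) - known_time) <= window]


def pivot_on_ip(lines, ip):
    """Keep only lines whose first field (the client IP) matches `ip`."""
    return [line for line in lines if line.startswith(ip + " ")]


# Assumed known facts: an alert fired at 21:00 UTC and flagged 203.0.113.50.
alert_time = datetime.strptime("11/Oct/2023:21:00:00 +0000", TIME_FORMAT)
near_alert = pivot_on_time(SAMPLE_LINES, alert_time)
from_suspect = pivot_on_ip(SAMPLE_LINES, "203.0.113.50")

for line in near_alert:
    print(line)
```

Starting with a narrow window and widening it only when needed keeps the noise manageable, which is the whole point of working from your knowns.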
In this blog post, we'll discuss how to investigate a compromised Apache web server, though IIS and Apache are very similar in how they are investigated. There are many ways to investigate web server logs, such as SOF-ELK, Splunk, etc. In this example, since the logs are usually in a regular ASCII text format, a simple "grep" will do the trick! You can even use "grep" to narrow your search down and export the results to a text file. Afterwards, you can import the file into something such as Excel to better filter your results! All in all, just know that there are many ways to analyze these logs; it's the mindset that is the most important part!
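If you prefer scripting over a grep-then-Excel workflow, the same "narrow down, then export" idea can be sketched in Python. This is only an illustration under a few assumptions: the log lines are invented, the regex assumes a common/combined-style Apache layout, and `export_matches` is a hypothetical helper name, not part of any tool.

```python
import csv
import io
import re

# Assumes the leading fields of Apache's common/combined log layout.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)
COLUMNS = ("ip", "time", "method", "path", "status", "size")

# Hypothetical log lines for illustration only.
SAMPLE_LINES = [
    '198.51.100.7 - - [11/Oct/2023:20:15:02 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.50 - - [11/Oct/2023:20:58:41 +0000] "POST /contact.php HTTP/1.1" 200 312 "-" "curl/8.1.2"',
]


def export_matches(lines, keyword, out):
    """Write every line containing `keyword` to `out` as CSV rows.

    Returns the number of data rows written (the header is not counted).
    """
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    written = 0
    for line in lines:
        if keyword not in line:  # plain substring filter, like a basic grep
            continue
        match = LOG_PATTERN.match(line)
        if match:
            writer.writerow(match.group(*COLUMNS))
            written += 1
    return written


# An in-memory buffer keeps the demo self-contained; in practice you would
# pass a file opened with open("results.csv", "w", newline="").
buffer = io.StringIO()
written = export_matches(SAMPLE_LINES, "POST", buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The resulting CSV opens directly in Excel, with one column per field instead of one long line of text.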
Although Apache is more commonly observed on Linux systems, it is possible to run it on Windows. In this example, we'll take a look at a compromised Apache server. Let's set up a sample scenario!
On October 11, 2023, a customer notifies you that their website has been defaced! Although the root cause is currently unknown, we do have a few things to go on. Let's think of some things we can pivot off of.
This allegedly occurred on October 11
Has anyone reviewed the webserver yet? There are likely files that have been modified or created for the defacement. What are the timestamps of these files?
Were there any alerts triggered on this server? WAF, proxy, IDS/IPS, etc. If so, let's use these timestamps!
There are many other questions we'd obviously want to ask regarding the incident and the compromised server, but for the purposes of this blog, let's use the above for now.
As mentioned earlier, grab those Apache logs! Commonly, these can be found within the '/var/log/apache*' directory (for example, '/var/log/apache2' on Debian/Ubuntu; Red Hat-based systems typically use '/var/log/httpd'). Note, though, that this default location can be changed. Here, you'll likely see a few files such as "access.log" and "error.log", and possibly others depending on the configuration and use of the server.
The key one here is the "access.log", as this will record the web requests to the web server. Again, keep in mind that this will not record the content of the request, only metadata regarding it. A full PCAP will be required to capture the packet content such as the payload.
Now, keep in mind that these logs record ALL of the web requests to your web server. They will be very noisy, and there will likely be multiple "access.log" files depending on your retention settings, available storage, and how long the server has been up.
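Because the requests you care about may sit in an older, rotated file, it helps to search all of the access logs in one pass. Here is a rough Python sketch of that. It assumes a logrotate-style naming scheme (access.log, access.log.1, access.log.2.gz) where older files are gzip-compressed; your server's rotation settings may differ. The demo builds throwaway files in a temp directory so it runs anywhere, and the log lines are made up.

```python
import glob
import gzip
import os
import tempfile


def iter_log_lines(pattern):
    """Yield (path, line) for every line in every file matching `pattern`.

    Files ending in .gz are decompressed on the fly; others are read as text.
    """
    for path in sorted(glob.glob(pattern)):
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
            for line in handle:
                yield path, line.rstrip("\n")


# Demo with hypothetical content; on a real case the pattern would be
# something like "/path/to/collected/logs/access.log*".
with tempfile.TemporaryDirectory() as log_dir:
    current = os.path.join(log_dir, "access.log")
    rotated = os.path.join(log_dir, "access.log.2.gz")

    with open(current, "w", encoding="utf-8") as handle:
        handle.write('203.0.113.50 - - [11/Oct/2023:21:01:10 +0000] "GET / HTTP/1.1" 200 948\n')
    with gzip.open(rotated, "wt", encoding="utf-8") as handle:
        handle.write('203.0.113.50 - - [09/Oct/2023:03:12:55 +0000] "GET /robots.txt HTTP/1.1" 404 196\n')

    results = list(iter_log_lines(os.path.join(log_dir, "access.log*")))

for path, line in results:
    print(os.path.basename(path), "->", line)
```

Keeping the source file name next to each line makes it easier to cite exactly where a piece of evidence came from later in your report.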
So, what do these logs look like? Let's check them out!
Pretty straightforward, right? We can see the IP information, timestamp, HTTP request, targeted resource, HTTP response, user agent, etc. All great information that will be useful in terms of collecting IOCs for your analysis!
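To show how those fields line up in text form, here is a small Python sketch that splits one entry into its parts. The sample line is invented, and the regex assumes Apache's "combined" log format (client IP, identity, user, timestamp, request line, status, size, referer, user agent); a server with a custom LogFormat would need a different pattern.

```python
import re

# Assumes Apache's "combined" log format; custom LogFormat settings differ.
COMBINED = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # The request line looks like: METHOD PATH PROTOCOL
    parts = entry["request"].split()
    entry["method"] = parts[0] if parts else ""
    entry["path"] = parts[1] if len(parts) > 1 else ""
    return entry


# Hypothetical entry for illustration only.
sample = (
    '203.0.113.50 - - [11/Oct/2023:21:01:10 +0000] '
    '"GET /pwnt.html HTTP/1.1" 200 948 "-" "curl/8.1.2"'
)
entry = parse_line(sample)

for field in ("ip", "time", "method", "path", "status", "size", "agent"):
    print(f"{field:>7}: {entry[field]}")
```

One limitation of this simple pattern: it does not handle escaped double quotes inside the request or user-agent fields, so treat unmatched lines as something to review by hand rather than something to discard.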
We have our timeframe of October 11, 2023. The IT team reviewed the server and saw that a suspicious HTML file named "pwnt.html" was uploaded at around 21:00 (9 PM). Great! Let's run a super basic "grep" command to filter these results. To start, let's think about how web requests work. Something was uploaded, so this must be an HTTP 'POST' or 'PUT', right? Although a response could be a redirect (3xx), let's focus on successful responses first, such as a '200' response code.
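Here is roughly what that filter looks like as a Python sketch: POST or PUT requests, a 200 response, on October 11, 2023, between 20:00 and 21:59. The log lines, IPs, and the '/upload.php' path are hypothetical, and the hour range is an assumption based on the "around 21:00" estimate. Remember that the timestamp in the log carries the server's own UTC offset, so line it up with the timezone of the file timestamps you were given.

```python
import re

# POST/PUT + 200 response + 11 Oct 2023 during hour 20 or 21 (log's own offset).
UPLOAD_PATTERN = re.compile(
    r'\[11/Oct/2023:2[01]:\d{2}:\d{2} [^\]]*\] '
    r'"(?:POST|PUT) \S+ [^"]*" 200 '
)

# Hypothetical log lines for illustration only.
SAMPLE_LINES = [
    '203.0.113.50 - - [11/Oct/2023:20:58:41 +0000] "POST /upload.php HTTP/1.1" 200 312 "-" "curl/8.1.2"',
    '203.0.113.50 - - [11/Oct/2023:21:01:10 +0000] "GET /pwnt.html HTTP/1.1" 200 948 "-" "curl/8.1.2"',
    '198.51.100.7 - - [11/Oct/2023:20:40:00 +0000] "POST /login.php HTTP/1.1" 404 196 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2023:21:05:00 +0000] "POST /search.php HTTP/1.1" 200 733 "-" "Mozilla/5.0"',
]


def find_uploads(lines):
    """Return lines that look like successful uploads in the time window."""
    return [line for line in lines if UPLOAD_PATTERN.search(line)]


candidates = find_uploads(SAMPLE_LINES)
for line in candidates:
    print(line)
```

If nothing turns up, loosen one condition at a time (other 2xx or 3xx codes, a wider time range) rather than dropping all of them at once.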
Again, these logs can be very noisy, so take all the information you can to pivot off of. Although there are many ways to compromise a website, let's say this server had a well-known and actively exploited vulnerability that allowed someone to arbitrarily upload a file. Since this is Apache, let's say the site was running PHP. With this, it's pretty common to see 'php' webshells, for example. Curious about what a webshell is or how it works? Well.. that's a blog for another day! In the meantime, check out these resources:
In a nutshell, a webshell is a type of file that can give a Threat Actor a... well.. a shell! It's a way to run arbitrary commands on the server through some sort of interface. Although these shells can be a simple one-liner, some can be very complex and provide a full GUI, allow someone to upload/download files, and more! Thinking back to how a web server works, running commands through a webshell is still a web request. So.. someone would need to request their webshell, which resides on the server somewhere, and send HTTP requests to it in order to use it! So let's 'grep' on response 200, POST requests, and anything matching a '.php' file!
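As a rough Python equivalent of that filter, the sketch below keeps only POST requests to '.php' files that returned a 200, then counts them per source IP and path. The log lines, IPs, and the '/uploads/' directory are hypothetical. A file that receives repeated POSTs from a single IP, especially one sitting in an upload directory, is worth a closer look.

```python
import re
from collections import Counter

# POST to a path ending in .php (optionally with a query string), status 200.
PHP_POST_200 = re.compile(
    r'^(?P<ip>\S+) .*?"POST (?P<path>\S*\.php)(?:\?\S*)? [^"]*" 200 '
)

# Hypothetical log lines for illustration only.
SAMPLE_LINES = [
    '203.0.113.50 - - [11/Oct/2023:21:02:30 +0000] "GET /uploads/shell.php HTTP/1.1" 200 1302 "-" "Mozilla/5.0"',
    '203.0.113.50 - - [11/Oct/2023:21:02:41 +0000] "POST /uploads/shell.php HTTP/1.1" 200 87 "-" "Mozilla/5.0"',
    '203.0.113.50 - - [11/Oct/2023:21:03:05 +0000] "POST /uploads/shell.php HTTP/1.1" 200 412 "-" "Mozilla/5.0"',
    '203.0.113.50 - - [11/Oct/2023:21:04:19 +0000] "POST /uploads/shell.php HTTP/1.1" 200 95 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [11/Oct/2023:21:05:00 +0000] "POST /login.php HTTP/1.1" 302 0 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [11/Oct/2023:21:05:02 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]


def count_php_posts(lines):
    """Count successful POSTs to .php files, keyed by (client IP, path)."""
    hits = Counter()
    for line in lines:
        match = PHP_POST_200.match(line)
        if match:
            hits[(match.group("ip"), match.group("path"))] += 1
    return hits


hits = count_php_posts(SAMPLE_LINES)
for (ip, path), total in hits.most_common():
    print(f"{total:>4}  {ip}  {path}")
```

On a busy site, legitimate pages such as login or search forms will also show up here, so the counts are a starting point for pivoting, not a verdict.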
Boom! Check that out! The Apache logs recorded each request that the threat actor made to run commands through their webshell called "shell.php". Don't believe me? Let's try it out! I'm going to use an open-source webshell and run a simple "whoami" command through it.
Okay.. I ran the command. Based on our previous notes, loading the webshell should generate an HTTP 'GET' request, because we first need to request the webshell's interface, right? That should return a 200 response. Then, once we execute the command, we should see an HTTP 'POST', correct? Let's see!
Success! We can see that the 'POST' request was generated when we ran the 'whoami' command through the webshell named 'shell.php'. Now, if this were a public-facing web server, you should be able to see the source IP. If not, determine whether something sits in front of your web application, such as a WAF, reverse proxy, load balancer, etc., and make sure logging of the 'X-Forwarded-For' header is enabled so you can see the true source IP! Once you identify the source, you can pivot your grep search on it. Start grepping for that IP address or user agent! See what else the Threat Actor did!
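A small Python sketch of that last pivot is below: build a time-ordered list of everything the suspect IP requested, then look for other IPs using the same user agent, which could be the same actor on different infrastructure. All log lines, IPs, paths other than 'shell.php', and user agents are hypothetical, and the regex assumes Apache's combined log format.

```python
import re
from datetime import datetime

# Assumes Apache's "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical log lines, deliberately out of order.
SAMPLE_LINES = [
    '203.0.113.50 - - [11/Oct/2023:21:02:41 +0000] "POST /uploads/shell.php HTTP/1.1" 200 87 "-" "python-requests/2.31.0"',
    '203.0.113.50 - - [11/Oct/2023:20:55:12 +0000] "GET /login.php HTTP/1.1" 200 4210 "-" "python-requests/2.31.0"',
    '198.51.100.7 - - [11/Oct/2023:20:57:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '192.0.2.99 - - [11/Oct/2023:21:10:45 +0000] "GET /uploads/shell.php HTTP/1.1" 200 1302 "-" "python-requests/2.31.0"',
]


def parse(lines):
    """Parse matching lines into dicts and attach a datetime under 'when'."""
    entries = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            entry = match.groupdict()
            # %b expects English month abbreviations (default C/English locale).
            entry["when"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
            entries.append(entry)
    return entries


def timeline_for_ip(entries, ip):
    """All requests from `ip`, oldest first."""
    return sorted((e for e in entries if e["ip"] == ip), key=lambda e: e["when"])


def other_ips_with_agent(entries, ip):
    """Other client IPs that share a user agent with `ip`."""
    agents = {e["agent"] for e in entries if e["ip"] == ip}
    return sorted({e["ip"] for e in entries if e["agent"] in agents and e["ip"] != ip})


entries = parse(SAMPLE_LINES)
timeline = timeline_for_ip(entries, "203.0.113.50")
related_ips = other_ips_with_agent(entries, "203.0.113.50")

for e in timeline:
    print(e["when"].isoformat(), e["method"], e["path"], e["status"])
print("Same user agent seen from:", related_ips)
```

Keep in mind that common browser user agents are shared by many legitimate visitors, so this second pivot is most useful when the agent string is unusual, such as a scripting library or a typo-ridden custom value.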
Since the overall concept of investigating IIS is very similar to investigating an Apache server, I'm going to keep this section short and sweet. As mentioned previously, you'll often see IIS associated with a Windows server. In this case, the default logging directory is typically C:\inetpub\logs\LogFiles. Here, you'll see one or more folders named 'W3SVC*'. Please note that each W3SVC folder represents one of the sites in your IIS Manager. If there are multiple sites hosted within IIS, there will be multiple folders.
As always, let's go take a look at this directory! In my case, I only have one site installed.
To get a better understanding of which site is assigned a particular ID, you'll need to open IIS Manager. Here, you can select your site and go to "Advanced Settings" to view the "ID". This ID corresponds to the number in the logging folder's name within 'inetpub\logs\LogFiles' (for example, site ID 1 logs to 'W3SVC1'). Why is this important? Well.. you want to make sure you're looking at the correct IIS logs for the compromised site in question!
So let's take a look at these IIS logs in a text editor for now. Keep in mind that a real environment will likely have very large IIS logs, and a text editor will not suffice. As mentioned previously, you'll want to parse them with log analysis software, such as Splunk, or with command-line searches, such as 'grep'.
Awesome! We can see a very similar format to what we saw on the Apache server. Here, we can see a suspicious HTTP 'GET' request to 'webshell.aspx'. Note that this returned a '404' response, which was just for my testing. However, this is likely what you'd see when investigating a suspicious request, IP, timeframe, etc. on your own production web server! From here, once you identify a suspect IP, timeframe, user agent, or request, you can pivot off of these and begin creating a timeline of what happened!
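For reference, here is a minimal Python sketch of reading that format. IIS writes W3C extended logs, where a '#Fields:' header line names the space-separated columns that follow, so the parser below reads the column names from the file itself instead of hardcoding them. The sample log content, IPs, and user agents are invented; the field list mirrors a typical IIS default, but yours may be configured differently. Note that IIS timestamps in this format are recorded in UTC by default.

```python
# Hypothetical IIS W3C log content for illustration only.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 10.0
#Version: 1.0
#Date: 2023-10-11 21:00:00
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
2023-10-11 21:00:12 192.0.2.10 GET /index.html - 80 - 198.51.100.7 Mozilla/5.0+(Windows+NT+10.0) - 200 0 0 15
2023-10-11 21:03:44 192.0.2.10 GET /webshell.aspx - 80 - 203.0.113.50 curl/8.1.2 - 404 0 2 31
"""


def parse_iis_log(lines):
    """Yield one dict per request, keyed by the names in the #Fields header."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header can reappear mid-file (e.g. after a service restart).
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split(" ")
            if len(values) == len(fields):  # skip truncated or odd lines
                yield dict(zip(fields, values))


entries = list(parse_iis_log(SAMPLE_LOG.splitlines()))

# Pivot: requests for .aspx files, with who asked and what the server answered.
suspicious = [e for e in entries if e["cs-uri-stem"].lower().endswith(".aspx")]
for e in suspicious:
    print(e["date"], e["time"], e["c-ip"], e["cs-method"], e["cs-uri-stem"], e["sc-status"])
```

In these logs 'c-ip' is the client and 's-ip' is the server, and spaces inside the user agent are replaced with '+', which is why a simple split on spaces works here.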
As I mentioned previously, this was just a test environment without much traffic in my examples, but the overall concept reflects the steps you'd take in a real case to investigate a compromised web server.