Leave me alone! Internet

February 5, 2025
Last modified 2/6/2025

I am not sure what you call the bots that just scour websites for content and, I assume, exploits. It's annoying. And expensive.

And honestly, I am not sure there's anything I could do about them. They have different IPs and I am not sure I should exclude countries, though maybe I could. 

I have zero idea how to stop them, and I am not concerned with exploits. Most of the time they're looking for WordPress exploits. It seems that they request certain files, like JS or CSS, and if the request succeeds then they've made contact. Then they can follow up with an attack on whatever plane they need to, probably SQL injection stuff on known plugins. I have seen a few SQL injection attempts, but not too many. There are other checks for js, min and even txt files that I don't recognize, but I am going to assume it's the same attack vector, since I have never had PHP, WP, Joomla or anything else but .NET on my sites. So.

What's the problem?

Two things: 

Monitoring - Run for Nothing

I get lied to by Azure when my alert goes off that I have had x errors within y minutes and shit's going down. I run to my computer and I see 300 requests for “/wp-content/xxxx” - WTF bots!? Not Azure's fault, but you know what I mean. It is comical to see the rudimentary vectors. They're just exploiting known, easy ways that some mediocre coding would prevent.

Monitoring - Cost

Depending on how you have your monitoring set up, these alerts and logging can actually get quite expensive. You see, they are making hundreds and thousands of requests on your site. Each one comes with logged requests, dependencies, errors, etc. All of it gets added to your beautiful Grafana chart at the price of something. Not me, I have the cheap App Insights charts :P

Monitoring - What Can You Do

So, luckily there is something you can do for known vectors that are not needed by you. While this is specifically for App Insights, I am sure you can apply the logic in whatever system you're using.

Exclude it. Just like that person you quit inviting to your kids' birthday parties - exclude the visitor or at least the single request. And the earlier the better.

Here's how I do it in App Insights using a TelemetryProcessor. I won't get into the weeds because there's a lot to it and there are really good docs for it, but basically:

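using Microsoft.ApplicationInsights.Channel;        // ITelemetry
using Microsoft.ApplicationInsights.Extensibility;  // ITelemetryProcessor

// Register the processor at startup
services.AddApplicationInsightsTelemetryProcessor<SuccessfulDependencyFilter>();

public class SuccessfulDependencyFilter : ITelemetryProcessor
{
   private ITelemetryProcessor Next { get; }

   // App Insights passes in the next processor in the chain
   public SuccessfulDependencyFilter(ITelemetryProcessor next)
   {
       this.Next = next;
   }

   public void Process(ITelemetry item)
   {
       // Returning without calling Next drops the item, so it never gets logged
       if (Ignore(item))
       {
           return;
       }

       this.Next.Process(item);
   }
}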

And inside of that we'll do some logic that determines whether we log the request.

Here are the parts that matter. A quick rundown of the code:

  • return true to ignore and false to log (continue as normal)
  • Check to make sure we have the properties we need
  • Decide what status codes we want to check for (e.g., 404)
  • Process it inside some specialized methods for operations, URLs, etc.

Here's an abbreviated version:

  if (item is RequestTelemetry requestTelemetry)
  {
      // HEAD requests: never worth logging
      if (requestTelemetry.Name?.StartsWith("HEAD") == true)
      {
          return true;
      }
      if (requestTelemetry.ResponseCode == "418")
      {
          return true;
      }
      // Make sure we have the properties we need
      if (requestTelemetry.Context != null && requestTelemetry.Context.Operation != null && requestTelemetry.ResponseCode != null)
      {
          // Only 4xx responses get checked against the ignore list
          if (requestTelemetry.ResponseCode.StartsWith("4"))
          {
             if (Ignorer.IgnoreUrl(requestTelemetry.Url?.LocalPath))
             {
                return true;
             }
          }
      }
  }

  // Anything not matched above gets logged as normal
  return false;

OK, now that we know which requests we want to process, what do we ignore? Pretty simple. There are actually a couple of things to check here, depending on whether we're checking a dependency, a page request or a method. Most things can just check the value against your known culprits, and it's as simple as:

if (value.Contains(theValue, StringComparison.OrdinalIgnoreCase))
{
    return true;
}

Here's my list of ignored requests and extensions. If your site needs to support these then of course you wouldn't ignore them. json and xml might be needed in your app, for example.

  • .php
  • .dat
  • .zip
  • .json
  • .xml
  • .sql
  • .git
  • .env
  • .com/js
  • RPC_OUT_DATA
  • rpxpproxy
  • autodiscover
  • wp-
    • wp-includes
    • wp-content

 

There are a few other things that I ignore. Super fast responses with no result codes. Not sure what these are, but I've had thousands fill up the log, and making me pay $2.30 per GB for them is sorta cruel. So…

if (dependencyTelemetry.Duration.TotalMilliseconds < 50 && string.IsNullOrWhiteSpace(dependencyTelemetry.ResultCode))
{
    return true;
}
if (dependencyTelemetry.ResultCode == "418") //this is cute. Did you know that this was a thing??
{
    return true;
}
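
If you want to poke at these rules without App Insights in the loop, one option is to pull them out into a small pure function. This is only a sketch under that assumption: DependencyRules and ShouldIgnore are hypothetical names, and the 50 ms threshold is simply the value from the snippet above.

```csharp
#nullable enable
using System;

// Hypothetical helper: the same two dependency checks, expressed on plain values.
public static class DependencyRules
{
    // Returns true to ignore the dependency, false to log it as normal.
    public static bool ShouldIgnore(TimeSpan duration, string? resultCode)
    {
        // Very fast calls that never produced a result code.
        if (duration.TotalMilliseconds < 50 && string.IsNullOrWhiteSpace(resultCode))
        {
            return true;
        }

        // 418 "I'm a teapot" responses.
        return resultCode == "418";
    }
}
```

Inside the processor this would be called as DependencyRules.ShouldIgnore(dependencyTelemetry.Duration, dependencyTelemetry.ResultCode), which keeps the thresholds in one place if you ever want to tune them.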

That's the gist. Get into the processor, look at what's happening and decide what to ignore. And while it doesn't stop the scammers and hackerz from getting into your stuff, it'll save you some moolah since your logs won't fill up!