Application Insights and Annoying Storage account 404s

July 30, 2022
azure application-insights billing

Application Insights fills up (and costs you $$$) when expected 404s are treated as exceptions. What's a dev to do?

There are some changes coming to the SDKs and libraries that will allow you to set whether you want a 404 to be an exception or not. Go read more about it - there are like 10 interrelated issues/threads on GitHub. This is the main one I am following, but not where I started: [LLC] Move RequestOptions into Core · Issue #1666 · Azure/autorest.csharp (github.com)

Application Insights is a pretty great tool. And, yes, I am sure the others are just as good (New Relic, Datadog, etc.). This is just the one that I use. However, it's not free (though it used to be a lot cheaper). And by not free I mean it can get really expensive if you slip. But there are a few things you can and should do.

I am not going to get into all of it right here. Maybe sometime in the future. I really just want to write about one thing. But it would be unfair not to mention the other options. Otherwise, it's like sticking your hand in a bag of Doritos and coming up with some chips, a handful of orange sticky dust and not knowing what to do next.

Sampling - don't send everything, send a percentage. All things being equal, this will save money. However, if you get mucho issues at the same time of day, this could hurt, because some of them will get sampled out. Also, this really kicks in when you get a lot of data at the same time. Which is nice. You may not need to know that you got 500 errors in one second because you flubbed a connection string. Not that I have ever done that. Maybe.

You can also exclude some telemetry types from sampling. Those types are always sent in full, so they can still bite you. I have NOT gone into great depth reading about the settings. So, go read about it.

Cost Cap - DEFINITELY DO THIS. DO THIS. DO THIS. Tell App Insights the max you want to take in per day. Set it to something. And remember, it's ~$2.00 per GB. So, if you keep the default of 100 GB/day, you can spend $200 a day on Insights logging. You notice three days later… 100 GB * $2.00 * 3 days = $600. Google it. Read people cry about it.
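To make that arithmetic easy to replay with your own numbers, here is a tiny sketch. The class and method names are made up for illustration, and the $2.00 per GB figure is just the rough number from above, not an official price, so check the current pricing for your region.

```csharp
// Hypothetical helper for illustration only; not part of any Azure SDK.
public static class CapMath
{
    // Worst case: the daily cap is hit every single day before anyone notices.
    public static decimal WorstCaseSpend(decimal dailyCapGb, decimal pricePerGb, int days)
    {
        return dailyCapGb * pricePerGb * days;
    }
}
```

Calling CapMath.WorstCaseSpend(100m, 2.00m, 3) gives the same 600 as above. Drop the cap to 1 GB/day and the same three days of not noticing cost about $6.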

What I did want to write about…

OK - now what I wanted to say and bitch about.

I love storage. It's fast and cheap. It's a great replacement for DB calls that you cache anyway for things like settings, small lists (or biggish), etc. I use Redis. But why worry about outdated cache entries (how/when to expire them) for data that can go in storage tables and get returned in milliseconds? Yes, this is a win. Read poor man's cache…(need link)

But what sucks is that checking for an entry that might not exist (and you're OK with that) results in a 404. The storage service follows proper REST practices, and as such the SDK then throws an exception. Even if you catch and swallow it, it shows up. Something to do with the way they call it internally. And now App Insights is telling you that you have issues when you don't. I say “you don't” assuming that it's OK to have 404s, like checking for a user's settings that may not be set (which is OK).
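Here is roughly what that "catch and swallow" looks like. This is a sketch that assumes the Azure.Data.Tables package (the post does not depend on which storage SDK you use), and the SettingsLookup class and TryGetAsync method are names I made up for the example.

```csharp
using System.Threading.Tasks;
using Azure;
using Azure.Data.Tables;

// Hypothetical helper; assumes the Azure.Data.Tables client library.
public static class SettingsLookup
{
    // Returns null when the row does not exist instead of letting the exception escape.
    public static async Task<TableEntity> TryGetAsync(TableClient table, string partitionKey, string rowKey)
    {
        try
        {
            Response<TableEntity> response = await table.GetEntityAsync<TableEntity>(partitionKey, rowKey);
            return response.Value;
        }
        catch (RequestFailedException ex) when (ex.Status == 404)
        {
            // Swallowed here, but the HTTP call itself already came back 404,
            // so the dependency call can still be recorded as a failure.
            return null;
        }
    }
}
```

Your code is perfectly happy with the null, but the telemetry was collected around the HTTP call, before your catch block ever ran. That is why the fix has to happen on the telemetry side.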

This sucks. We want to ignore these, and catching is not good enough. Argggg. We use App Insights, so let's fix it there. ITelemetryProcessor to the rescue.

ITelemetryProcessor

This thingamajig will run with every request and for anything that passes through app insights. So let's plug it in:

    

    // START UP (inside ConfigureServices)
    services.AddApplicationInsightsTelemetry();
    services.AddApplicationInsightsTelemetryProcessor(typeof(ShouldIIgnoreYou));

    // A NEW CLASS
    using System;
    using Microsoft.ApplicationInsights.Channel;
    using Microsoft.ApplicationInsights.DataContracts;
    using Microsoft.ApplicationInsights.Extensibility;

    public class ShouldIIgnoreYou : ITelemetryProcessor
    {
        private ITelemetryProcessor Next { get; }

        public ShouldIIgnoreYou(ITelemetryProcessor next)
        {
            Next = next;
        }

        public void Process(ITelemetry item)
        {
            // Returning without calling Next drops the item.
            if (Ignore(item))
            {
                return;
            }
            Next.Process(item);
        }

        private static bool Ignore(ITelemetry item)
        {
            // 1. Anything that belongs to a HEAD operation.
            string operationName = item.Context?.Operation?.Name;
            if (operationName != null && operationName.StartsWith("HEAD", StringComparison.Ordinal))
            {
                return true;
            }

            // 2. Incoming requests that ended in a 404.
            if (item is RequestTelemetry requestTelemetry)
            {
                if (requestTelemetry.ResponseCode == "404")
                {
                    return Ignorer.IgnoreOperation(operationName) || Ignorer.IgnoreUrl(operationName);
                }
            }
            // 3. Outgoing calls (storage) that ended in a 404.
            //    Match on the dependency's own Name, not the operation name.
            else if (item is DependencyTelemetry dependencyTelemetry)
            {
                if (dependencyTelemetry.ResultCode == "404")
                {
                    return Ignorer.IgnoreOperation(dependencyTelemetry.Name);
                }
            }
            return false;
        }
    }

    // The lists of things to ignore live in their own static class.
    public static class Ignorer
    {
        public static bool IgnoreOperation(string name)
        {
            if (name == null)
            {
                return false;
            }
            // GET only: a 404 on PATCH/POST is something I want to hear about.
            return name.StartsWith("GET storage_account_name/table_name1", StringComparison.OrdinalIgnoreCase) ||
                   name.StartsWith("GET storage_account_name/table_name2", StringComparison.OrdinalIgnoreCase);
        }

        public static bool IgnoreUrl(string url)
        {
            if (url == null)
            {
                return false;
            }
            return url.Contains("AUTODISCOVER", StringComparison.OrdinalIgnoreCase) ||
                   url.Contains(".php", StringComparison.OrdinalIgnoreCase) ||
                   url.Contains(".zip", StringComparison.OrdinalIgnoreCase) ||
                   url.Contains(".json", StringComparison.OrdinalIgnoreCase) ||
                   url.Contains(".xml", StringComparison.OrdinalIgnoreCase) ||
                   url.Contains(".sql", StringComparison.OrdinalIgnoreCase) ||
                   url.Contains(".git", StringComparison.OrdinalIgnoreCase) ||
                   url.Contains(".com/js", StringComparison.OrdinalIgnoreCase) ||
                   url.Contains("RPC_OUT_DATA", StringComparison.OrdinalIgnoreCase) ||
                   url.Contains("rpcproxy", StringComparison.OrdinalIgnoreCase);
        }
    }

That's the complete code. Let me walk you through what's going on here. Once it is registered in your startup, it runs for every telemetry item (requests, dependencies, and so on) before that item gets sent.

If Ignore(item) returns true, it short-circuits the Application Insights flow and nothing is tracked. If it's not ignored, we call Next.Process(item) and it continues until eventually it is logged. Thus, this processor allows us to check if we want to log it and, if we don't, stop it. bwahahaha.
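If the "don't call Next and it disappears" part feels like magic, here is the same idea stripped down to plain C#. Everything in this sketch is a stand-in I made up so it compiles without the SDK: IItemProcessor plays the role of ITelemetryProcessor, a string plays the role of the telemetry item, and RecordingSink pretends to be the part that finally sends data.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for ITelemetryProcessor (hypothetical, for illustration only).
public interface IItemProcessor
{
    void Process(string item);
}

// End of the chain: "sends" an item by recording it.
public sealed class RecordingSink : IItemProcessor
{
    public List<string> Sent { get; } = new List<string>();

    public void Process(string item) => Sent.Add(item);
}

// A filter link: drops items the predicate matches, forwards the rest.
public sealed class DropIf : IItemProcessor
{
    private readonly IItemProcessor _next;
    private readonly Func<string, bool> _ignore;

    public DropIf(IItemProcessor next, Func<string, bool> ignore)
    {
        _next = next;
        _ignore = ignore;
    }

    public void Process(string item)
    {
        if (_ignore(item))
        {
            return; // never calling _next is what drops the item
        }
        _next.Process(item);
    }
}

public static class ChainDemo
{
    public static List<string> Run()
    {
        var sink = new RecordingSink();
        IItemProcessor chain = new DropIf(sink, name => name.StartsWith("HEAD", StringComparison.Ordinal));
        chain.Process("HEAD /index"); // dropped by the filter
        chain.Process("GET /index");  // forwarded to the sink
        return sink.Sent;
    }
}
```

After ChainDemo.Run(), the sink holds only "GET /index". The real pipeline works the same way: each processor gets handed the next one in its constructor, and whoever returns early ends the trip for that item.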

In the Ignore method is where things get a wee bit interesting. I had this wrong for a while, because the documentation only shows RequestTelemetry.

I am actually checking for three different things here.

First: I hate HEAD errors. I never care. I will never care. Do. Not. Tell. Me. I do not want to waste $.003 on storing this tidbit. So, check and exit if it's HEAD.

Second, if it's RequestTelemetry, which is a normal Request, I want to ignore some URLs. I have found these to come from external services, bots, monsters, Russian Cyber Hackerz and whatever else. I don't have these files, and I do not care that they're 404. I just want to move on with life. I do have an entry for .com/js, which is only there because my files are in content, but there are a number of scrapers that look for that folder while searching for vulnerabilities based on your libraries. Yeah, I don't care.

Third: The reason we started this convo. We need to check for DependencyTelemetry (not the request). This one got me for a while. Once we have the correct telemetry type, we can check it. But HOLD UP. Don't use Operation Name, like we did for the request. This will still have the request operation name, because the dependency call is part of that operation (e.g., /index). What you want here is the Name. This looks like “GET storage_account_name/table_name” and we can quickly check for it.

And that's it. You should get less noise in your telemetry and save some moolah.

A few things to note:

I used StartsWith("GET storage_account…") for the operation because I do not want to ignore PATCH/POST operations, since a 404 there could be an issue I need to know about. If I am trying to update an item that does not exist, I should probably have some code to do something about it.
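A quick way to convince yourself the prefix check does what you want is to pull it out and poke at it. This sketch uses the same test as IgnoreOperation, with the prefix passed in. PrefixCheck and IsIgnoredRead are names I made up, and the storage account and table names are the same placeholders as above.

```csharp
using System;

// Hypothetical helper mirroring the IgnoreOperation check.
public static class PrefixCheck
{
    // True only when the dependency name starts with the given "GET account/table" prefix.
    public static bool IsIgnoredRead(string dependencyName, string prefix)
    {
        return dependencyName != null
            && dependencyName.StartsWith(prefix, StringComparison.OrdinalIgnoreCase);
    }
}
```

With the prefix "GET storage_account_name/table_name1", the name "GET storage_account_name/table_name1" is ignored, "PATCH storage_account_name/table_name1" is not, and "get STORAGE_ACCOUNT_NAME/table_name1" is ignored because of the ignore-case comparison. One thing to watch: a prefix match on table_name1 also matches a table called table_name10, so pick prefixes with that in mind.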

Would getting rid of ignore case help performance? Maybe, I need to check. For now, though, better safe than sorry. Azure storage is case sensitive, so the name should never be anything other than what we expect. But what if I write it wrong in one place or the other? This will keep me safe.

 

Reference Material:

ITelemetryProcessor Interface (Microsoft.ApplicationInsights.Extensibility) - Azure for .NET Developers | Microsoft Docs

Filtering and preprocessing in the Application Insights SDK - Azure Monitor | Microsoft Docs

Implement App Insights Telemetry Processor in Azure Functions - Developer Support (microsoft.com)