Let's examine an Azure Function that takes a string input request and subsequently serves that same data to the web for rendering. Assume the text below is the input:
"<img src=x onerror=\"confirm('System Compromised')\">"
Without sanitization, a page that inserts this value into its markup will not display it as text. The browser tries to load the image x, fails, and runs the onerror handler, which executes the JavaScript. This is the crux of an XSS vulnerability: a more elaborate script could steal session cookies, redirect the user, or rewrite the page.
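To make the problem concrete, here is a minimal sketch of how the payload ends up in markup. The RenderDemo class and its method names are hypothetical and only illustrate the idea; they are not part of the Azure Function discussed in this post.

```csharp
using System;
using System.Net;

// Hypothetical helper, for illustration only.
static class RenderDemo
{
    // Unsafe: the input is concatenated straight into markup,
    // so the browser parses it as an element with an onerror handler.
    public static string RenderUnsafe(string input) => $"<div>{input}</div>";

    // Safer: HTML-encode the input so the browser shows it as plain text.
    public static string RenderEncoded(string input) => $"<div>{WebUtility.HtmlEncode(input)}</div>";
}

static class Program
{
    static void Main()
    {
        const string payload = "<img src=x onerror=\"confirm('System Compromised')\">";
        Console.WriteLine(RenderDemo.RenderUnsafe(payload));
        Console.WriteLine(RenderDemo.RenderEncoded(payload));
    }
}
```

The first line printed still contains a live img tag, while the second contains only escaped text.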
The first step in preventing XSS is to validate all incoming input. In this blog, I will explore two validation techniques for detecting unsafe input data: Model Validation using Data Annotations and Fluent Validation. I will demonstrate both using Azure Functions version 4 running on .NET 7.
Model Validation:
This validation method uses .NET's Data Annotations from the System.ComponentModel.DataAnnotations namespace to set validation rules for model properties.
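The first step is to add the DataAnnotations package to your solution. You can do this either via the NuGet Package Manager in Visual Studio or by using any of the command lines found here. (On .NET 7 the namespace is normally available without an extra package, so this step may only be needed for older targets.)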
Although Data Annotations provide some basic property validation out of the box, we will create our own custom validation attribute for more flexibility in detecting XSS-formatted data. We will develop a class called XssValidatorAttribute, which inherits from the base class ValidationAttribute. As part of this implementation, we override the methods IsValid() and FormatErrorMessage(). To keep this example simple, we will define safe (non-XSS) data as containing only numbers, letters, and spaces.
using System;
using System.ComponentModel.DataAnnotations;
using System.Globalization;
using System.Text.RegularExpressions;
namespace DataSanitization.Validators
{
    [AttributeUsage(AttributeTargets.Property | AttributeTargets.Field, AllowMultiple = false)]
    public sealed class XssValidatorAttribute : ValidationAttribute
    {
        // Placeholder: the real check is added in the next step.
        public override bool IsValid(object? obj)
        {
            bool result = true;
            return result;
        }
        // Builds the message reported when validation fails.
        public override string FormatErrorMessage(string name)
        {
            return String.Format(CultureInfo.CurrentCulture,
                ErrorMessageString, name);
        }
    }
}
Next, we add a helper method, IsXssSafe(), which checks the content of the property for any XSS-formatted data, such as HTML or JavaScript.
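// Built once and reused across calls.
private static readonly Regex SafeInput = new Regex(@"^[A-Za-z0-9 ]+$", RegexOptions.Compiled);
internal bool IsXssSafe(string? input)
{
    // Null or empty input fails this rule.
    if (string.IsNullOrEmpty(input))
        return false;
    // Custom rule: expect letters, digits and spaces only.
    return SafeInput.IsMatch(input);
}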
This method can now be invoked as part of the IsValid() method.
public override bool IsValid(object? obj)
{
    // Only string values are checked; null and other types pass this rule.
    if (obj is string input)
    {
        return IsXssSafe(input);
    }
    return true;
}
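As a quick way to see what the allow-list rule accepts and rejects, the following standalone sketch applies the same pattern to a few sample strings. The AllowListDemo class and the strict variant are assumptions added for illustration, not part of the attribute above. One detail worth knowing: in .NET, `$` also matches just before a trailing newline, so a value such as "abc\n" passes the original pattern, whereas `\z` anchors at the true end of the string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; mirrors the article's rule for illustration.
static class AllowListDemo
{
    // Same pattern as the article: letters, digits and spaces only.
    private static readonly Regex ArticleRule = new Regex(@"^[A-Za-z0-9 ]+$");

    // Assumed stricter variant: \z does not tolerate a trailing newline.
    private static readonly Regex StrictRule = new Regex(@"^[A-Za-z0-9 ]+\z");

    public static bool IsXssSafe(string? input) =>
        !string.IsNullOrEmpty(input) && ArticleRule.IsMatch(input);

    public static bool IsXssSafeStrict(string? input) =>
        !string.IsNullOrEmpty(input) && StrictRule.IsMatch(input);
}

static class Program
{
    static void Main()
    {
        string[] samples =
        {
            "Hello World 123",
            "<img src=x onerror=\"confirm(1)\">",
            "O'Brien",
            ""
        };
        foreach (var sample in samples)
        {
            Console.WriteLine($"[{sample}] safe: {AllowListDemo.IsXssSafe(sample)}");
        }
    }
}
```

Note that the rule is deliberately narrow: a legitimate value such as O'Brien is rejected as well, which is the price of keeping the example simple.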
Next, we create our test model, which represents the input JSON from the API call. This is a plain class, and you can apply Data Annotation attributes to its properties to define any validation you wish.
To validate against the XSS format, we apply the newly created custom attribute XssValidator and give it an error message to return on failure.
using System.ComponentModel.DataAnnotations;
using DataSanitization.Validators;
namespace DataSanitization.Models
{
    public class TestAnnotationModel
    {
        [Required(ErrorMessage = "Requires Input")]
        [StringLength(100, MinimumLength = 2)]
        [XssValidator(ErrorMessage = "Xss character(s) detected")]
        public string Input { get; set; }
    }
}
Finally, we can use the custom validator to examine each request input. We will implement an HTTP-triggered Azure Function called ValidateUsingModelValidation. It returns the input data if no XSS format is detected and returns the validation errors if an XSS format is found.
To invoke the data validation, we call Validator.TryValidateObject(), which is part of the System.ComponentModel.DataAnnotations namespace, and pass in the input data once we have deserialized it from the request.
using System.ComponentModel.DataAnnotations;
using System.Net;
using DataSanitization.Models;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Microsoft.Extensions.Logging;
namespace DataSanitization.Functions
{
    public class ValidateUsingModelValidation
    {
        private readonly ILogger _logger;
        public ValidateUsingModelValidation(ILoggerFactory loggerFactory)
        {
            _logger = loggerFactory.CreateLogger<ValidateUsingModelValidation>();
        }
        [Function(nameof(ValidateUsingModelValidation))]
        public async Task<HttpResponseData> Run([HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestData req)
        {
            var response = req.CreateResponse(HttpStatusCode.OK);
            try
            {
                // Deserialize the input request; malformed JSON throws and is handled below.
                var dataModel = await req.ReadFromJsonAsync<TestAnnotationModel>();
                if (dataModel is null)
                {
                    await response.WriteAsJsonAsync("Request body is required");
                    response.StatusCode = HttpStatusCode.BadRequest;
                    return response;
                }
                // Validate the model against its Data Annotation attributes.
                var results = new List<ValidationResult>();
                var isValid = Validator.TryValidateObject(
                    instance: dataModel,
                    validationContext: new ValidationContext(dataModel),
                    validationResults: results,
                    validateAllProperties: true);
                if (isValid)
                {
                    await response.WriteAsJsonAsync(dataModel);
                }
                else
                {
                    // Return the validation errors; the status is set after writing the body.
                    await response.WriteAsJsonAsync(results);
                    response.StatusCode = HttpStatusCode.BadRequest;
                }
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Model validation request failed");
                await response.WriteAsJsonAsync(ex.Message);
                response.StatusCode = HttpStatusCode.InternalServerError;
            }
            return response;
        }
    }
}
When testing this with Postman, if our input contains XSS-formatted characters, HTML, or JavaScript, the program will return an error message.
In a similar test, when using normal input data, the Azure function generates a 200 response and returns the appropriate response data.
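The same check can also be exercised without Postman or the Functions host. The sketch below is a self-contained console program; LettersDigitsSpacesAttribute, SampleModel, and ModelCheck are stand-ins invented so the sample compiles on its own, and they only mirror the behavior of the attribute and model described above.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Linq;
using System.Text.RegularExpressions;

// Stand-in for the article's XssValidatorAttribute (assumed equivalent rule).
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
sealed class LettersDigitsSpacesAttribute : ValidationAttribute
{
    // Non-string values pass; strings must match the allow-list pattern.
    public override bool IsValid(object? value) =>
        value is not string s || Regex.IsMatch(s, @"^[A-Za-z0-9 ]+$");
}

// Stand-in for TestAnnotationModel.
sealed class SampleModel
{
    [Required(ErrorMessage = "Requires Input")]
    [StringLength(100, MinimumLength = 2)]
    [LettersDigitsSpaces(ErrorMessage = "Xss character(s) detected")]
    public string? Input { get; set; }
}

static class ModelCheck
{
    // Runs all Data Annotation attributes and returns the collected failures.
    public static IReadOnlyList<ValidationResult> Validate(object model)
    {
        var results = new List<ValidationResult>();
        Validator.TryValidateObject(model, new ValidationContext(model), results, validateAllProperties: true);
        return results;
    }
}

static class Program
{
    static void Main()
    {
        foreach (var input in new[] { "Hello World", "<script>alert(1)</script>" })
        {
            var errors = ModelCheck.Validate(new SampleModel { Input = input });
            var summary = errors.Count == 0
                ? "valid"
                : string.Join("; ", errors.Select(e => e.ErrorMessage));
            Console.WriteLine($"{input}: {summary}");
        }
    }
}
```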
Fluent Validation:
We will implement the exact same validation logic, but this time with the FluentValidation package.
As with Data Annotations, we have a model, but this one contains no validation-related attributes. That is one of the nice things about Fluent Validation: it keeps the validation logic separate from the data object.
namespace DataSanitization.Models
{
    public class TestFluentModel
    {
        public string Input { get; set; }
    }
}
Next, we will create a custom XSS validator class called XssFluentValidator. This class inherits from the AbstractValidator class provided by FluentValidation.
using DataSanitization.Models;
using FluentValidation;
namespace DataSanitization.Validators
{
    public class XssFluentValidator : AbstractValidator<TestFluentModel>
    {
        public XssFluentValidator()
        {
            RuleFor(x => x.Input)
                .NotEmpty()
                .WithMessage("Requires Input")
                .MaximumLength(100)
                .MinimumLength(2)
                .Matches(@"^[A-Za-z0-9 ]+$") // Allow-list rule: anything else is treated as XSS format.
                .WithMessage("Xss character(s) detected");
        }
    }
}
With Fluent Validation, we can register our newly created validator with any dependency injection library, such as the native one provided by Microsoft.Extensions.DependencyInjection.
However, I chose to use the extension library that accompanies FluentValidation, called DependencyInjectionExtensions, in the FluentValidation namespace. Its AddValidatorsFromAssemblyContaining method offers automatic registration, significantly reducing the scaffolding work. For more information on how to use this, you can visit this link.
Program.cs:
using DataSanitization.Validators;
using FluentValidation;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
var host = new HostBuilder()
    .ConfigureFunctionsWorkerDefaults()
    .ConfigureServices(services =>
    {
        services.AddApplicationInsightsTelemetryWorkerService();
        services.ConfigureFunctionsApplicationInsights();
        // Automatic registration of every validator in this assembly.
        services.AddValidatorsFromAssemblyContaining<XssFluentValidator>(ServiceLifetime.Transient);
    })
    .Build();
host.Run();
At this point, our custom validator has been created and registered, and it is ready to be used through dependency injection wherever appropriate. In our case, we will employ this in an HTTP Triggered Azure function, similar to how we implemented the Data Model validation approach.
using System.Net;
using DataSanitization.Models;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Microsoft.Extensions.Logging;
using FluentValidation;
namespace DataSanitization.Functions
{
    public class ValidateUsingFluentValidation
    {
        private readonly ILogger _logger;
        private readonly IValidator<TestFluentModel> _validator;
        public ValidateUsingFluentValidation(ILoggerFactory loggerFactory, IValidator<TestFluentModel> validator)
        {
            _logger = loggerFactory.CreateLogger<ValidateUsingFluentValidation>();
            _validator = validator;
        }
        [Function(nameof(ValidateUsingFluentValidation))]
        public async Task<HttpResponseData> Run([HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestData req)
        {
            var response = req.CreateResponse(HttpStatusCode.OK);
            // Deserialize the request body.
            var dataModel = await req.ReadFromJsonAsync<TestFluentModel>();
            if (dataModel is null)
            {
                await response.WriteAsJsonAsync("Request body is required");
                response.StatusCode = HttpStatusCode.BadRequest;
                return response;
            }
            // Validate using Fluent Validation.
            var validationResult = await _validator.ValidateAsync(dataModel);
            if (!validationResult.IsValid)
            {
                // Group by property: one property can fail several rules,
                // and a plain ToDictionary would throw on the duplicate key.
                var errors = validationResult.Errors
                    .GroupBy(x => x.PropertyName)
                    .ToDictionary(g => g.Key, g => g.Select(x => x.ErrorMessage).ToArray());
                await response.WriteAsJsonAsync(errors);
                response.StatusCode = HttpStatusCode.BadRequest;
            }
            else
            {
                await response.WriteAsJsonAsync(dataModel);
            }
            return response;
        }
    }
}
Using Postman to make a similar API call, we get a comparable result.
Model Validation vs Fluent Validation:
While Data Annotations are still widely used, for example in MVC-based projects, Fluent Validation has been gaining traction due to its flexible conditional validation on properties and its unit test-friendly approach.
Overall, I find myself having an easier time writing more complex validations with Fluent Validation than with Data Annotation. The biggest advantage I find is the ability to separate validation rules from models or entities in my projects and the ease of writing unit tests.
In my opinion, data annotation excels at:
Superb support for client-side validation without needing to repeat validation rules.
Condensing all validation rules in one place.
While Fluent Validation excels at:
Conditional validation required on multiple properties.
Offering more fine-grained control of validation rules.
Separating models and their validation rules.
Making unit testing easier to write, and working with mocking a breeze.
Presenting validation syntax in a more human context, which in turn makes the code clearer and easier to understand.
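To illustrate the unit-testing point, here is a minimal sketch that instantiates a validator directly and checks its result, with no host or dependency injection involved. It assumes the FluentValidation NuGet package is referenced; SampleInput and SampleInputValidator are hypothetical names that only mirror the shape of the classes shown earlier.

```csharp
using System;
using FluentValidation;

// Hypothetical model and validator, mirroring the earlier example.
public class SampleInput
{
    public string? Input { get; set; }
}

public class SampleInputValidator : AbstractValidator<SampleInput>
{
    public SampleInputValidator()
    {
        RuleFor(x => x.Input)
            .NotEmpty()
            .Matches(@"^[A-Za-z0-9 ]+$")
            .WithMessage("Xss character(s) detected");
    }
}

static class Program
{
    static void Main()
    {
        // The validator is a plain object, so a test can create it directly.
        var validator = new SampleInputValidator();
        foreach (var input in new[] { "Hello World", "<script>alert(1)</script>" })
        {
            var result = validator.Validate(new SampleInput { Input = input });
            var summary = result.IsValid ? "valid" : result.Errors[0].ErrorMessage;
            Console.WriteLine($"{input}: {summary}");
        }
    }
}
```

In a real test project the Console.WriteLine calls would be replaced by assertions from whichever test framework is in use.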
Data Encoding:
I would be remiss not to mention this crucial aspect. In many situations where user experience is a priority, we encode the data instead of rejecting the input. In these cases, we have no choice but to accept the data.
Nonetheless, it is our responsibility to encode it before saving or even returning it as a response.
I'll demonstrate how we use the Data Annotations validator from earlier to detect XSS-format data and then encode it to a safer format before returning the response or saving it to a database.
Microsoft offers a handy method called AddWebEncoders() within the package Microsoft.Extensions.WebEncoders. It adds HtmlEncoder, JavaScriptEncoder, and UrlEncoder to the service container, which can then be injected into any part of your code.
To do so, we only need to install the package and register it with our services in Program.cs:
//Add HtmlEncoder, JavaScriptEncoder and UrlEncoder to the service container
services.AddWebEncoders();
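To get a feel for what each encoder produces, the sketch below calls the static Default instances from System.Text.Encodings.Web directly, outside of dependency injection. It assumes the injected encoders behave like these defaults when no custom settings are configured; EncoderDemo is a hypothetical helper name.

```csharp
using System;
using System.Text.Encodings.Web;

// Hypothetical helper wrapping the three built-in encoders.
static class EncoderDemo
{
    // For text placed inside HTML markup.
    public static string ForHtml(string input) => HtmlEncoder.Default.Encode(input);

    // For text placed inside a JavaScript string literal.
    public static string ForJavaScript(string input) => JavaScriptEncoder.Default.Encode(input);

    // For text placed inside a URL component.
    public static string ForUrl(string input) => UrlEncoder.Default.Encode(input);
}

static class Program
{
    static void Main()
    {
        const string payload = "<img src=x onerror=\"confirm('System Compromised')\">";
        Console.WriteLine(EncoderDemo.ForHtml(payload));
        Console.WriteLine(EncoderDemo.ForJavaScript(payload));
        Console.WriteLine(EncoderDemo.ForUrl(payload));
    }
}
```

Each encoder targets a different output context, so the right choice depends on where the value will eventually be written.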
Next, we inject these encoders as dependencies into the Azure Function where we want to use them.
private readonly ILogger _logger;
private readonly HtmlEncoder _htmlEncoder;
private readonly UrlEncoder _urlEncoder;
private readonly JavaScriptEncoder _javascriptEncoder;
public EncodeInputData(
    ILoggerFactory loggerFactory,
    HtmlEncoder htmlEncoder,
    UrlEncoder urlEncoder,
    JavaScriptEncoder javaScriptEncoder)
{
    _logger = loggerFactory.CreateLogger<EncodeInputData>();
    _htmlEncoder = htmlEncoder;
    _urlEncoder = urlEncoder;
    _javascriptEncoder = javaScriptEncoder;
}
Now, within the function, we can encode any data we choose. In this specific case, all I did was check if the input contained XSS-formatted data, and then I encoded it before sending back the response.
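[Function(nameof(EncodeInputData))]
public async Task<HttpResponseData> Run([HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestData req)
{
    var response = req.CreateResponse(HttpStatusCode.OK);
    try
    {
        // Deserialize the input request; malformed JSON throws and is handled below.
        var dataModel = await req.ReadFromJsonAsync<TestAnnotationModel>();
        if (dataModel is null)
        {
            await response.WriteAsJsonAsync("Request body is required");
            response.StatusCode = HttpStatusCode.BadRequest;
            return response;
        }
        // Validate the model against its Data Annotation attributes.
        var results = new List<ValidationResult>();
        var isValid = Validator.TryValidateObject(
            instance: dataModel,
            validationContext: new ValidationContext(dataModel),
            validationResults: results,
            validateAllProperties: true);
        if (isValid)
        {
            await response.WriteAsJsonAsync(dataModel);
        }
        else
        {
            // Encode the data: the serialized model is HTML-encoded as a whole,
            // so the caller receives an encoded string rather than a JSON object.
            var jsonString = JsonSerializer.Serialize(dataModel);
            var encodedResponse = _htmlEncoder.Encode(jsonString);
            await response.WriteAsJsonAsync(encodedResponse);
        }
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Encoding request failed");
        await response.WriteAsJsonAsync(ex.Message);
        response.StatusCode = HttpStatusCode.InternalServerError;
    }
    return response;
}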
The Postman HTTP POST call shows the response as fully encoded, making any script or HTML ineffective in enabling an XSS attack.
Conclusion:
In this blog, I've demonstrated two methods for validating your input data against XSS attacks. Based on the validation results, we can either reject the request immediately or encode the content of the malicious data to safely serve it back to the caller or pass it on to the next component in our application.