Securing data and users with Django and the Django REST Framework

A scene from the Bayeux Tapestry depicting Bishop Odo rallying Duke William's army during the Battle of Hastings in 1066

Threats come from everywhere. Apps must defend against hostile actors and accidental mishaps alike, while still delivering a great user experience, scaling well and shipping at an ever faster rate. With that in mind, I offer this update to a previous article about using libraries to ensure data safety, "Why Bark when you have a Dog?", this time for Python's Django and the Django REST framework, which I use for a significant proportion of my work. I mainly cite Django, as the REST framework works broadly the same way except that it presents APIs rather than web pages and uses serializers instead of forms.

There are many aspects to robust apps and web services exposed to the internet, so this article only covers the core parts of Django with respect to data integrity. You are advised to keep up to date with all the frameworks you use and to follow good practice as it evolves.

Django's main points of contact with the outside world are incoming requests via a web URI, communication with a database, restricting users to their own data, and data presentation.

There are also many protections built around HTTP that defend both end users and web services against hostile actors, including:

  • Content Security Policy;
  • Strict Transport Security;
  • X Frame Options;
  • Cross Site Request Forgery protection.
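
As a rough sketch of how some of these protections are switched on, the settings fragment below uses Django's standard middleware and security settings. The values are illustrative assumptions rather than recommendations for every site. Content Security Policy is left out because how it is configured depends on the Django version and on any third-party package in use.

```python
# settings.py (fragment). Values are illustrative assumptions; tune them per site.

MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",           # HSTS and HTTPS redirect
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    "django.middleware.csrf.CsrfViewMiddleware",               # CSRF protection
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    "django.contrib.messages.middleware.MessageMiddleware",
    "django.middleware.clickjacking.XFrameOptionsMiddleware",  # X-Frame-Options header
]

SECURE_SSL_REDIRECT = True             # redirect plain HTTP requests to HTTPS
SECURE_HSTS_SECONDS = 3600             # assumed starting value; raise once HTTPS is proven
SECURE_HSTS_INCLUDE_SUBDOMAINS = True  # apply Strict Transport Security to subdomains too
X_FRAME_OPTIONS = "DENY"               # never allow the site to be framed
SESSION_COOKIE_SECURE = True           # only send the session cookie over HTTPS
CSRF_COOKIE_SECURE = True              # only send the CSRF cookie over HTTPS
```

Running python manage.py check --deploy reports which of these settings a project is still missing.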

The golden rule of Web application security is to never trust data from untrusted sources. As users can make mistakes as well as be compromised, it's good practice to also check data from trusted sources.

Incoming URIs

Most users interact with a website by using a browser with a URI such as https://example.com/blog/2023/MyBlog/.

Although a URI is always text, it's often the cause of errors. Fortunately, Django's URL dispatcher can capture parameters using path converters, which validate each captured segment and pass the converted value (int, uuid, etc.) to a view.

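from django.urls import path
from . import views

urlpatterns = [
    # Each converter validates its segment and passes the converted value to the view as a keyword argument
    path('articles/<int:blog_id>/', views.blog),
    path('articles/<int:year>/<int:month>/<slug:slug>/', views.article_detail),
    path('document/<int:doc_id>/', views.document),
]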
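
When the built-in converters are not strict enough, Django also accepts custom ones. The sketch below follows the converter interface (a regex attribute plus to_python and to_url methods). The converter name, the "yyyy" label and views.year_archive are hypothetical, and the convert helper only imitates the dispatcher so that the behaviour can be tried without a Django project.

```python
import re


class FourDigitYearConverter:
    """Hypothetical converter that only matches exactly four digits, e.g. 2023."""

    regex = r"[0-9]{4}"

    def to_python(self, value):
        # Called with the matched text; the view receives the returned int.
        return int(value)

    def to_url(self, value):
        # Used when reversing a URL from a Python value.
        return "%04d" % value


def convert(converter, segment):
    """Imitate the dispatcher: converted value on a match, None otherwise."""
    if re.fullmatch(converter.regex, segment) is None:
        return None
    return converter.to_python(segment)


# In a real urls.py (requires Django; names below are assumptions):
#   from django.urls import path, register_converter
#   register_converter(FourDigitYearConverter, "yyyy")
#   urlpatterns = [path("articles/<yyyy:year>/", views.year_archive)]

print(convert(FourDigitYearConverter(), "2023"))
print(convert(FourDigitYearConverter(), "20x3"))
```

A segment that fails the regex never reaches the view, so the request falls through to the next pattern or ends in a 404.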

URLs can safely carry integers, UUIDs or slugs, and the view receives them already converted, as in the following example:

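from django.http import HttpResponse
from django.utils.html import format_html

def blog(request, blog_id):
    doc = Blog.objects.get(pk=blog_id)
    # format_html() escapes doc.document, so stored text cannot inject markup
    html = format_html("<html><body>{}</body></html>", doc.document)
    return HttpResponse(html)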

A more comprehensive example using a template:

document.html

<html>
<head></head>
<body>
<div>User: {{ first_name }} {{ last_name }}</div>
<h1>{{ doc.title }}</h1>
<div>{{ doc.body|safe }}</div>
</body>
</html>

And the view that uses it:

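from django.contrib.auth.decorators import login_required
from django.shortcuts import get_object_or_404, render

@login_required  # the template shows the user's name, so require a logged-in user
def document(request, doc_id):
    doc = get_object_or_404(Doc, pk=doc_id)  # 404 instead of an unhandled DoesNotExist
    context = {
        'doc': doc,
        'first_name': request.user.first_name,
        'last_name': request.user.last_name,
    }
    return render(request, 'document.html', context)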

Thanks to Django's URL dispatcher and Path converter, parameters sent from the URL to the view can be relied on to be the type specified in the urlconf, so doc_id is an integer.

HTML uses ordinary characters for markup: <p>Hello</p> formats the word Hello as a paragraph. By default, Django's template engine escapes special characters such as < into &lt; so that they display as text and cannot alter the document's structure.
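
The effect of escaping can be seen with Python's standard library alone. This is only an illustration of the transformation: Django's template engine applies its own escaping automatically, and html.escape is used here so the snippet runs without a Django project.

```python
from html import escape

user_text = "<p>Hello</p>"

# Special characters become entities, so a browser shows the tags as text
# instead of interpreting them as markup.
escaped = escape(user_text)
print(escaped)  # &lt;p&gt;Hello&lt;/p&gt;
```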

When text is already correctly formatted as HTML, for example the document body above, it can be marked "safe" so that it is used as-is without any transformation. Both CharField and TextField can hold any type of text, so it's imperative that user input which may later be used as-is is sanitised before it's saved. If you trust your users, then an editor such as TinyMCE may be sufficient. If not, server-side filtering may be a solution.
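
To give a feel for what server-side filtering means, here is a deliberately small allowlist filter built on the standard library. It is a sketch under stated assumptions, not a production sanitiser: the ALLOWED_TAGS set and the sanitise name are my own, every attribute is dropped, and a maintained sanitising library is the better choice for real user content.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlist for illustration; anything else is stripped.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li"}


class AllowlistSanitiser(HTMLParser):
    """Keep allowlisted tags (without attributes) and escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes such as onclick are dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(escape(data))  # text is always re-escaped


def sanitise(html_text):
    """Return html_text reduced to allowlisted tags and escaped text."""
    parser = AllowlistSanitiser()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


print(sanitise('<p onclick="x()">Hi</p><script>alert(1)</script>'))
```

The script element and the onclick attribute are removed, leaving the paragraph and the script's contents as harmless text. Filtering like this belongs at the point where input is saved, so that anything later marked "safe" in a template has already been cleaned.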

Communicating with a database

If part of a URL, such as a slug, is used to look up a database record, Django does the right thing and prevents SQL injection.

doc = Doc.objects.get(slug=slug)

Similarly, if search_str is free text supplied by the user, Django again does the right thing. Note that filter() is used here because several titles may match:

docs = Doc.objects.filter(title__startswith=search_str)

What if you need to use your own SQL to query the database? Raw SQL can be used safely if you:

  • Ensure your statement is appropriate, as no statement checking is done when using raw SQL;
  • Use placeholders and leave them unquoted to avoid SQL injection attacks. Django then escapes the values for you.

An example raw SQL query:

docs = Doc.objects.raw('SELECT * FROM myapp_document WHERE title = %s', [search_str])
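
The difference placeholders make is easy to demonstrate with the standard library's sqlite3 module. This is a stand-alone illustration of the principle, not Django code: sqlite3 uses ? as its placeholder where Django's raw() uses %s, and the table and hostile input are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO document (title) VALUES (?)",
    [("Hastings",), ("Bayeux",)],
)

# Hostile input that tries to turn the WHERE clause into "always true".
search_str = "nothing' OR '1'='1"

# Unsafe: the input is pasted into the statement and changes its meaning.
unsafe_sql = "SELECT title FROM document WHERE title = '%s'" % search_str
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is passed separately and is only ever treated as a value.
safe_rows = conn.execute(
    "SELECT title FROM document WHERE title = ?", (search_str,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The formatted statement returns every row in the table, while the parameterised one returns nothing because no title equals the hostile string.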

Warning: For MySQL-like database engines that do silent type coercion, ensure data types are either correctly matched or typecast before using the value in a query to avoid unexpected results. For example, if your table contains the values 'abc' and 'def' and you query for WHERE mycolumn=0, both rows will match.

Restricting users to their own data

Information about the user is passed to every view with every request and can be checked with request.user.is_authenticated.

It's easy to ensure users only get access to their own data:

doc = Doc.objects.get(pk=doc_id, user=request.user)

It gets more complicated if a group of users are authorised to access a subset of data, but the principle remains the same. Use the request user to check against any intermediate tables such as organisation or group.
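
The principle can be sketched in plain Python with stand-in classes. User, Organisation, Doc and docs_for below are hypothetical and only mimic the shape of the data. With the ORM the same check would usually be a filter across the relation, along the lines of Doc.objects.filter(organisation__members=request.user), where the field names are again assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    username: str


@dataclass
class Organisation:
    name: str
    members: set = field(default_factory=set)


@dataclass
class Doc:
    title: str
    organisation: Organisation


def docs_for(user, docs):
    """Return only the documents whose organisation lists the user as a member."""
    return [doc for doc in docs if user in doc.organisation.members]


odo, harold = User("odo"), User("harold")
normans = Organisation("Normans", {odo})
saxons = Organisation("Saxons", {harold})
all_docs = [Doc("Invasion plan", normans), Doc("Shield wall", saxons)]

print([doc.title for doc in docs_for(odo, all_docs)])
```

The membership check starts from the requesting user every time, so a user who guesses another document's id still gets nothing back.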

If you only need to control access to features per app or model, then Django's built-in permissions system may be useful.

The Django REST framework supports several authentication schemes, with OAuth2 available through third-party packages, and sets the same request.user as Django. Depending on the authentication policy it may also set request.auth.

Data validation

If you accept user input, then forms and serializers make it easier and safer, both by linking input fields to data models and by validating the data. The raw data is also available in request.GET, request.POST or, for DRF, request.data, though this should be treated with care.
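
The pattern that forms and serializers automate is raw strings in, then either typed and cleaned data or a set of errors out. The plain-Python sketch below shows that pattern only. The clean_article function, its field names and its error messages are hypothetical. In Django itself you would declare the fields on a form, call form.is_valid() and read form.cleaned_data.

```python
from datetime import date


def clean_article(raw):
    """Return (cleaned, errors) for a dict of raw string inputs."""
    cleaned, errors = {}, {}

    # Text field: required, trimmed, with an assumed maximum length.
    title = raw.get("title", "").strip()
    if not title:
        errors["title"] = "This field is required."
    elif len(title) > 200:
        errors["title"] = "Ensure this value has at most 200 characters."
    else:
        cleaned["title"] = title

    # Date field: converted from text to a real date object or rejected.
    try:
        cleaned["published"] = date.fromisoformat(raw.get("published", ""))
    except ValueError:
        errors["published"] = "Enter a valid date (YYYY-MM-DD)."

    return cleaned, errors


cleaned, errors = clean_article({"title": " Hastings ", "published": "1066-10-14"})
print(cleaned, errors)
```

Code further down the line then works only with the cleaned values and never touches the raw request data.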

Data presentation

Templating in Django makes it easy to present data safely to users.

As long as templates are secure, users can only see what's presented to them. Nevertheless, it's good practice to expose only the minimum information to the template system.

If a template were inadvertently altered, it could expose sensitive system settings such as passwords, as well as user information:

Don't do this!

context = {
    'settings': settings,
    'user': request.user,
}
return render(request, 'myapp/index.html', context)

Instead restrict the information sent to the template:

context = {
    'first_name': request.user.first_name,
    'last_name': request.user.last_name,
    'email': request.user.email,
    'site_title': settings.SITE_TITLE,
}
return render(request, 'myapp/index.html', context)

Summary

Django, the REST framework and Python offer many tools and techniques to quickly build safe and easy-to-use apps. Threats evolve, as do the tools and frameworks we use. Keeping up to date, incorporating good practice, and using automated and manual tools such as Django's built-in checks, linters and automated test tools are all essential parts of keeping users and data safe.

Stay safe, stay ahead.