Views and Stored Procedures in PostgreSQL - What are they and when to use them for increasing user and database efficiency with examples

Looking for an edge delivering more data insights? This article is about easy ways to empower users by implementing database views and stored procedures, offering insights into how and why, their respective advantages, plus real life examples to get even greater efficiency for your users. Demonstrating use cases for views and stored procedures, with comparison of features. Based on PostgreSQL, it's applicable to any database with support for views and procedures.

Time is an illusion. Lunchtime doubly so.
Douglas Adams in The Hitchhiker's Guide to the Galaxy

Before diving in, let's define what we mean by efficiency

Efficient adj: Making, Causing to be; Productive of effects; Adequately skilled and active.

In this article, efficiency can mean:

  • Efficiency of people's time - fast and responsive;
  • Efficiency of people's time - ease of use and re-use;
  • Efficiency of machine resources;
  • Scalability.

Worthy of note

"Efficiency of people's time" has two meanings, one that produces results quickly, "responsive", and one that although often slower, is easier to learn, use and re-use.

"Efficiency of resources", especially when tempered with "User Experience" often leads to significant "Efficiency of people".

"Scalability" is one step further in ensuring that what works for a few, can be made to work for the many.

There is a natural tension between all these meanings of efficiency, so any technical decision is best made in conjunction with clearly defining and balancing business objectives.

Business context

To find that edge, your staff need easy access to data! Making data accessible via Views or Procedures, offers easier ways to discover more about your users, products and services and reduce the time it takes to surface new opportunities.

With an imperative to continually ask questions and reduce time to market, it's essential to make access to information easy and intuitive.

Individual transactions either for a sales force or website, responses must be sub second.

Ad-hock reports, when predefined / packaged can benefit end users with familiarity and ease of use with, for example, Microsoft Access, Excel or LibreOffice Base, Calc.

Reports can be prepared ahead of time such as "How many products did we sell yesterday?", or immediate "What's my commission on that last sale?".

Making core or complex information easily available, reduces both barriers to research and time to market delivering that competitive edge.

Technical choices

With ample material elsewhere on optimising and organising data, that aspect is outside the scope of this article.

Views are a way of packaging up complex queries into something more manageable. There are many uses for views, this article focuses on benefits including performance, avoiding re-inventing common queries, aiding best practice and increasing productivity through re-use and understanding. Views can be used with other tables and views and are usually intended to be filtered and combined in a flexible manner.

Procedures, sometimes called stored procedures are highly optimised and can include complex business logic. They are often used as discrete components, though can be combined with other procedures, queries and views.

Although clean separation of responsibilities can suggest data be stored on a database and all processing done in an application, there are clear performance and ease of use advantages with database side processing. This article demonstrates and offers insight into those advantages.

Database views

As their name suggests, they are a packaged view into data. Often combined with other views and tables, they are flexible and can filter, join, manipulate and sort data. Great for generic or specific queries, perfect for packaging up common queries, with reduced complexity, making use and re-use easy. The following example is taken from a schema used in an open source project to manage DMARC feedback reports.

DMARC is a standard offering additional security for emails. Django-DMARC helps with implementation by capturing and reporting on feedback reports. The tables are optimised for the data they contain and consist of a reporter, a DMARC feedback record, and a SPF and DKIM results table aliased here to spf_dmarc_result and dkim_dmarc_result.

Django DMARC schema

The majority of reports use just one query to filter and present data, it makes sense to package to a view for ease of use and re-use.

The following is the code to create the view. As can be seen, the query is complex and some columns have been re-named to improve user experience.

SQL to create a view

CREATE VIEW view_dmarc_report AS
SELECT
  dmarc_reporter.org_name,
  dmarc_reporter.email,
  dmarc_report.date_begin,
  dmarc_report.date_end,
  dmarc_report.policy_domain,
  dmarc_report.policy_adkim,
  dmarc_report.policy_aspf,
  dmarc_report.policy_p,
  dmarc_report.policy_sp,
  dmarc_report.policy_pct,
  dmarc_report.report_id,
  dmarc_record.source_ip,
  dmarc_record.recordcount,
  dmarc_record.policyevaluated_disposition,
  dmarc_record.policyevaluated_dkim,
  dmarc_record.policyevaluated_spf,
  dmarc_record.policyevaluated_reasontype,
  dmarc_record.policyevaluated_reasoncomment,
  dmarc_record.identifier_headerfrom,
  spf_dmarc_result.record_type AS spf_record_type,
  spf_dmarc_result.domain AS spf_domain,
  spf_dmarc_result.result AS spf_result,
  dkim_dmarc_result.record_type AS dkim_record_type,
  dkim_dmarc_result.domain AS dkim_domain,
  dkim_dmarc_result.result AS dkim_result
FROM dmarc_reporter
INNER JOIN dmarc_report
ON dmarc_report.reporter_id = dmarc_reporter.id
INNER JOIN  dmarc_record
ON dmarc_record.report_id = dmarc_report.id
LEFT OUTER JOIN dmarc_result AS spf_dmarc_result
ON spf_dmarc_result.record_id = dmarc_record.id
AND spf_dmarc_result.record_type = 'spf'
LEFT OUTER JOIN dmarc_result AS dkim_dmarc_result
ON dkim_dmarc_result.record_id = dmarc_record.id
AND dkim_dmarc_result.record_type = 'dkim';

Using a view

Using a view is incredible simple, for example to obtain a summary of dispositions in January, simply use the following sql:

SELECT   policyevaluated_disposition, count(*)
FROM     view_dmarc_report
WHERE    date_begin >= '2017-01-01'
AND      date_end < '2017-02-01'
GROUP BY policyevaluated_disposition
ORDER BY policyevaluated_disposition;

The complexity remains out of the way in the view, any unused columns and tables are further optimised by the database to return just the information required.

By packaging up complex queries into a view they are easier to use. There are dozens of features and an infinite number of reasons to use them, see Resources to explore further.

Comments

Highly efficient query creating a flat representation of the dmarc feedback record.

As an end user request, any extra tables, views, joins and filters are sent to the database for processing where it will be optimised, returning only the requested rows and columns.

View still too slow/complex? PostgreSQL offers "Materialised views" with flexible data caching providing significantly faster access.

End users can use, re-use, filter, slice and dice the data as required using familiar tools such as Microsoft Excel and Access, and Libreoffice Calc and Base etc.

Cost of developing this view was minimal. It is almost re-creating and directly representing the feedback record.

Procedures and complex functions in PostgreSQL

When requirements are complex or need additional processing, procedures can be ideal. A complete programming language in their own right, and in this article native PostgreSQL PL/pgSQL. Most major databases include choices for programming languages so you can use whichever best fits your requirements; for a list/comparison see resources. With very close access to data they are incredibly fast. Closely tied to data, it can make sense investing in optimisation to process information through procedures, presenting output in a more easily digested format.

Highly optimised, stored procedures are usually used as discrete components, though they can be combined with other tables and views if required. Often used to generate specific reports or data sets, they are frequently used in single data panels or as a focused aspect to a report combined with other data. They can be easily consumed by Microsoft Excel, Access, Libreoffice Calc, Base and similar tools.

In this article's example, bookings for a room, apartment or facility are calculated to provide utilisation analysis by month for each unit. The output is presented as follows.

UnitJFMAMJJASOND
Unit 1 22 18 24 24 26 28 24 20 18 22 24 24
Unit 2 24 26 22 26 24 26 18 20 22 24 26 28
Unit 3 27 22 24 29 23 18 15 17 16 18 16 14

The database schema is straightforward with two tables, one for units, and another for bookings.

Although conceptually simple, getting at the data requires some calculations that are easier to program into a client side app or in this example a server side stored procedure. The desired outcome is to tabulate the number of booked days by month. As can be seen from the example stored procedure or PostgreSQL function that follows, it takes a year as a parameter, defines what will be returned, creates a few variables, a temporary table, then populates the temporary table with data for each month, then returns the final output table joined with unit information, in this case just a title.

Comments

Highly efficient stored procedure/function creating a tabulated representation of unit utilisation by month for a year.

As all the processing is close to the data, it is very fast. Unlike a view, all the data will be processed regardless of any WHERE filters, though the returned data can be filtered.

This function was designed for output to a dashboard, end users can re-use, filter, slice and dice the output as required using common data imports from Microsoft Excel, Access, Libreoffice Calc and Base etc.

Cost of designing this procedure is greater due to the complexity and additional testing.

SQL to create a PostgreSQL function (stored procedure)

CREATE OR REPLACE FUNCTION public.sp_utilisation_unit(year integer)
  RETURNS TABLE(
    id integer,
    title character varying,
    days_01 integer, days_02 integer, days_03 integer,
    days_04 integer, days_05 integer, days_06 integer,
    days_07 integer, days_08 integer, days_09 integer,
    days_10 integer, days_11 integer, days_12 integer
  ) AS
$BODY$
DECLARE
  d_start date := '1970-01-01';
  d_end date := '1970-01-01';
  d_end_calc date := '1970-01-01';
  i_month integer := 0;
BEGIN
  BEGIN
    -- Create temp table to hold utilisation
    DROP TABLE IF EXISTS temp_utilisations;
    CREATE TEMP TABLE temp_utilisations (
      id integer,
      month integer,
      days integer
    ) ON COMMIT DROP;
    END;
    BEGIN
    FOR i_month IN 1..12 LOOP
      -- Calculate dates
      -- d_start = Beginning of the month
      SELECT make_date(year, i_month, 1)
      INTO d_start;
      -- d_end = End of the month
      SELECT make_date(year, i_month, 1) + interval '1 month' - interval '1 day'
      INTO d_end;
      -- d_end_calc = Beginning of the next month
      -- so that checkout - checkin works
      SELECT make_date(year, i_month, 1) + interval '1 month'
      INTO d_end_calc;
      -- Select and save monthly data
      INSERT INTO temp_utilisations (
        id,
        month,
        days)
      SELECT
      b.unit_id,
      i_month,
      sum(CASE
        WHEN b.checkout > d_end THEN d_end_calc
        ELSE b.checkout
        END -
        CASE
        WHEN b.checkin < d_start THEN d_start
        ELSE b.checkin
        END) AS days
      FROM my_booking AS b
      WHERE b.checkin BETWEEN d_start AND d_end
      OR    b.checkout BETWEEN d_start AND d_end
      GROUP BY b.unit_id;
    END LOOP;
    END;
    RETURN QUERY
      SELECT a.id, a.title,
      CAST(sum(u1.days) AS integer) AS days_01,
      CAST(sum(u2.days) AS integer) AS days_02,
      CAST(sum(u3.days) AS integer) AS days_03,
      CAST(sum(u4.days) AS integer) AS days_04,
      CAST(sum(u5.days) AS integer) AS days_05,
      CAST(sum(u6.days) AS integer) AS days_06,
      CAST(sum(u7.days) AS integer) AS days_07,
      CAST(sum(u8.days) AS integer) AS days_08,
      CAST(sum(u9.days) AS integer) AS days_09,
      CAST(sum(u10.days) AS integer) AS days_10,
      CAST(sum(u11.days) AS integer) AS days_11,
      CAST(sum(u12.days) AS integer) AS days_12
      FROM my_units AS u
      LEFT JOIN temp_utilisations as u1
      ON u1.id = u.id AND u1.month = 1
      LEFT JOIN temp_utilisations as u2
      ON u2.id = u.id AND u2.month = 2
      LEFT JOIN temp_utilisations as u3
      ON u3.id = u.id AND u3.month = 3
      LEFT JOIN temp_utilisations as u4
      ON u4.id = u.id AND u4.month = 4
      LEFT JOIN temp_utilisations as u5
      ON u5.id = u.id AND u5.month = 5
      LEFT JOIN temp_utilisations as u6
      ON u6.id = u.id AND u6.month = 6
      LEFT JOIN temp_utilisations as u7
      ON u7.id = u.id AND u7.month = 7
      LEFT JOIN temp_utilisations as u8
      ON u8.id = u.id AND u8.month = 8
      LEFT JOIN temp_utilisations as u9
      ON u9.id = u.id AND u9.month = 9
      LEFT JOIN temp_utilisations as u10
      ON u10.id = u.id AND u10.month = 10
      LEFT JOIN temp_utilisations as u11
      ON u11.id = u.id AND u11.month = 11
      LEFT JOIN temp_utilisations as u12
      ON u12.id = u.id AND u12.month = 12
      GROUP BY u.title, u.id
      ORDER BY u.title;
END;$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100
  ROWS 1000;
 

Using a PostgreSQL procedure in Django

Consuming this data is very easy.

The following Django function consumes the stored procedure, then calls a template for presentation

def utilisation_unit_year(request, year=None):
    """Show apartment utilisation"""
    if year is None:
        year = date.year
    # Go fetch utilisation
    sql = 'SELECT * FROM sp_utilisation_unit(%s);'
    cursor = connection.cursor()
    cursor.execute(sql, [year])
    data = cursor.fetchall()

    context = {
        "data": data,
    }
    return render(request, 'myapp/utilisation.html', context)

The following template snippet from utilisation.html is all that is required for presentation.

    <table>
    {% for d in  data %}
    <tr>
    {% for c in  d %}
    <td>{% if c  %}{{ c }}{% else %}-{% endif %}</td>
    {% endfor %}
    </tr>
    {% endfor %}
    </table>

Summary

It's essential to clearly define requirements and objectives.

Making core or complex information easily available, reduces both barriers to research and time to market.

Optimising frequently used or time specific information can make a difference to sales feedback and incentives with significant results.

Views tend to be used for packaging core or complex queries into something easily more easily used and re-used than original queries. These views can be both generic and optimised making them ideal for both regular and ad-hock requests.

Procedures are, by their nature more complicated to produce, they are a great way to process complex data incredibly fast at source, and making end presentation easier.

Views and Procedures add simplicity and ability to extend databases making them faster and easier to use and re-use than ever. There are many technical and business uses, and the reader is encouraged to further explore the listed resources.

Feature comparisonViewsProcedures
Ease of use Easy Easy – moderate
Efficiency High High
Development cost Moderate High
Task specific Moderate High
Scope for optimisation Possible Limited
Use with views Yes Yes
Use with procedures Yes Yes

This feature comparison is intended as a guide, actual results depend on implementation. If in doubt, measure and test!

PostgreSQL is a freely available open source object relational database management system. Using SQL the structured query language used by most databases to create views and the many procedural languages to create user defined functions, it's possible to achieve almost anything. Other databases such as Oracle, IBM DB2, Microsoft SQL Server, Sybase and MySQL are just some of the many databases making it easier to surface information in your data.