Flask

From Training Material
Revision as of 10:57, 28 February 2018 by Apatrakov (talk | contribs) (→‎File upload security ⌘)
title
Web development with Flask
author
Alexander Patrakov

Let's start from the basics ⌘

  • WSGI
  • Purpose of web frameworks
  • Available web frameworks
  • Place of Flask in the ecosystem

HTTP protocol ⌘

  • HTTP/1.1 is defined by RFC 7230 - RFC 7235
  • Browsers are the most popular clients
  • There are non-browser clients
  • Web servers deliver HTML, images, JS, CSS, downloadable files, other content types

HTTP Request ⌘

GET /html/rfc7240 HTTP/1.1
Host: tools.ietf.org
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.90 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch, br
Accept-Language: en-US,en;q=0.8,ru;q=0.6
Cookie: __cfduid=d105aee619396e6187d19c60cb032f4b71479467417

HTTP Response ⌘

HTTP/1.1 200 OK
Server: Apache/2.2.22 (Debian)
Last-Modified: Sun, 13 Nov 2016 10:39:20 GMT
ETag: "3cc378-c260-5412c55f8017f;54191604efade"
Accept-Ranges: bytes
Cache-Control: max-age=604800
Expires: Fri, 25 Nov 2016 11:15:54 GMT
Strict-Transport-Security: max-age=3600
Content-Length: 12952
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
<other headers omitted for brevity>

<!DOCTYPE html ...

Python web applications ⌘

  • WSGI is the interface between the web server and the web application
  • The web server calls a callable for every request
    • usually "application", but the name is configurable
  • It should take two parameters: environ and start_response
    • environ is a dict with CGI-like parameters
      • also contains the "wsgi.input" stream used for POST content
    • start_response(status, response_headers, exc_info=None) is a callback
      • should be used by the application to set the response headers and status code
  • It should return an iterable which supplies pieces of content
  • WSGI is not good for websockets!

Simplest Python web application ⌘

(use it with Apache's mod_wsgi, or run from command line)

#!/usr/bin/env python3

from wsgiref.simple_server import make_server

def application(environ, start_response):
    status = '200 OK'
    # WSGI response bodies must be byte strings (a plain str breaks on Python 3)
    output = b'Hello World!'

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]

if __name__ == "__main__":
    httpd = make_server('', 8000, application)
    httpd.serve_forever()

Exercise: try the raw WSGI ⌘

  • Can you print the complete WSGI environment to the browser?
  • Can you handle two different URLs? ("Hello world" and environment)
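One possible way to approach the exercise, as a sketch rather than a reference solution. The application dispatches on PATH_INFO; the /env path and the call_app helper are names made up for this illustration. The helper calls the application directly, without a web server, which is convenient for experiments.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # Dispatch on the request path: "/" says hello, "/env" dumps the environment
    path = environ.get('PATH_INFO', '/')
    if path == '/':
        status, body = '200 OK', 'Hello World!'
    elif path == '/env':
        lines = ['%s = %r' % (key, value) for key, value in sorted(environ.items())]
        status, body = '200 OK', '\n'.join(lines)
    else:
        status, body = '404 Not Found', 'Not found'

    output = body.encode('utf-8')  # WSGI bodies are bytes
    start_response(status, [('Content-Type', 'text/plain; charset=utf-8'),
                            ('Content-Length', str(len(output)))])
    return [output]


def call_app(app, path):
    """Hypothetical helper: call a WSGI app directly and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the mandatory WSGI keys
    environ['PATH_INFO'] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], body
```

For example, call_app(application, '/env') returns the status line together with the environment dump as bytes.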

Tasks of a web framework ⌘

  • Routing
  • Sessions
  • Response formatting
  • Input validation
  • Database communication
  • Structuring your application

Available web frameworks ⌘

  • Batteries included
    • Self-contained and comprehensive
    • Examples: Django, TurboGears, web2py
  • Best of breed
    • Just a thin layer of glue between other libraries
    • Examples: Flask, Pyramid, CherryPy, Bottle

Flask as compared to other web frameworks ⌘

  • Minimalistic
  • Commonly used for APIs and single-page applications
    • Traditional web applications with forms are also possible
  • Non-MVC by default - does not supply the model layer
    • But you can support MVC in your application
  • Modern and actively developed
  • Relies on established technologies via third-party libraries
  • Does not constrain the developer in terms of application architecture

Flask criticisms ⌘

  • You are the only architect of your application
    • There is no scaffolding
    • With great power comes great responsibility
  • Many extensions of varying quality and fitness for your purpose
    • Sometimes it is easier to code the necessary functionality from scratch
  • Sad story about user authentication

Installation ⌘

  • Flask is best installed into a virtualenv
virtualenv3 flask-env
cd flask-env
. bin/activate
pip install flask
  • Also included as a package in Debian and Ubuntu
    • Likely outdated, but still works
    • Not recommended unless you absolutely need something that never changes
  • Flask supports both Python2 and Python3

Flask is a glue ⌘

  • Glues together:
    • Werkzeug for HTTP request and response abstractions, as well as the interactive debugger
    • Jinja2 as the templating system
    • Itsdangerous for secure cookies (used for sessions)
  • Provides:
    • Structuring your application into a set of blueprints
    • URL routing for blueprints
    • Error handling
    • Configuration system

Hello World ⌘

  • Save as app.py
  • FLASK_APP=app.py flask run
from flask import Flask

application = Flask(__name__)

@application.route('/')
def index():
    return u'Hello, World!'

Return value of a view function ⌘

  • Any of the following:
    • A ready-made flask.Response object
      • Use make_response to create it, and then you can post-process it
    • Unicode string
      • Will be converted to UTF-8 and served as text/html
    • Byte string
      • Will be served as text/html
    • A tuple of (response, status, headers) or (response, headers)
    • A WSGI application, ready to be called

More routing ⌘

  • View functions can take arguments extracted from the URL path
  • Example:
@app.route('/news/<slug>')
def news(slug):
    pass  # slug is a string that must not contain slashes
  • It is also possible to convert some types like int:
@app.route('/user/<int:userid>')
def show_user(userid):
    pass  # userid is an integer

Acceptable converters ⌘

  • General syntax: <type(args):variable_name>
    • string, int, float, uuid: do the obvious thing
      • string is just the default converter
      • uuid values are converted to uuid.UUID objects, so you can be sure they are valid
    • path: string that can contain slashes
    • any(foo,bar,baz): accepts only foo, bar or baz
  • See Werkzeug documentation for all acceptable parameters
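The converters can be tried out with a small sketch like the following. The routes and view names are invented for this example; a URL that does not match the converter simply produces a 404.

```python
from flask import Flask

app = Flask(__name__)


@app.route('/user/<int:userid>')
def show_user(userid):
    # userid arrives as an int
    return u'user %d' % userid


@app.route('/files/<path:subpath>')
def show_file(subpath):
    # path accepts slashes, unlike the default string converter
    return u'file %s' % subpath


@app.route('/section/<any(foo,bar,baz):name>')
def section(name):
    # only the three listed values are routed here
    return u'section %s' % name
```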

About trailing slashes ⌘

  • By default, if you include a trailing slash in your route, Flask will generate a redirect if it sees the same URL without a slash
  • To turn off:
app = Flask(__name__)
app.url_map.strict_slashes = False

Flask extensions ⌘

  • Extensions add support for common tasks (like connecting to a database)
    • Good question: why do we need an extension?
    • Answer: to reduce boilerplate code, e.g. connecting to the database at the beginning of each request.
  • Here is an example of how to use one:
from flask import Flask
from flask_peewee.db import Database

# General pattern:
# 1. Create app
app = Flask(__name__)
# 2. Add configuration specific to the extension
app.config['DATABASE'] = {
    'name': 'example.db',
    'engine': 'peewee.SqliteDatabase',
}
# 3. Instantiate the extension, passing app as a parameter
db = Database(app)

Common extensions ⌘

  • Internationalization: Flask-BabelEx
  • Dealing with file uploads: Flask-Uploads
  • Processing forms: Flask-WTF
  • Interfacing with databases: Flask-SQLAlchemy and Flask-Peewee
  • Creating REST-like APIs: Flask-RESTful (general case), Flask-Restless (for SQLAlchemy-managed database records)
  • Enough for now

Do we need an extension to use something? ⌘

  • Good question!
  • Case study: Flask-WTF
    • WTForms are framework-neutral
    • Flask-WTF connects Flask-specific Request class to WTForms
      • so WTForms can get the data from there automatically
    • Same for file uploads
  • Case study: Flask-Peewee
    • Written by the same author as the Peewee ORM itself
    • The Peewee page documents the recommended way to use Peewee from Flask
      • Without the extension!
      • All you need is to connect to the database at the beginning of the request and disconnect at the end
      • The extension includes some extra functionality, but this extra functionality is deprecated

Templates ⌘

  • Templates are incomplete HTML pages, which require post-processing
    • values have to be inserted
    • pieces such as table rows have to be repeated
    • some parts may be displayed conditionally
    • there is a hierarchy of inheritance
  • Templates live in the templates directory

Rendering templates ⌘

  • At the top of the file: from flask import render_template
  • In the view function: render_template(template_name_or_list, **context)
    • context provides extra variables for the template to use
    • request, g, session and app.config (as config) are always available

Template syntax ⌘

  • Jinja2 is the underlying engine
  • Write whatever HTML is needed - it will be delivered to the browser
  • To display a variable, write {{ variable }}
    • attribute access and array element access also work
    • some functions (e.g. url_for) are also available
  • Conditional printing: {% if foo %} ... {% else %} ... {% endif %}
  • Looping: {% for a in b %} ... {{ a }} ... {% endfor %}
  • There are macros
    • Equivalent of functions in other languages

Macros ⌘

  • Best explained via example from Flask-Security
  • In _macros.html:
{% macro render_field_with_errors(field) %}
  <p>
    {{ field.label }} {{ field(**kwargs)|safe }}
    {% if field.errors %}
      <ul>
      {% for error in field.errors %}
        <li>{{ error }}</li>
      {% endfor %}
      </ul>
    {% endif %}
  </p>
{% endmacro %}
  • In other templates:
{% from "security/_macros.html" import render_field_with_errors %}
{{ render_field_with_errors(login_user_form.email) }}
{{ render_field_with_errors(login_user_form.password) }}
{{ render_field_with_errors(login_user_form.remember) }}

XSS attacks ⌘

  • XSS = Cross-Site Scripting
  • Interpretation of malicious data as HTML (or, worse, JavaScript) by the browser
    • Reflected XSS: data from the query string are interpreted as HTML
    • Stored XSS: data from the database are interpreted as HTML
  • Typical exploitation scenario: write a "blog post" that also sends document.cookie to attacker when viewed by admin
    • Admin's session is now hijacked!
  • Solution: convert data from text to HTML
    • Also known as "escaping"

CSRF attacks ⌘

  • CSRF = Cross-Site Request Forgery
    • A hacker makes a form on his website
    • The button says "View my photos"
    • The form actually submits to your bank (in hope that you are logged in) and asks it to transfer $100 to the hacker
  • WTForms protect you against CSRF
    • They include a hidden field in all forms and check it against a cookie
    • The hacker can't see or modify the cookie, so can't guess what should be in the hidden field
    • Don't forget to render the CSRF protection field: {{ form.hidden_tag() }}

How variables are escaped ⌘

  • Autoescape is on for all templates with .html, .htm, .xml and .xhtml extensions
  • With autoescape:
    • If it is an object with an __html__ attribute, then this attribute is called as a function, and the result is used as-is
    • Otherwise, the object is converted to unicode and then the result is escaped
    • The logic is in the markupsafe.escape function
    • If you already have a piece of HTML in a string, make a Markup object from it and print it
      • Use unicode for text (that needs escaping) and Markup for HTML fragments (that don't need escaping)
    • Alternatively, use this inside the template: {{ something|safe }}
  • Without autoescape, all variables are simply converted to unicode
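The difference between text and Markup can be seen directly with the markupsafe package (installed as a Jinja2 dependency). This is a small sketch with made-up sample strings, not a description of everything the template engine does.

```python
from markupsafe import Markup, escape

user_text = u'<script>alert(1)</script>'      # untrusted text, needs escaping
html_fragment = Markup(u'<em>trusted</em>')   # HTML you produced yourself

# Plain text is escaped
escaped_text = escape(user_text)

# A Markup object has __html__, so escape() returns it unchanged
untouched_fragment = escape(html_fragment)

# Formatting a Markup template escapes the inserted arguments
combined = Markup(u'<p>%s</p>') % user_text
```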

How to format objects for JavaScript ⌘

  • var myobj = {{ mydict }}; won't work
    • Reason: you want to transform to JSON, not to string with HTML escaping
  • This pattern works:
<script><!--
var myobj = {{ mydict|tojson }};
--></script>
  • With Flask before 0.10, you also needed to add |safe at the end

How to provide some variables to all templates ⌘

  • Common use cases: user name, cart item count, ...
    • They are usually displayed in the navigation bar and thus should be always available
  • Solution: write a context processor
@app.context_processor
def inject_user():
    return dict(user=current_user)

Template inheritance ⌘

  • You definitely don't want to write the whole HTML for every dynamic page
    • There are too many common parts, e.g. the general theme
    • Solution: create a base template and extend it
  • In the base template, mark the replaceable parts with {% block foo %}...{% endblock foo %}
    • The block tag both denotes a "pluggable hole" and the default content there
    • Child templates can override the content
    • Sometimes it is a good idea to create blocks both inside and outside certain HTML tags
      • So that you can either fill in the tag with the relevant content, or remove it completely
  • At the top of the child template, put {% extends "base.html" %}
    • Then, write the overrides for some blocks: {% block foo %}...{% endblock foo %}
    • {{ super() }} inside the overridden block gets back the default content
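Template inheritance can be tried without any files by giving Jinja2 an in-memory loader. This is a sketch: the template contents are invented, and in a real Flask application the templates would live in the templates directory and be rendered with render_template.

```python
from jinja2 import Environment, DictLoader

templates = {
    # Base template: two "pluggable holes", one of them with default content
    'base.html': (
        '<title>{% block title %}Default{% endblock title %}</title>'
        '<main>{% block content %}{% endblock content %}</main>'
    ),
    # Child template: overrides both blocks, reusing the default title
    'child.html': (
        '{% extends "base.html" %}'
        '{% block title %}{{ super() }} - Child{% endblock title %}'
        '{% block content %}Hello{% endblock content %}'
    ),
}

env = Environment(loader=DictLoader(templates), autoescape=True)
rendered = env.get_template('child.html').render()
```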

Static files ⌘

  • Images, javascript and css files, fonts, ...
  • Not processed by Python
  • Put them into the static folder
  • Flask will serve them from the /static URL
    • But you should really configure your web server to do that instead
    • And please don't hard-code /static in URLs
      • Please use url_for('static', filename='...') instead

Good starting points ⌘

  • Minimal:
  • Popular:
    • Does not need any introduction: Bootstrap
      • There is a Flask-Bootstrap extension that contains some handy functions
    • The most advanced responsive CSS framework: Foundation
      • This thing evolves very fast, beware!

What about static content like help pages? ⌘

  • Help pages cannot be static files
    • Navigation bar still contains dynamic content
    • Still, a simple solution for them all is wanted
  • What you probably want is the Flask-FlatPages extension
    • Serves flat pages from Markdown-based static files
    • The extension itself only returns a rendered piece of HTML for static content; it's your task to put it into a proper template
    • It's a good idea to use a catch-all route, to be added last
@app.route('/<path:path>/')
def page(path):
    page = pages.get_or_404(path)
    template = page.meta.get('template', 'flatpage.html')
    return render_template(template, page=page)
  • In a template, you can also access things like page.title, to be fetched from the YAML block at the top of the file

Returning JSON ⌘

  • Use the jsonify helper
from flask import jsonify

# in a view:
    return jsonify(some_object)
  • The helper in this case sets the correct MIME type (application/json)

Returning redirects ⌘

from flask import redirect

# in a view:
    return redirect(some_url)

Returning custom headers and error pages ⌘

  • You can return a Response object from your view
    • Make it with make_response()
    • Set response.headers['...'], mimetype and/or status_code
  • Alternatively, return a tuple of response_string, status, headers from a view function
    • Or even response_string, status_or_headers and Flask will guess what you mean
  • To return a pre-designed HTTP error page (e.g. a 404): abort(404)
  • To customize the 404 page:
from flask import render_template

@app.errorhandler(404)
def page_not_found(e):
    return render_template('404.html'), 404

Application context ⌘

  • WSGI interface is composable, so there may be more than one app
    • This is an advanced use case, we will not do that
    • Anyway, in a particular thread at a given moment, at most one is active
  • It is sometimes too cumbersome to pass the application everywhere
    • Solution: use the current_app context-local variable
    • It does not work if no application is active
      • Workaround: with app.app_context(): current_app.do_something()
      • This pushes the application context onto the stack, does something, then pops it from the stack
      • This is only a problem that manifests itself in code outside view functions
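A short sketch of the behaviour described above. The GREETING config key and the read_greeting helper are made up for this illustration.

```python
from flask import Flask, current_app

app = Flask(__name__)
app.config['GREETING'] = 'hello'


def read_greeting():
    # Library-style code: it does not receive the app, it uses current_app
    return current_app.config['GREETING']


# Outside of any context, current_app cannot be resolved
try:
    read_greeting()
    outside_result = 'worked'
except RuntimeError:
    outside_result = 'no active application'

# Inside an explicitly pushed application context, it works
with app.app_context():
    inside_result = read_greeting()
```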

Request context ⌘

  • Same story: due to internal redirects, there may be more than one request
    • This is also an advanced use case
    • Anyway, in a particular thread at a given moment, at most one request is active
  • Flask keeps the current request in the request context-local variable
    • It is sometimes useful to create a test request and push it down the stack

What's in the request object ⌘

  • method: the HTTP method, e.g. "GET" or "POST"
  • args: stuff after the question mark in the URL, parsed into something like a dictionary
  • form: same for POST variables (application/x-www-form-urlencoded or multipart/form-data)
    • These two are also combined into values
  • files: files uploaded via the form
  • cookies: incoming cookies
  • headers: HTTP headers
  • get_json(): here is how to convert the incoming application/json data into Python objects

On duplicate parameters: MultiDict ⌘

http://example.com/something?i=1&foo=bar&foo=baz

  • This is valid
  • request.args['foo'] takes the first value, i.e. bar
  • To get them all as a list, use request.args.getlist('foo')
    • No foo? Get an empty list
    • request.args['does_not_exist'] still raises a KeyError
  • A nice feature is that you can do type conversions like this: request.args.get('i', type=int)
    • A default value can also be passed
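The calls above can be tried without a running server by pushing a test request context. A minimal sketch; the page parameter is a made-up example of a missing argument with a default.

```python
from flask import Flask, request

app = Flask(__name__)

# Build a fake request for the URL from the slide and make it the current one
with app.test_request_context('/something?i=1&foo=bar&foo=baz'):
    first_foo = request.args['foo']                   # first value only
    all_foo = request.args.getlist('foo')             # every value
    no_such = request.args.getlist('does_not_exist')  # empty list, no KeyError
    i_value = request.args.get('i', type=int)         # converted to int
    page = request.args.get('page', 1, type=int)      # missing: default is used
```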

Validating requests ⌘

  • Of course it is possible to validate request.form or request.args by hand
  • A better alternative is to use a form validation library
    • Common choice: WTForms, via the Flask-WTF extension
  • Tasks simplified by WTForms:
    • Conversion between strings and native data types, both ways
      • E.g. sometimes you need to display something in a form for editing
    • Display of common HTML5 input-like tags
    • Arbitrary per-field or whole-form validation
    • Displaying useful error messages near the invalid field

Flask-WTF boilerplate ⌘

from flask_wtf import FlaskForm
from wtforms.fields import *
from wtforms.validators import *

# Some people prefer this:
# from flask_wtf import FlaskForm as BaseForm

# Then use FlaskForm (or BaseForm) as a base class for your forms, e.g.:

class MessageForm(FlaskForm):
    recipients = StringField('Recipients',
                             validators=[InputRequired(message='No recipients provided')])
    message = TextAreaField('Your message',
                             validators=[InputRequired(message='Message cannot be empty')])
    submit = SubmitField('Submit')

WTForms field types ⌘

  • Obvious: StringField, IntegerField, FloatField (rarely used), DecimalField
    • These render as <input type="text">
    • On form submission, they coerce the string into the correct type
  • Special rendering but nothing else: HiddenField, PasswordField, TextAreaField
  • Choice-related: SelectField, RadioField, SelectMultipleField
    • They accept the choices=... keyword argument, and render as the corresponding HTML5 tags
  • There are also FileField, DateField, DateTimeField, SubmitField, and others
  • You can create custom fields, too
    • Out of scope for this training

Typical structure of a view with a form ⌘

  • Typical usage pattern with Flask-WTF:
@app.route('/submit', methods=('GET', 'POST'))
def submit():
    form = MyForm()
    if form.validate_on_submit():
        do_something()
        return redirect('/success')
    return render_template('submit.html', form=form)
  • The validate_on_submit method comes from Flask-WTF, not wtforms!
  • If you need to pass form data from sources other than request.form
    • Pass it as the formdata argument to the form constructor (it must behave like a MultiDict)
    • Maybe disable CSRF validation via keyword argument: meta={'csrf': False}
  • If you want to edit some ORM object, pass it as obj keyword parameter to the form constructor
    • It will be checked for matching attributes if there is nothing POSTed

How WTForms validates the form ⌘

  • It calls all field validators until one raises StopValidation
    • This includes validators passed during field construction, and the validate_fieldname() method on the form
  • If you need form-level validation, override the validate method
    • The real question without a good answer is where to put the resulting errors

How to get the data and errors from the form ⌘

  • Many ways to access validated data:
    • form.my_field.data - None if there were errors
    • form.data - expensive, generated each time you access it
    • form.populate_obj(some_object) - may be useful with ORMs
  • And get errors:
    • form.my_field.errors - this is a list of messages
    • form.errors - this is a dictionary which contains only fields with errors (name => list of messages)

WTForms built-in validators ⌘

  • Most validators accept a message keyword argument that overrides the default error text
  • Optional: put it first if an empty field should not be checked by other validators
  • InputRequired: fails if the user entered nothing
  • DataRequired: fails if the user did not enter anything that can be coerced into a truthy value of the target type
    • Almost never useful - bans 0, 0.0 and similar values for integer and decimal fields
  • Length, NumberRange
  • Regexp, Email, URL
  • AnyOf, NoneOf
  • There are others
  • You can create custom validators
    • Look at the source, create classes in the same way, raise ValidationError when needed

How to display form fields ⌘

  • Just mention them like this: {{ form.myfield }}
  • Or, call them to set additional HTML attributes: {{ form.myfield(class_="fancy") }}
  • There is also a pre-made label for each field: {{ form.myfield.label }}
    • This is for the use case when the label is near the field's input tag
    • The other possibility offered by the HTML standard is to make the input a subelement of the label
      • WTForms doesn't do this, but you can hand-code it
  • You should be able to display errors based on form.myfield.errors
  • The form itself does not have an __html__ method
    • But you can iterate over it to get the fields
  • Don't forget {{ form.hidden_tag() }} (comes from Flask-WTF, for CSRF protection)

Dealing with file uploads ⌘

  • Werkzeug has request.files which is a MultiDict of FileStorage objects
    • You can use it directly
    • Useful properties: filename, content_type, content_length
    • You can call a save method to save it to a file
    • Or you can access it via the stream property (either an in-memory representation or a temporary file)
  • WTForms don't handle file uploads
  • Flask-WTF has FileField
    • form.my_file_field.data refers to Werkzeug's FileStorage

File upload security ⌘

  • Hackers can provide any filename
    • Including ../../../../../tmp/hacked.txt!
    • Or evil.jpg.php
    • Be careful
    • Whitelist allowed mime types and extensions
    • Check filenames
      • werkzeug.utils.secure_filename strips out the bad stuff but can return an empty or non-unique filename
    • Always rename the uploaded file to something safe and unique
    • Limit the maximum file size
  • The Flask-Uploads extension can help you with these tasks
    • It also supports serving uploaded files

Configuration system ⌘

  • app.config looks like a dictionary
  • You can also update it from various sources
    • app.config.from_object
      • Imports the module provided as an argument, takes everything that has all-uppercase names
    • app.config.from_envvar
      • Loads the file whose name is stored in the environment variable named by the argument
      • it should contain lines like DEBUG = False
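A small sketch of from_object. It also accepts a class, which keeps the example self-contained; the DevelopmentConfig class and its values are invented for this illustration.

```python
from flask import Flask


class DevelopmentConfig(object):
    # Only all-uppercase names are copied into app.config
    DEBUG = True
    SECRET_KEY = 'dev-only-not-a-real-secret'
    lowercase_is_ignored = 'not copied'


app = Flask(__name__)
app.config.from_object(DevelopmentConfig)
```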

Mandatory configuration ⌘

  • SECRET_KEY = 'some long string that nobody can guess'
    • Used for secure cookies in sessions

Handling non-GET requests ⌘

  • By default, @app.route(...) registers only the handler for GET requests
    • Add the methods=("GET", "POST") keyword argument to override the default
    • GET implies HEAD
    • OPTIONS is always implied and, if necessary, routed to the internal handler
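The defaults above can be checked with the test client. A minimal sketch with made-up route names and a made-up form field called name:

```python
from flask import Flask, request

app = Flask(__name__)


@app.route('/get-only')
def get_only():
    # No methods argument: GET (plus the implied HEAD and OPTIONS) only
    return u'read-only'


@app.route('/echo', methods=('GET', 'POST'))
def echo():
    if request.method == 'POST':
        return u'posted: %s' % request.form.get('name', u'')
    return u'send a POST'
```

A POST to /get-only is answered with 405 Method Not Allowed, while /echo accepts it.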

Session ⌘

  • A way to store some data between requests made by the same client
    • Just use flask.session as a dict that can only hold serializable values
  • Flask session is, by default, based on signed cookies
    • Cookies are signed using the secret key, so keep it non-guessable
  • Limitations:
    • You cannot store secrets there (a client can do URL-safe base64 decoding and zlib decompression to get the data)
    • You cannot store too much data
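To see why secrets do not belong in the session, the sketch below reads a session cookie without knowing the secret key. It assumes Python 3 and the default cookie-based session; the cookie layout used here (dot-separated payload, timestamp and signature, with a leading dot marking compression) is an implementation detail that may change between versions. The decode_session_cookie helper and the /login route are made up for this illustration.

```python
import base64
import json
import zlib

from flask import Flask, session

app = Flask(__name__)
app.secret_key = 'dev-only-not-a-real-secret'


@app.route('/login')
def login():
    session['user'] = 'alice'
    return u'ok'


def decode_session_cookie(value):
    """Hypothetical helper: read the payload, ignoring the signature."""
    compressed = value.startswith('.')
    if compressed:
        value = value[1:]
    payload = value.split('.')[0]
    payload += '=' * (-len(payload) % 4)  # restore stripped base64 padding
    raw = base64.urlsafe_b64decode(payload)
    if compressed:
        raw = zlib.decompress(raw)
    return json.loads(raw.decode('utf-8'))
```

The signature only prevents the client from modifying the data; it does not hide it.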

Flashing messages ⌘

  • Flashing a message means storing it at the end of the request so that it can be displayed at the next request
  • Here is how to flash a message:
from flask import flash

...
# in a view function
flash('A message', 'error')
  • To get flashed messages, use the get_flashed_messages() function in a template. Something like this:
{% with messages = get_flashed_messages(with_categories=True) %}
  {% if messages %}
    <ul class=flashes>
    {% for category, message in messages %}
      <li class="{{ category }}">{{ message }}</li>
    {% endfor %}
    </ul>
  {% endif %}
{% endwith %}
  • The names of message severities do not match those used by Bootstrap
    • Flask-Bootstrap contains a utility to display them, too:
{% import "bootstrap/utils.html" as utils %}
{% with messages = get_flashed_messages(with_categories=True) %}
  {% if messages %}
    <div class="row">
      <div class="col-md-12">
        {{ utils.flashed_messages(messages) }}
      </div>
    </div>
  {% endif %}
{% endwith %}

How to structure a Flask application ⌘

  • Ugly and fragile circular imports are a common issue
  • The official documentation dismisses them as something acceptable, but in fact they are avoidable
  • In app.py, create the application and populate its config
    • Do nothing else
  • In models.py, initialize the ORM-related extension and describe models
    • You will need to import app, or use init_app() later
  • If you want to use Flask-Admin for the administrative interface, do everything related in admin.py
    • You will need to import app and all models
  • In forms.py, describe the forms
    • A common way to do so is to use wtforms via Flask-WTF
      • As with any Flask extension, you need to import the app or use init_app() later
    • Some forms will be based on models, so you need to import them
  • In other files, do other things needed by your application
  • Collect everything in main.py, initialize all extensions
    • And then point "flask run" to it

Signals ⌘

  • Signals are a way to decouple policy from mechanism
    • Policy: what should be done when a user registers?
    • Mechanism: what does it mean that the user registered?
  • Signals are available when the blinker library is installed
import blinker
signals = blinker.Namespace()
user_registered = signals.signal("user-registered")
  • How to emit:
# current_app is the proxy imported from flask; send the real app object behind it
user_registered.send(current_app._get_current_object(),
                     user=user, confirm_token=token)
  • How to subscribe:
def my_callback(sender, user, confirm_token, **extra):
    pass

user_registered.connect(my_callback)
  • Flask has some built-in signals like template_rendered, request_started
    • They are useful for profiling and tracing

Decorator-based hooks ⌘

  • Flask has an ability to register functions that run at certain stages of request processing
    • Unlike signals, they can modify the request or response, or even prevent the normal view function from running
    • See after_request(), before_first_request(), before_request(), errorhandler(), teardown_appcontext(), teardown_request()
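A minimal sketch of two of these hooks. The X-Banned request header is a made-up convention used only to show that a before_request function can stop the view from running.

```python
from flask import Flask, abort, request

app = Flask(__name__)


@app.before_request
def reject_banned_clients():
    # Aborting here means the view function never runs
    if request.headers.get('X-Banned') == 'yes':
        abort(403)


@app.after_request
def add_security_header(response):
    # after_request functions receive the response and must return it
    response.headers['X-Frame-Options'] = 'DENY'
    return response


@app.route('/')
def index():
    return u'ok'
```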

Blueprints ⌘

  • Logical parts of the app
    • Can contain view functions, templates, static files
  • If you think "let's convert my views.py file into a package", better think about blueprints
from flask import Blueprint

bp = Blueprint('captcha', __name__)  # and possibly other keyword arguments

# Then use it like an app. E.g.:
@bp.route('/captcha.jpg')
def captcha():
    pass

# Any setup code goes here
def bp_setup(state):
    app = state.app
    blueprint = state.blueprint
    # ...

bp.record_once(bp_setup)
  • You have to register your blueprints on an app
from captcha import bp as captcha_bp

app.register_blueprint(captcha_bp)

MVC ⌘

  • Model, View, Controller
    • A common pattern for application design
    • Or maybe just a buzzword that only indicates that Models, Views and Controllers exist, with some separation of concerns
    • Flask is not an MVC framework, but your application can still be based on this pattern
  • Wikipedia's articles on MVC, MVP (Model, View, Presenter) and similar software architectures are written with classical GUI-based software in mind
    • The English Wikipedia MVC interaction diagram (below) doesn't apply to web frameworks
    • Non-English editions of Wikipedia even have different interaction diagrams!

[Figures: MVC interaction diagrams as described on the English, German and French Wikipedia]

MVC in web frameworks: Model ⌘

  • Model:
    • Has business-domain knowledge (both logic and data)
    • Provides API that can be used by the controller to read state and submit updates
    • Does not contain presentation logic
    • Note: in many other ORMs and web frameworks, a model is only an abstraction over data storage (such as a database) plus a validation mechanism, i.e. there is no emphasis on an API that makes sense in the business problem domain.
  • With Flask, the model layer is to be supplied by you, the developer
    • You can base it on SQLAlchemy for data persistence

MVC in web frameworks: View ⌘

  • View:
    • Presents data to the user
    • Usually as HTML
      • Data can come from the model and from the controller
  • Templates!

MVC in web frameworks: Controller ⌘

  • Controller:
    • Reacts to user actions (in the form of HTTP requests)
    • Decides what to do
    • Updates model state, or queries state from the model
    • Asks the appropriate view to render itself, supplies the necessary data
  • Frequently advocated approach: fat models, skinny controllers
  • View functions!

Database topics ⌘

  • Making SQL queries from Python
  • Ways to avoid SQL injection
  • SQLAlchemy vs Peewee
  • Defining tables
  • Selecting data
  • Joins
  • Inserting, updating and deleting data
  • Transactions

Making SQL queries from Python ⌘

  • DB API 2.0 specification
    • A standard which database-interfacing modules follow
    • You can connect to the database, issue queries, get results
    • There is a mechanism for using prepared queries and substitution of parameters
import sqlite3
db = sqlite3.connect('test.sqlite')
cur = db.cursor()
cur.execute('SELECT id, dname FROM dept WHERE loc = ?', ('Berlin',))
rows = list(cur)
db.commit()

SQL injection ⌘

  • A serious vulnerability
  • Appears when an application constructs an SQL query dynamically by concatenating fixed strings with attacker-provided data
  • By using special characters (like ') inside their data, an attacker can trick the SQL database into interpreting their data as additional SQL queries
    • Can bypass access restrictions, exfiltrate data, make unauthorized changes

How to avoid SQL injection ⌘

  • Answer from 1990s: forbid bad characters in user input
    • Bad: how would you register Catherine Anne O'Hara as a user?
  • Answer from 2000s: "escape" user data before concatenation so that special characters are properly interpreted as data, not as SQL syntax
    • Bad: too hard to keep track of what's escaped and what's not
    • Too easy to forget to escape data
  • Answer from 2010s: don't build queries by concatenating fixed strings with user-provided data
    • "Prepared Statement" APIs exist that clearly separate SQL from external data in queries
      • Usually (but not always), they stay separate on the wire. If the database doesn't provide this option, the client library will escape and concatenate strings as necessary.

How to avoid SQL injection, once again ⌘

Don't:

cur.execute("SELECT id, dname FROM dept WHERE loc = '%s'" % (city,))

Do:

cur.execute('SELECT id, dname FROM dept WHERE loc = ?', (city,))

Or just let your favorite ORM generate queries for you.

Why we need ORMs ⌘

  • SQLite on dev, PostgreSQL in production
    • Slight differences in SQL syntax
    • E.g. here is how to create a suitable id field in SQL:
      • SQLite: id INTEGER PRIMARY KEY (autoincrement is optional, and the AUTOINCREMENT keyword, if present, must be placed last)
      • MySQL: id INTEGER AUTO_INCREMENT PRIMARY KEY
      • PostgreSQL: id SERIAL PRIMARY KEY
  • Construct queries programmatically
    • Important if a user should be able to e.g. filter or sort records on multiple criteria
    • "IN" queries are just cumbersome with raw DB API
  • Access result rows in object-oriented way
    • There is an impedance mismatch between object-oriented and relational world
    • Conversion is trivial for integers and strings, but not for relations
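
To see why "IN" queries are cumbersome with the raw DB API, consider this sketch. The depts_in helper and the sample rows are hypothetical; the point is that the placeholder list has to be generated by hand, one "?" per value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (id INTEGER PRIMARY KEY, dname TEXT, loc TEXT)")
conn.executemany(
    "INSERT INTO dept (dname, loc) VALUES (?, ?)",
    [("Sales", "Berlin"), ("Research", "Paris"), ("Support", "Madrid")],
)


def depts_in(conn, cities):
    """Return department names located in any of the given cities."""
    if not cities:
        return []          # "IN ()" is not portable SQL, so handle it here
    # One "?" per value; only placeholders are generated, never data.
    placeholders = ", ".join("?" for _ in cities)
    sql = "SELECT dname FROM dept WHERE loc IN (%s) ORDER BY dname" % placeholders
    return [row[0] for row in conn.execute(sql, list(cities))]


names = depts_in(conn, ["Berlin", "Madrid"])
print(names)
```

Note that string formatting is used here only to build the run of question marks; the city names themselves are still passed as parameters.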

Relations ⌘

  • 1:1: don't do this, just put both sides into the same table
    • If you must (because of physical storage considerations), just use the same primary key and write helpers or joins to retrieve the other side
  • 1:many: add a non-NULL reference field to the "many" side that references the "1" side
    • The ORM will convert this into a maybe-lazily-loaded collection on the "1" side and into a link to the other object on the "many" side
  • 1:0-1: add a non-NULL unique reference field to the "0-1" side that references the "1" side
  • many:many: use an intermediate table that has two non-NULL fields that reference both sides
    • ORM will represent the other side as a collection
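
The relation types above could be expressed in plain SQL roughly as follows. This is a sketch using SQLite; the profile, tag and article_tag tables are hypothetical names chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE user    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- 1:many: each article references exactly one user
CREATE TABLE article (id INTEGER PRIMARY KEY,
                      author_id INTEGER NOT NULL REFERENCES user(id));
-- 1:0-1: UNIQUE allows at most one profile per user
CREATE TABLE profile (id INTEGER PRIMARY KEY,
                      user_id INTEGER NOT NULL UNIQUE REFERENCES user(id));
-- many:many: the intermediate table references both sides
CREATE TABLE tag         (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE article_tag (article_id INTEGER NOT NULL REFERENCES article(id),
                          tag_id     INTEGER NOT NULL REFERENCES tag(id),
                          PRIMARY KEY (article_id, tag_id));
""")

conn.execute("INSERT INTO user (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO profile (user_id) VALUES (1)")

try:
    conn.execute("INSERT INTO profile (user_id) VALUES (1)")   # second profile for user 1
    second_profile_allowed = True
except sqlite3.IntegrityError:
    second_profile_allowed = False

try:
    conn.execute("INSERT INTO article (author_id) VALUES (999)")   # no such user
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

print(second_profile_allowed, orphan_allowed)
```

Both invalid inserts are rejected by the database itself, which is the point of declaring the constraints in the schema.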

Normalization ⌘

  • Goal: make sure that the very structure of the database makes it impossible to have inconsistent data
    • If you don't update your data, denormalization may be OK
  • Traditional theory of relational databases doesn't really accept IDs
  • Always think "if there is no ID, what else could serve as a key?"
  • Useful rule of thumb, approximately equivalent to BCNF (also known as 3.5NF):
    • Each attribute must provide a fact about the key (1NF), the whole key (2NF), and nothing but the key (BCNF).
      • Both 2NF and 3NF are concerned equally with all candidate keys of a table and not just any one key.
      • So help me Codd!
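
A small sketch of what "inconsistent data" means in practice. The emp_flat, emp and dept tables are hypothetical; natural keys are used on purpose, in the spirit of asking what could serve as a key if there were no ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized: loc is a fact about the department, not about the employee
CREATE TABLE emp_flat (ename TEXT PRIMARY KEY, dname TEXT, loc TEXT);
INSERT INTO emp_flat VALUES ('Ann', 'Sales', 'Berlin'), ('Bob', 'Sales', 'Berlin');

-- Normalized: the location is stored once, keyed by the department
CREATE TABLE dept (dname TEXT PRIMARY KEY, loc TEXT);
CREATE TABLE emp  (ename TEXT PRIMARY KEY, dname TEXT NOT NULL REFERENCES dept(dname));
INSERT INTO dept VALUES ('Sales', 'Berlin');
INSERT INTO emp  VALUES ('Ann', 'Sales'), ('Bob', 'Sales');
""")

# A partial update is enough to make the flat table contradict itself.
conn.execute("UPDATE emp_flat SET loc = 'Paris' WHERE ename = 'Ann'")
flat_locs = conn.execute(
    "SELECT COUNT(DISTINCT loc) FROM emp_flat WHERE dname = 'Sales'").fetchone()[0]

# In the normalized schema there is only one place to update.
conn.execute("UPDATE dept SET loc = 'Paris' WHERE dname = 'Sales'")
norm_locs = conn.execute(
    "SELECT COUNT(DISTINCT d.loc) FROM emp e JOIN dept d ON d.dname = e.dname").fetchone()[0]

print(flat_locs, norm_locs)
```

After the updates, the flat table claims two different locations for the same department, while the normalized schema cannot express that contradiction.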

Common ORMs ⌘

  • SQLAlchemy: the most powerful one
    • Adopts the Unit of Work pattern
    • The application sometimes becomes a bit too verbose
  • Peewee: the lightweight one
    • Adopts the Active Record pattern
    • Much easier to learn, but has limitations
  • There are others but they are not commonly used with Flask

How to learn an ORM ⌘

  • Describing tables
    • Migration tools
  • Selecting by primary key
  • Selecting by other attributes
  • Joining tables
  • Inserting, updating and deleting records
    • Mass updates and deletes
  • Data validation
  • Transactions

Flask-SQLAlchemy: boilerplate ⌘

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:////tmp/test.db'
db = SQLAlchemy(app)

# Now use db.Model as a base class for your models
# also db.* contains everything from sqlalchemy and sqlalchemy.orm

How to describe database tables with SQLAlchemy ⌘

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    # ...

class Article(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    text = db.Column(db.Text)
    date = db.Column(db.DateTime)
    author_id = db.Column(db.Integer, db.ForeignKey('user.id'))
    author = db.relationship('User', backref='articles')

SQLAlchemy session ⌘

  • With SQLAlchemy, you need to construct relationships between objects by setting the attributes that were declared as db.relationship, not by manipulating IDs
  • The objects stay in the session (db.session)
    • Newly constructed objects have to be added there first
  • When you flush the session, the objects are saved to the database and get the IDs
    • Other database clients will see your records when you commit the transaction
    • flush != commit. Session flushing and transactions are different concepts
>>> db.create_all()
>>> u = User()
>>> u.id
>>> a = Article()
>>> a.text = 'Something new'
>>> a.author = u
>>> db.session.add(u)
>>> db.session.add(a)
>>> db.session.flush()
>>> u.id
1
>>> a.id
1
>>> a.author_id
1
>>> db.session.commit()
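
The difference between flushing and committing can also be seen by rolling back after a flush. The sketch below assumes plain SQLAlchemy 1.4 or newer (not Flask-SQLAlchemy), an in-memory SQLite database, and a hypothetical Note model.

```python
from sqlalchemy import Column, Integer, Text, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

engine = create_engine("sqlite://")   # in-memory database
Base = declarative_base()


class Note(Base):
    __tablename__ = "note"
    id = Column(Integer, primary_key=True)
    text = Column(Text)


Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

n = Note(text="draft")
session.add(n)
id_before_flush = n.id     # None: nothing has been sent to the database yet

session.flush()            # INSERT is issued inside the open transaction
flushed_id = n.id          # the database has assigned a primary key

session.rollback()         # the transaction is abandoned, the row is gone
remaining = session.query(Note).count()

print(id_before_flush, flushed_id, remaining)
```

The object received its ID at flush time, yet nothing survived the rollback: the flush only sent the INSERT, and it was the transaction that decided whether the row stays.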

Querying the database ⌘

>>> # get by id
>>> a = Article.query.get(1)
>>> a.text
'Something new'
>>> # get by something else - note the first() to get only one item instead of a collection
>>> b = Article.query.filter(Article.date == None).first()
>>> b.id
1

Performing joins ⌘

  • Here is how:
Article.query.join(User).filter(...)
  • The result is still a collection of Articles
    • The benefit is that you can filter on User-related conditions
    • SQLAlchemy will figure out the join condition automatically in most cases

Aggregation ⌘

  • Counting records: Article.query.filter(...).count()
  • Getting min, max, sum, ...:
from sqlalchemy.sql import func
db.session.query(func.max(Article.date)).filter(...).scalar()
  • There are special constructions for grouping, too
  • It is hard to come up with a query that cannot be expressed using SQLAlchemy
    • But this is not a course on SQLAlchemy, so please learn these advanced topics yourself when needed

Updating and deleting records ⌘

  • If you have the record:
    • db.session.delete(record) (or update fields as needed) and then db.session.flush(); the change becomes permanent on db.session.commit()
  • If you want mass-update or mass-delete:
    • Article.query.filter(...).delete() or .update(...)
    • This bypasses the session, use with care

Peewee ORM: boilerplate with Flask-Peewee ⌘

from flask import Flask
from flask_peewee.db import Database

app = Flask(__name__)

app.config['DATABASE'] = {
    'name': '/tmp/test.db',
    'engine': 'peewee.SqliteDatabase',
}

db = Database(app)

# Now use db.Model as a base class for your models

Peewee ORM: raw boilerplate ⌘

from flask import Flask
from peewee import *

app = Flask(__name__)
database = SqliteDatabase('/tmp/test.db')

# This hook ensures that a connection is opened to handle any queries
# generated by the request.
@app.before_request
def _db_connect():
    database.connect()

# This hook ensures that the connection is closed when we've finished
# processing the request.
@app.teardown_request
def _db_close(exc):
    if not database.is_closed():
        database.close()

# A common pattern is to create a base model class
# so that all models know that they belong to this database
class BaseModel(Model):
    class Meta:
        database = database

How to describe database tables with Peewee ⌘

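import datetime
from peewee import DateTimeField, ForeignKeyField, TextField

class User(db.Model):
    # With Peewee, id is implicit
    ...

class Article(db.Model):
    text = TextField()
    date = DateTimeField(default=datetime.datetime.now)
    # Note: you don't have to describe the column and the relationship separately
    # (related_name is the Peewee 2.x spelling; Peewee 3 renamed it to backref)
    author = ForeignKeyField(User, related_name='articles')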

Active Record pattern ⌘

  • Peewee does not use Unit of Work and does not have a Session
  • Instead, each model object knows (via Meta) which database it belongs to
  • To persist it, call .save() on it
    • Unlike SQLAlchemy, you have to track dependencies manually
    • By default, an INSERT will be generated if the id is not set, and an UPDATE if id is known
  • To delete it, call .delete_instance() on it
    • Don't confuse it with .delete(), which constructs a query that deletes all records matching a condition from the table!
  • By default, autocommit is True
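
To make the pattern concrete, here is a toy Active Record class built directly on sqlite3. It is not how Peewee is implemented; the Article class and its methods are a hypothetical illustration of the INSERT-or-UPDATE rule and of delete_instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, text TEXT)")


class Article:
    """Toy Active Record: the object itself knows how to persist itself."""

    def __init__(self, text, id=None):
        self.id = id
        self.text = text

    def save(self):
        if self.id is None:
            # No id yet: INSERT and remember the generated key
            cur = conn.execute("INSERT INTO article (text) VALUES (?)", (self.text,))
            self.id = cur.lastrowid
        else:
            # Known id: UPDATE the existing row
            conn.execute("UPDATE article SET text = ? WHERE id = ?", (self.text, self.id))
        conn.commit()   # mimics autocommit

    def delete_instance(self):
        conn.execute("DELETE FROM article WHERE id = ?", (self.id,))
        conn.commit()


a = Article("Something new")
a.save()            # INSERT
a.text = "Edited"
a.save()            # UPDATE, still one row
rows = conn.execute("SELECT id, text FROM article").fetchall()
a.delete_instance()
count_after_delete = conn.execute("SELECT COUNT(*) FROM article").fetchone()[0]
print(rows, count_after_delete)
```

There is no session to collect pending changes: every save() goes straight to the database, so the order of calls is up to the caller.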

Querying the database ⌘

>>> # get by id
>>> a = Article.get(id=1)
>>> a.text
'Something new'
>>> # get by something else - note the first() to get only one item instead of a collection
>>> b = Article.select().where(Article.date == None).first()
>>> b.id
1

Joins ⌘

  • With Peewee, joins are as easy as with SQLAlchemy
  • And they work exactly the same
Article.select().join(User).where(...)

Aggregation ⌘

  • Counting records: User.select().count()
  • Finding minimum or maximum: Article.select(fn.max(Article.date)).where(...).scalar()

Database evolution ⌘

  • It's a fact that fields and tables are added, removed, or otherwise modified
    • One can talk about schema migrations and data migrations
  • You can do it by hand, or by a script (i.e. create a migration)
  • SQLAlchemy: no built-in support for migrations
    • Frequently used with Alembic, maybe via Flask-Migrate
      • Schema migrations are auto-generated, but can be edited
      • Data migrations are possible but only documented in external blogs
    • Alternative: SQLAlchemy-migrate
      • Used e.g. in OpenStack
      • No Flask extension
  • Peewee: some built-in schema migrations but no built-in tracking of what has been applied
    • No clear winner among tracker tools
    • peewee-db-evolve is an interesting idea: fully automated schema migrations that introspect the current schema and diff it against the models
      • No Flask extension exists, and none is needed
  • Migration tools don't really work with SQLite due to incomplete support for ALTER TABLE
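
What "tracking of what has been applied" means can be sketched in a few lines. This is not the mechanism of Alembic or of any Peewee tool; the MIGRATIONS list, the applied_migration table and the migrate function are hypothetical.

```python
import sqlite3

# Hypothetical migrations: (name, SQL). Real tools store these in files.
MIGRATIONS = [
    ("0001_create_article", "CREATE TABLE article (id INTEGER PRIMARY KEY, text TEXT)"),
    ("0002_add_date", "ALTER TABLE article ADD COLUMN date TEXT"),
]


def migrate(conn):
    """Apply migrations not yet recorded in the tracking table; return their names."""
    conn.execute("CREATE TABLE IF NOT EXISTS applied_migration (name TEXT PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT name FROM applied_migration")}
    applied = []
    for name, sql in MIGRATIONS:
        if name in done:
            continue
        conn.execute(sql)
        conn.execute("INSERT INTO applied_migration (name) VALUES (?)", (name,))
        conn.commit()
        applied.append(name)
    return applied


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)
second_run = migrate(conn)   # nothing left to do
columns = [row[1] for row in conn.execute("PRAGMA table_info(article)")]
print(first_run, second_run, columns)
```

Running the function twice is safe because the tracking table remembers which steps were already applied.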

Model-based forms ⌘

  • Sometimes it is a good idea to base a form on a set of fields from a model, plus or minus some exceptions
    • The most secure practice is to whitelist fields
  • For SQLAlchemy models, use WTForms-Alchemy
from wtforms_alchemy import ModelForm
from myproject.myapp.models import User

class UserForm(ModelForm):
    class Meta:
        model = User
        only = ['first_name', 'last_name', 'email']
  • Peewee support is provided via the wtfpeewee package
from wtfpeewee.orm import model_form
from myproject.myapp.models import User
UserForm = model_form(User, only=('first_name', 'last_name', 'email'))

Admin interface ⌘

  • Smart idea: create a model_form from each model, and expose via dynamically-generated views
    • That's what Flask-Admin does
    • It also has a component for managing files on disk
    • It's your responsibility to properly restrict access to the admin interface
  • Boilerplate (without any security):
from flask import Flask
from flask_admin import Admin

from flask_admin.contrib.sqla import ModelView
# or: from flask_admin.contrib.peewee import ModelView

app = Flask(__name__)

admin = Admin(app)

admin.add_view(ModelView(User, db.session))
admin.add_view(ModelView(Article, db.session))
# Peewee: there is no session
  • For authentication, subclass ModelView and define the is_accessible method which should return a boolean

Security-related extensions ⌘

  • Flask-Login
    • The most basic one
      • Tracks the currently logged-in user, stores it in the session
      • Allows you to protect some views with the @login_required decorator
    • You have to implement the login procedure (e.g. the password check against your models) yourself
  • Flask-Security
    • Implements login, registration, password change, password recovery via email
      • In a way that may or may not be suitable for you - but allows some customization
    • Follows best practice regarding password storage
    • Supports both SQLAlchemy and Peewee
  • Flask-User
    • Was written because the author had found Flask-Security too difficult to customize
    • Supports only SQLAlchemy out of the box
  • There are also packages for the use case where the authentication data is not stored in models
    • Flask-SimpleLDAP, Flask-LDAP-Login, Flask-CAS, Flask-Social, ...
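
If you implement the password check yourself (as Flask-Login requires), the passwords should be stored as salted, slow hashes. The sketch below uses only the standard library; the function names and the iteration count are assumptions, and a maintained password-hashing library would normally be preferred in a real project.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000   # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for storage; a fresh random salt per user."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, expected_digest):
    """Recompute the digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse")
ok = check_password("correct horse", salt, stored)
bad = check_password("wrong guess", salt, stored)
print(ok, bad)
```

Only the salt and the digest would be stored in the user model; the plain-text password never is.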

What's wrong with Flask-Security ⌘

  • User enumeration vulnerability open since 2015
  • Insists too much on users having emails
    • Possible, but non-trivial, to overcome
  • Implements a policy of sending emails on events like password change
    • This is configurable via app.config, with the default being on
    • It would have been better if Flask-Security just provided a signal that it triggers on such events
  • Possibly implements something else that doesn't match your requirements
    • Test it thoroughly

How to protect the entire application with login ⌘

  • Look at how the login_required decorator in Flask-Login works to understand the code below
from flask import current_app, request
from flask_login import current_user

@app.before_request
def check_valid_login():
    login_valid = current_user.is_authenticated
    # request.endpoint is None when the URL matches no route
    page_is_public = request.endpoint and request.endpoint in ('static', 'security.login')
    if not (login_valid or page_is_public):
        return current_app.login_manager.unauthorized()

Optimizing static assets ⌘

  • Client-side caching is an important loading-time optimization
  • The client is told to cache all JS and CSS files by setting the Expires: header to point far into the future
  • Problem: what if the file changes?
    • Solution: don't reuse the URL
    • Include some version number in it
      • See how Flask-Bootstrap does it
  • Concatenate and minify your JS and CSS files
    • This avoids issuing multiple requests
    • Flask-Assets can do it for you
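
One way to avoid reusing the URL is to derive the version from the file content. The versioned_url helper below is a hypothetical sketch, not the mechanism used by Flask-Bootstrap or Flask-Assets.

```python
import hashlib


def versioned_url(path, content):
    """Append a short content hash so the URL changes whenever the file does."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    return "%s?v=%s" % (path, digest)


old = versioned_url("/static/app.css", b"body { color: black; }")
new = versioned_url("/static/app.css", b"body { color: navy; }")
same = versioned_url("/static/app.css", b"body { color: black; }")
print(old != new, old == same)
```

An unchanged file keeps its URL and stays cached, while any edit produces a new URL that the browser has never seen.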

Sending email ⌘

  • Of course you can use the email and smtplib modules manually
  • But then some common tasks arise
    • Central configuration of SMTP server credentials
    • Suppressing emails centrally during unit tests or on developers' machines
  • Solution: Flask-Mail
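
To show what the manual approach involves, here is a sketch with the standard library only. MAIL_CONFIG, outbox and send_mail are hypothetical names; they illustrate central configuration and suppression, not the Flask-Mail API.

```python
from email.message import EmailMessage
import smtplib

# Hypothetical central configuration; a Flask app would keep this in app.config.
MAIL_CONFIG = {"server": "localhost", "port": 25, "sender": "noreply@example.com",
               "suppress": True}
outbox = []   # suppressed messages land here so tests can inspect them


def send_mail(to, subject, body, config=MAIL_CONFIG):
    """Build a message and either deliver it via SMTP or record it in outbox."""
    msg = EmailMessage()
    msg["From"] = config["sender"]
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    if config["suppress"]:
        outbox.append(msg)
        return msg
    with smtplib.SMTP(config["server"], config["port"]) as smtp:
        smtp.send_message(msg)
    return msg


send_mail("user@example.com", "Password changed", "Your password was changed.")
print(len(outbox), outbox[0]["Subject"])
```

With suppression switched on, no SMTP connection is opened, which is what you want in unit tests and on developers' machines.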

Further reading ⌘