Web2py
Jump to navigation
Jump to search
Course plan ⌘
- Day 1: Introduction, Routing, Controllers, Views
- Day 2: DAL
- Day 3: Access control, I18N, Validators, Forms
- Day 4: JS, CSS, Ajax, Deployment
- We aim at ~45 slides per day
Let's start from the basics ⌘
- WSGI
- Purpose of web frameworks
- Available web frameworks
- Place of web2py in the ecosystem
HTTP protocol ⌘
- HTTP/1.1 is defined by RFC 7230 - RFC 7235
- Browsers are the most popular clients
- There are non-browser clients
- Web servers deliver HTML, images, JS, CSS, downloadable files, other content types
HTTP Request ⌘
GET /html/rfc7240 HTTP/1.1 Host: tools.ietf.org Connection: keep-alive Pragma: no-cache Cache-Control: no-cache Upgrade-Insecure-Requests: 1 User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.90 Safari/537.36 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 Accept-Encoding: gzip, deflate, sdch, br Accept-Language: en-US,en;q=0.8,ru;q=0.6 Cookie: __cfduid=d105aee619396e6187d19c60cb032f4b71479467417
HTTP Response ⌘
HTTP/1.1 200 OK Server: Apache/2.2.22 (Debian) Last-Modified: Sun, 13 Nov 2016 10:39:20 GMT ETag: "3cc378-c260-5412c55f8017f;54191604efade" Accept-Ranges: bytes Cache-Control: max-age=604800 Expires: Fri, 25 Nov 2016 11:15:54 GMT Strict-Transport-Security: max-age=3600 Content-Length: 12952 Keep-Alive: timeout=5, max=100 Connection: Keep-Alive Content-Type: text/html; charset=UTF-8 <other headers omitted for brevity> <!DOCTYPE html ...
Python web applications ⌘
- WSGI is the interface between the web server and the web application
- Defined in PEP 3333
- The web server calls a callable for every request
- usually "application", but the name is configurable
- It should take two parameters: environ and start_response
- environ is a dict with CGI-like parameters
- also contains the "input" stream used for POST content
- start_response(status, response_headers, exc_info=None) is a callback
- should be used by the application to set the response headers and status code
- environ is a dict with CGI-like parameters
- It should return an iterable which supplies pieces of content
Simplest Python web application ⌘
(use it with Apache's mod_wsgi, or run from command line)
#!/usr/bin/python2
from wsgiref.simple_server import make_server
def application(environ, start_response):
status = '200 OK'
output = 'Hello World!'
response_headers = [('Content-type', 'text/plain'),
('Content-Length', str(len(output)))]
start_response(status, response_headers)
return [output]
if __name__ == "__main__":
httpd = make_server('', 8000, application)
httpd.serve_forever()
Exercise: try the raw WSGI ⌘
- Can you print the complete WSGI environment to the browser?
- Can you handle two different URLs? ("Hello world" and environment)
Tasks of a web framework ⌘
- Routing
- Sessions
- Response formatting
- Input validation
- Database communication
- Structuring your application
Available web frameworks ⌘
- Batteries included
- Self-contained and comprehensive
- Examples: Django, TurboGears, web2py
- Best of breed
- Just a thin layer of glue between other libraries
- Examples: Flask, Pyramid, CherryPy, Bottle
web2py as compared to other web frameworks ⌘
- Full-stack
- MVC
- Quite different from all other frameworks
- It is recommended that you also learn some other framework, just to see what's specific to web2py
- Flask is recommended as something diametrally opposite
- A bit old-fashioned:
- Does not rely on gettext, uses its own i18n mechanism
- Does not use unicode internally
- Unique features:
- Has a web-based IDE
- Uses Python as a language for views
- Uses Database Abstraction Layer (DAL) instead of Object-Relational Mapper (ORM)
- Requires no installation
- Has a lot of built-in functionality
web2py criticisms that are still valid ⌘
- Non-traditional use of
execfile
instead ofimport
(1) - Too different from established Python idioms (2)
- Can teach you things that are frowned upon by general Python community
Download ⌘
Don't install! ⌘
- No installation needed
- Don't install web2py from pip - it's outdated there
- Note: there's no traditional setup.py
- It still may be a good idea to use virtualenv, due to other packages
- Debian and Ubuntu adapted web2py for their packaging system
- No reverse dependencies
- Reasonably modern only in Debian Testing and Ubuntu 16.04+
- There is danger of it being removed due to licensing issues
- Absolutely no support, at least two known holes
First steps ⌘
- Package contents and directory structure
- Running the default applications
Package contents ⌘
- ./web2py.py: the main script (executable!)
- ./Makefile: builds documentation and installers
- ./applications: default applications and your applications
- ./deposit: if you remove applications, they will be archived here
- ./gluon: the web framework sits here
- ./gluon/packages/dal: Database Abstraction Layer
- ./handlers, ./examples, ./scripts: example integrations with web servers
- ./logs: by default, logs go here
- ./site-packages: place for third-party Python modules
Running the development server ⌘
python2 web2py.py
- Will ask for port and admin password
- Optional GUI uses Tk
- Help:
python2 web2py.py --help
- Shell in some application:
python2 web2py.py -S welcome
- It's a good idea to have IPython (now Jupyter) installed
- Add
-M
to load models
Structure of an application ⌘
- ./models/*.py: models
- You can create subdirectories if some models are needed only by some controllers
- ./views: views (mostly HTML with special syntax for printing, branching and looping)
- ./controllers/*.py: controllers
- ./controllers/default.py: default controller
- ./controllers/appadmin.py: application admin (use for manipulating database tables)
- ./modules/*.py: your own importable Python modules
- ./private: non-Python files private to the application
- e.g. appconfig.ini
- ./sessions: sessions are stored here by default
- It is also possible to store them in a database
- ./static: static files (js, css, fonts, images, ...)
- are not processed by Python
- ./uploads: files that your users upload are renamed and stored here
MVC ⌘
- Model, View, Controller
- Essentially a buzzword that only indicates that Models, Views and Controllers exist, with some separation of concerns
- Wikipedia's articles on MVC, MVP (Model, View, Presenter) and similar software architectures are written with classical GUI-based software in mind
- The English Wikipedia MVC interaction diagram (below) doesn't apply to web frameworks
- In web2py and most other web frameworks, the model does not actively update the view
- Non-English editions of Wikipedia even have different interaction diagrams!
MVC in web frameworks: Model ⌘
- Model:
- Has business-domain knowledge (both logic and data)
- Provides API that can be used by the controller to read state and submit updates
- Does not contain presentation logic
- Note: in many other ORMs and web frameworks (not web2py!), a model is only an abstraction over data storage (such as database) and validation mechanism. I.e. there is no accent on the API that makes sense in the business problem domain.
- Good article on the issue, by Julien Pauli
- In web2py, the logic that detects which parts of the website menu should exist is part of the model. Even if the menu is not stored in the database.
- Even the data you get from web services can be expressed as a Model!
MVC in web frameworks: View ⌘
- View:
- Presents data to the user
- Usually as HTML
- Data can come from the model and from the controller
MVC in web frameworks: Controller ⌘
- Controller:
- Reacts to user actions (in the form of HTTP requests)
- Decides what to do
- Updates model state, or queries state from the model
- Asks the appropriate view to render itself, supplies the necessary data
- Frequently advocated approach: fat models, skinny controllers
Routing ⌘
- Mapping URLs to handlers
- I.e. to functions in controllers
- Accessing URL parameters
Default routing ⌘
- http://localhost:8000/app/controller/function.ext/arg0/arg1?foo=bar&a=b
- applications/app/controller.py is executed in a special environment
function()
is called without arguments- Can return a string
- Can raise an
HTTP
exception - Can return a dict - in this case, a view is executed
return locals()
is a common idiom
- arg0 and arg1 go to
request.args
- foo=bar&a=b go to
request.vars
- There are also
request.cookies
- There is a
URL()
function to construct a URL from app, controller, function, args and vars
Exposed functions ⌘
- http://localhost:8000/app/controller/do_something_auxiliary
- Can this be called by an attacker?
- The names of exposed functions come from a regex match over the controller source
- Details are in
gluon.compileapp
andgluon.myregex
# pattern to find exposed functions in controller
regex_expose = re.compile(
'^def\s+(?P<name>_?[a-zA-Z0-9]\w*)\( *\)\s*:',
flags=re.M)
Non-exposed functions ⌘
- So, here is what won't match NOW:
- Imported functions
- Functions with arguments
- Functions starting with more than one underscore
- Functions defined under
if True:
- Earlier there was an advice to include a space before the ":" - it doesn't work now
- Assume that your function WILL be exposed via URL, unless official documentation says it won't
- The documentation only says: "Functions that take arguments or start with a double underscore are not publicly exposed and can only be called by other functions."
- Yeah, the situation with imported functions is stupid
Custom routing ⌘
- / maps to the "init" app (or "welcome"), "default" controller, "index()" function
- Names can be changed in routes.py
default_application
,default_controller
,default_function
- You can also define
routes_in
androutes_out
(as tuples of 2-tuples)routes_in
maps HTTP URL patterns to /app/controller/functionroutes_out
maps /app/controller/function patterns to URLs (for theURL()
function)
Routing in other web frameworks ⌘
- The existence of default routing is web2py-specific
- By extension, so are application, controller and function names in the URL
- Other frameworks require explicit export of their equivalent of controller functions
- Django: needs an entry in urls.py that maps a URL regex to a view
- Flask: needs a decorator that applies the URL template, like this:
@app.route("/all_topics/<int:page>")
- This also passes the page as the parameter - no need to mess with request.args
- In other frameworks, there is no real need to use any equivalent of
request.vars
- this is usually handled (including the necessary validation) by form classes- You can do this in web2py, too - but disable CSRF protection
Views ⌘
- Default view for controller.function:
views/controller/function.ext
- So the same data can be formatted as HTML, JSON or whatever else
- There are also generic views, enabled by default only for requests from localhost
View syntax ⌘
- Write whatever HTML is needed - it will be delivered to the browser
- Add Python code in {{ ... }} braces
- Indentation is ignored, the code is reformatted internally
- ":" indents, "pass" unindents
- There is some special syntax for printing variables: {{=something}}
- Something will be converted to HTML as appropriate
- Thus avoiding XSS attacks
- Variables come from the dict returned from the controller
- Some other objects are also available
- Something will be converted to HTML as appropriate
- Special syntax for extending and including other views, to be discussed later
XSS attacks ⌘
- XSS = Cross-Site Scripting
- Interpretation of malicious data as HTML (or, worse, JavaScript) by the browser
- Reflected XSS: data from the query string are interpreted as HTML
- Stored XSS: data from the database are interpreted as HTML
- Typical exploitation scenario: write a "blog post" that also sends
document.cookie
to attacker when viewed by admin- Admin's session is now hijacked!
- Solution: convert data from text to HTML
- Also known as "escaping"
Implicit "imports" ⌘
- Not real imports - all files are effectively executed in the same namespace
- The result is quite close to concatenation, but with bytecode caching
- This does mean that all your models are redefined on every request - something that other web frameworks avoid
- In controllers, web2py implicitly "imports" everything from your models
- In views, web2py automatically puts into the global scope the dict returned by your function
- These implicit "imports" often confuse IDEs
- Solution: buy IDE with special support
- PyCharm Professional Edition qualifies, Community Edition doesn't
- They also confuse pylint
- Solution: install pylint-web2py
How objects are printed ⌘
- If
.xml()
is callable, it is called, and the result is treated as HTML- Hint: if you already have some known-good HTML, construct an
XML(some_html)
from it
- Hint: if you already have some known-good HTML, construct an
- Otherwise, the object is transformed to a string, escaped, and printed
- Implementation details:
- The parser is in
gluon.template.TemplateParser.parse()
- It creates a
Node
object that containsresponse.write(something)
- Then see
gluon.globals.Response.write()
- It calls
self.body.write(to_native(xmlescape(data)))
- Finally, see
gluon.html.xmlescape()
- It calls
- The parser is in
How to format objects for JavaScript ⌘
var myobj = {{=mydict}};
won't work- Reason: you want to transform to JSON, not to string with HTML escaping
- This pattern works:
<!-- at the top -->
{{ from gluon.serializers import json }}
<script><!--
var myobj = {{=XML(json(mydict))}};
--></script>
json()
converts dict to a JSON stringXML()
prevents HTML escaping- There is also
ASSIGNJS()
helper that produces the assignments
Exercise: available globals ⌘
- Figure out what is available as globals in your controllers
- Same for views
Storage objects ⌘
- Importable from
gluon.storage
class Storage
inherits from dict- Also allows to get or set keys using the attribute syntax:
st.foo = bar
- Difference: undefined attributes are created on-the-fly and become None (so no KeyError)
- Also allows to get or set keys using the attribute syntax:
- There is also
class StorageList
- Difference: undefined attributes are [], not None
- Useful for keeping lists
- Difference: undefined attributes are [], not None
- There is also
class List
- Like list, but can be addressed using (...), not only [...] syntax
- Does not raise
IndexError
when used with (...)
- Does not raise
- Like list, but can be addressed using (...), not only [...] syntax
Request ⌘
- Available as
request
in controllers and views - Has some interesting attributes
request.args
: a List with parts from the URL after function.extrequest.get_vars
: GET-style parameters from the URL, as a Storage- There are also
request.post_vars
, andrequest.vars
contains both GET and POST vars request.env
: WSGI environment, as a Storagerequest.env.http_accept-language
is a value of the Accept-Language HTTP headerrequest.env.request_method
is the request method
request.cookies
: cookies. Not a Storage.request.application
,request.controller
,request.function
,request.extension
request.is_local
,request.is_https
Figuring out the client's IP address ⌘
request.env.remote_addr
is always the correct answer- If it isn't, you are not running the web server in the recommended configuration
- E.g. you are proxying to Rocket
- Or proxying to Apache and forgot to install mod_remoteip
- If it isn't, you are not running the web server in the recommended configuration
- It is OK to use
request.is_local
- False leads:
request.env.http_x_forwarded_for
- Verify that it only comes from your proxy
- Try to fake it using Firefox and this extension, see how your system reacts
request.client
- Should never be used - tries X-Forwarded-For first (and uses the first address there), and REMOTE_ADDR next
- In Django, all use of X-Forwarded-* was treated as a vulnerability and removed
If you still want to use reverse proxy ⌘
- Please unset these incoming headers unconditionally
- X-Forwarded-For
- X-Forwarded-Proto
- Make sure that the Host: header is correct
Accessing request.vars securely ⌘
- Attack vector: http://example.com/a/c/f?foo=bar&foo=baz
- Result:
request.vars.foo == ['bar', 'baz']
- A list, not a string!
- Result:
- Solution: always use one of these:
request.vars.getfirst('foo')
request.vars.getlist('foo')
request.vars.getlast('foo')
request.folder ⌘
- The base filder of the application
- Contains
models
,views
andcontrollers
subdirectories - Please use
request.folder
to construct application-specific paths, like illustrated below- Example from the default admin app
deny_file = os.path.join(request.folder, 'private', 'hosts.deny')
- You can use this even in models. That's business logic, after all...
Response ⌘
- This is a subclass of
Storage
- Fields with obvious meaning:
status
(HTTP error code),headers
(a dict),cookies
- Fields used by the default layout.html file:
title
,subtitle
flash
(a string or object to be flashed on the page),meta
(Storage for meta tags),menu
,files
(JS and CSS from static files)- Use something like
response.files.append(URL('static','css/bootstrap.min.css'))
- Put
{{ response.include_meta() }}
and{{ response.include_files() }}
in your layout.html- It will minify and virtually version your static files
- A debugging/profiling toolbar:
{{=response.toolbar()}}
- Note the difference: you have to display the return value
- You can add anything else - this is sometimes useful for communicating information from a view to the layout
Session ⌘
- Per-client
Storage
that survives between requests - Separate session per application
- Identifier stored in a cookie
- Contents stored on disk as a file by default
- Can be reconfigured for database-backed storage
Flashing messages ⌘
- Useful for notifying the user about the outcome of their action
- If you are sure that you won't redirect right now:
response.flash = 'your message'
- A version that survives redirect:
session.flash = 'your message'
- Some other web frameworks can flash multiple messages, web2py can't
- In your view, refer to
response.flash
Exercise: flash code study ⌘
- Find in web2py sources where
session.flash
is moved toresponse.flash
Exercise: HTML in flashed message ⌘
- Can you flash a message with some HTML?
- E.g.: Google is your friend
Exercise: Frozen Wiki ⌘
- Create a controller function and a view that reads a markmin page from a file and renders it
- Use the
MARKMIN()
function to create something printable from the MarkDown-like markup - Get the page name from the arguments, look under private/pages, maybe use
gluon.fileutils.read_file()
- Think about error handling
- Just raise an
HTTP(404)
on any error
- Just raise an
- Use the
- Add a sidebar with links to all pages, mark the current one
- Hint:
gluon.fileutils.listdir()
may be useful
- Hint:
Template hierarchy ⌘
- Two mechanisms: extend and include
- Extend means "use that other file as a master"
- Write
{{ extend 'layout.html' }}
in a view - You may want to set some variables before writing
{{ extend 'layout.html' }}
- Anything that you write before
{{ extend 'layout.html' }}
will be executed or printed in the beginning - Then, the interpreted contents of 'layout.html' will follow.
- Anything that you write before
- In layout.html,
{{ include }}
includes stuff that follows{{ extend 'layout.html' }}
in the slave view
- Write
{{ include 'sidebar.html' }}
means just that: include that other file- Indeed useful for sidebars
Blocks ⌘
- May be more familiar to Django and Flask/Jinja2 users
- In layout.html, define
{{ block foo }}default content{{ end }}
- In slave views, optionally redefine the block
- The new content will replace the original
- You can get the default content using the
{{ super }}
keyword
Database topics ⌘
- Making SQL queries from Python
- Ways to avoid SQL injection
- DAL vs ORM
- Defining tables
- Selecting data
- Joins
- Inserting, updating and deleting data
- Transactions
Making SQL queries from Python ⌘
- DB API 2.0 specification
- A standard which database-interfacing modules follow
- You can connect to the database, issue queries, get results
- There is a mechanism for using prepared queries and substitution of parameters
import sqlite3
db = sqlite3.connect('test.sqlite')
cur = db.cursor()
cur.execute('SELECT id, dname FROM dept WHERE loc = ?', ('Berlin',))
rows = list(cur)
db.commit()
SQL injection ⌘
- A serious vulnerability
- Appears when an application constructs SQL query dynamically by concatenating fixed strings with attacker-provided data
- By using special characters (like ') inside their data, attacker can trick SQL database to interpret their data as additional SQL queries
- Can bypass access restrictions, exfiltrate data, make unauthorized changes
How to avoid SQL injection ⌘
- Answer from 1990s: forbid bad characters in user input
- Bad: how would you register Catherine Anne O'Hara as a user?
- Answer from 2000s: "escape" user data before concatenation so that special characters are properly interpreted as data, not as SQL syntax
- Bad: too hard to keep track what's escaped and what's not
- Too easy to forget to escape data
- Answer from 2010s: don't build queries by concatenating fixed strings with user-provided data
- "Prepared Statement" APIs exist that clearly separate SQL from external data in queries
- Usually (but not always), they stay separate on the wire. If the database doesn't provide this option, the client library will escape and concatenate strings as necessary.
- "Prepared Statement" APIs exist that clearly separate SQL from external data in queries
How to avoid SQL injection, once again ⌘
Don't:
cur.execute("SELECT id, dname FROM dept WHERE loc = '%s'" % (city,))
Do:
cur.execute('SELECT id, dname FROM dept WHERE loc = ?', (city,))
Or just let web2py generate queries for you.
Exercise: perform SQL injection ⌘
- Which value of
city
will cause the bad query to read department with id = 1? The actual department name is not known to an attacker- Hint: one of the possible answers abuses the "OR" operator
DAL ⌘
- Database Abstraction Layer
- Different from a typical ORM
- Does not have a separate class for a row of each table
- Uses a single Row class
- Does not use prepared statements
- Performance implications ☹
- Can be used standalone
from gluon.dal import DAL, Field
db = DAL('mysql://user:password@sql.example.com/db')
# You can have multiple database connections
Why we need DAL ⌘
- SQLite on dev, PostgreSQL in production
- Slight differences in SQL syntax
- E.g. here is how to create a suitable
id
field in SQL:- SQLite:
id INTEGER PRIMARY KEY
(autoincrement is optional, and theAUTOINCREMENT
keyword, if present, must be placed last) - MySQL:
id INTEGER AUTO_INCREMENT PRIMARY KEY
- PostgreSQL:
id SERIAL PRIMARY KEY
- SQLite:
- Construct queries programmatically
- Important if a user should be able to e.g. filter or sort records on multiple criteria
- "IN" queries are just cumbersome with raw DB API
- Access result rows in more object-oriented way
- E.g. use dictionary-like objects with attribute names, not just tuples
- Handle framework-specific objects such as uploads
- Attach validators, interact with forms
Defining tables ⌘
- SQL (for MySQL):
CREATE TABLE blog_post
id INTEGER AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255),
body TEXT;
- DAL:
db.define_table('blog_post',
Field('title', 'string', length=255),
Field('body', 'text')
)
id
is implied- After that, you'll be able to insert, delete and edit records via the "appadmin" controler
- It asks for the admin password
Primitive data types ⌘
DAL | SQL | Notes |
---|---|---|
'string', length=??? | VARCHAR(???) | default length is 512 |
'boolean' | usually CHAR(1) | '0' or '1' is stored |
'text' | TEXT | |
'blob' | BLOB | stored base64-encoded even if the database can handle binary data |
'integer', 'bigint' | INTEGER | Or a bigger database-specific type |
'float', 'double' | FLOAT, DOUBLE | |
'date', 'time' | DATE, TIME | No timezone information |
'datetime' | TIMESTAMP (DATETIME for MySQL) | This is a DATE and a TIME together |
Other data types ⌘
DAL | SQL | Notes |
---|---|---|
'json' | TEXT | Or a database-specific JSON type |
'password' | VARCHAR(???) | Treated exactly as a string |
'upload' | VARCHAR(???) | A filename is stored |
'list:integer', 'list:string' | TEXT | Or a database-specific type. Think twice before denormalizing! |
Uploads ⌘
- There is a special "upload" field type
- Has some keyword arguments
uploadfolder
- where in the filesystem to place the uploaded file- It will be renamed for safety, but the original filename will be stored
autodelete
- whether to delete the file together with the record (by default False)uploadfield
- it is possible to store files in the database, as blobs, all we need is the name of that blob fieldauthorize
- callback to decide whether the user is authorized to download the file back
Foreign keys ⌘
- SQL:
CREATE TABLE ...
foo INTEGER REFERENCES bar(id);
- DAL:
Field('foo', 'references bar')
- Always references
id
- May or may not be NULL
Constraints ⌘
- There are several ways to specify constraints in the Field constructor
- Keyword arguments
required=True
forbids DAL to insert empty valuesrequires=...
sets up validators for forms- No effect for values inserted not through forms
unique=True
,notnull=True
- Correspond to
UNIQUE
andNOT NULL
SQL clauses - Enforced at database level
- Multiple NULLs do not violate the UNIQUE constraint, except in MS SQL Server
- Correspond to
- There is a bug: using forms, you cannot insert NULLs into unique-but-nullable non-string non-reference (e.g. plain integer) fields
Reverse-engineering an existing database ⌘
- Use
scripts/extract_*_models.py
- Does not work here for sqlite
Default validation ⌘
- web2py inserts default validators for things like
unique
andnotnull
- You may still want to duplicate them in order to override the default error messages
On NULL values ⌘
- By default, web2py does not prohibit NULLs in fields
- It is easy to create a NULL in an integer field
- Just don't enter anything
- There can be a blank option in a field with a list of options
requires=IS_IN_SET(...)
- Selecting the blank option inserts a NULL
- There is some UI-related confusion between a NULL and an empty string in string field
- Other web frameworks recommend making all string fields nonnull because of that
- By default, web2py inserts an empty string
- Add
requires=IS_EMPTY_OR(IS_NOT_EMPTY())
if you want an empty input to become a NULL
Adding requirements ⌘
- It is possible to list requirements in the field definition
- It is also possible to add them dynamically
db.table.field.requires.append(...)
Formatting records ⌘
- Add a
represent=(lambda value, row: ...)
to Field constructor arguments- This is completely optional
- Add a
format=labmda row: ...
to Table constructor elements- As a
Row
is aStorable
which is a subclass of adict
, strings with format expressions also work:format="%(name)s (%(price)s EUR)"
- Quirk: the default
IS_IN_DB(...)
validator is added to the "references" fields only if the referenced fields can be formatted - This also affects the automatic choice of widgets in SQLFORM
- As a
Relations ⌘
- 1:1: don't do this, just put both sides into the same table
- If you must (because of physical storage considerations), just use the same primary key and write helpers or joins to retrieve the other side
- 1:many: add a non-NULL reference field to the "many" side that references the "1" side
- 1:0-1: add a non-NULL unique reference field to the "0-1" side that references the "1" side
- many:many: use an intermediate table that has two non-NULL fields that reference both sides
Normalization ⌘
- Goal: make sure that the very structure of the database makes it impossible to have inconsistent data
- If you don't update your data, denormalization may be OK
- Traditional theory of relational databases doesn't really accept IDs
- Always think "if there is no ID, what else could serve as a key?"
- Useful rule of thumb, approximately equivalent to BCNF (also known as 3.5NF):
- Each attribute must provide a fact about the key (1NF), the whole key (2NF), and nothing but the key (BCNF).
- Both 2NF and 3NF are concerned equally with all candidate keys of a table and not just any one key.
- So help me Codd!
- Each attribute must provide a fact about the key (1NF), the whole key (2NF), and nothing but the key (BCNF).
- Any use of "list:..." type breaks 1NF.
Exercise: shop model ⌘
- You have a shop that sells goods from several categories and delivers them to a customer-supplied address
- We need to keep prices, product images, and orders (including past orders)
- Create database tables for it using the DAL syntax
- Play with it using appadmin
Indexes ⌘
- The book documents them as not natively available
- Work in progress? code landed on 2016-05-26 and is not part of any stable release
- Syntax:
db.mytable.create_index('index_name', db.mytable.myfield_a, db.mytable.myfield_b)
- For unique indexes, pass
unique=True
- Unit tests exist, too!
- Syntax:
Migrations ⌘
- Database needs to evolve due to changing business requirements
- Schema migrations
- Data migrations
- web2py has fully-automatic, non-configurable, schema migrations and no data migrations
- Migrations work by comparing the models with *.table files that were created during the previous migration
- Applied on the first request that notices the mismatch
Migrations in other ORMs ⌘
- Django: migrations (both schema and data) are first-class citizens
- Schema migrations are auto-generated, but can be edited
- SQLAlchemy: no built-in support
- Frequently used with Alembic, which has good documentation for schema migrations
- Schema migrations are auto-generated, but can be edited
- Data migrations are possible but only documented in external blogs
- Frequently used with Alembic, which has good documentation for schema migrations
- Peewee
- Some built-in schema migrations but no built-in tracking of what has been applied
- No clear winner among tracker tools
- Interesting idea: fully automated schema migrations that introspect the current schema and diff it against the models
Inserting rows ⌘
- Fire and forget:
row_id = db.tbl.insert(foo='bar', val=42)
- All fields not explicitly mentioned will get the default values
- You can also create a dictionary, populate it as needed, and pass it:
row_id = db.tbl.insert(**mydict)
- There is no way to insert anything except
VALUES(...)
orDEFAULT VALUES
- I.e. no
INSERT INTO tbl SELECT ...
- I.e. no
- Don't forget to call
db.commit()
- This is automatically called at the end of each web request, but not called if you are using an interactive shell or a script
- An alternative is to call
db.rollback()
Simple selects ⌘
- If there is a table "category", these statements work:
row_id = db.category.insert(name="Motherboards") # let's suppose it returns 42
row = db.category(42) # select by id
row = db.category(name="Motherboards") # select by any attribute
Exercise: try to break inserts and selects ⌘
- From the command line, connect to sqlite://demo.db or just the default sqlite file
- You will need a
databases
folder in your application
- You will need a
- Figure out what happens if you:
- ... violate the UNIQUE constraint
- ... violate the NOT NULL constraint
- ... break the
required=True
constraint - ... break something in
requires=...
- ... attempt to select something that doesn't exist
- ... attempt to select something ambiguous
How other ORMs treat database errors ⌘
- You have observed a bad thing: a database-specific
IntegrityError
exception- Impossible to catch without writing database-specific code or catching all exceptions
- PyLint will hate you if you catch all exceptions
- Impossible to catch without writing database-specific code or catching all exceptions
- Other ORMs (Django ORM, SQLAlchemy, Peewee) automatically convert such exceptions into their own non-database-specific class
More complex selects ⌘
- Build expressions, select the results
rows = db((db.category.name == "Motherboards") | (db.category.name == "Video Cards")).select()
- Watch operator priority! Parentheses are strictly necessary here.
- Comparisons with
None
work and are transformed intoIS NULL
andIS NOT NULL
appropriately. db.category
transforms intocategory.id IS NOT NULL
(which is quite pointless)
How queries work in DAL ⌘
- Relevant classes:
Expression
,Query
- Expression represents something that can be SELECTed or put on the right hand side of the "=" in UPDATE
- Query represent something that can be put in a WHERE clause
- A
Field
(likedb.category.name
) is already anExpression
due to inheritance - It is possible to:
- Combine Expressions using arithmetical operators
- Compare with other Expressions and constants to form Queries
- Perform logical operations against Queries
- Extract parts (substrings, components of the date)
- The resulting Expression contains nothing except a reference to the db connection, first and second operands (as Expressions), and the required operation
- I.e. Expressions and Queries form a tree that explains how to get the result
- Expression does not contain its result!
db.category.name == "Motherboards"
is not a boolean! It's aQuery
Why such syntax ⌘
- The
~
,&
and|
operators are used instead of traditionalnot
,and
andor
operators because the traditional forms cannot be overrideen from Python
How queries work in DAL, continued ⌘
DAL(query)
results in aSet
- This binds a WHERE clause and a database connection
- On a
Set
, you can callselect()
,update()
ordelete()
Preparations for exercises, part 1 ⌘
Create an sqlite database by copy-pasting these SQL statements
$ sqlite3 demo.sqlite
CREATE TABLE dept (id INTEGER PRIMARY KEY, dname VARCHAR(255) NOT NULL, loc VARCHAR(255) NOT NULL);
INSERT INTO dept VALUES(10,'ACCOUNTING','NEW YORK');
INSERT INTO dept VALUES(20,'RESEARCH','LONDON');
INSERT INTO dept VALUES(30,'SALES','PARIS');
INSERT INTO dept VALUES(40,'OPERATIONS','BERLIN');
CREATE TABLE salgrade(id INTEGER PRIMARY KEY, losal DECIMAL(10,2) NOT NULL, hisal DECIMAL(10,2) NOT NULL);
INSERT INTO salgrade VALUES(1,700,1200);
INSERT INTO salgrade VALUES(2,1201,1400);
INSERT INTO salgrade VALUES(3,1401,2000);
INSERT INTO salgrade VALUES(4,2001,3000);
INSERT INTO salgrade VALUES(5,3001,9999);
Preparations for exercises, part 2 ⌘
CREATE TABLE emp (id INTEGER PRIMARY KEY, ename VARCHAR(255) NOT NULL,
job VARCHAR(255) NOT NULL, mgr INTEGER REFERENCES emp(id),
hiredate DATE NOT NULL, sal DECIMAL(10,2) NOT NULL,
comm DECIMAL(10,2), deptno INTEGER NOT NULL REFERENCES dept(id));
INSERT INTO emp VALUES(7839,'BUSH','PRESIDENT',NULL,'1981-11-17',5000,NULL,10);
INSERT INTO emp VALUES(7698,'BLAIR','MANAGER',7839,'1981-05-01',2850,NULL,30);
INSERT INTO emp VALUES(7566,'PUTIN','MANAGER',7839,'1981-04-02',2975,NULL,20);
INSERT INTO emp VALUES(7902,'TOOSK','ANALYST',7566,'1981-12-03',3000,NULL,20);
INSERT INTO emp VALUES(7369,'THATCHER','CLERK',7902,'1980-12-17',800,NULL,20);
INSERT INTO emp VALUES(7499,'BAROSSO','SALESMAN',7698,'1981-02-20',1600,300,30);
INSERT INTO emp VALUES(7521,'WALTON','SALESMAN',7698,'1981-02-22',1250,500,30);
INSERT INTO emp VALUES(7654,'CHIRACK','SALESMAN',7698,'1981-09-28',1250,1400,30);
INSERT INTO emp VALUES(7782,'MERKEL','MANAGER',7839,'1981-06-09',2450,NULL,10);
INSERT INTO emp VALUES(7788,'CARNEGIE','ANALYST',7566,'1982-12-09',3000,NULL,20);
INSERT INTO emp VALUES(7844,'GATES','SALESMAN',7698,'1981-09-08',1500,0,30);
INSERT INTO emp VALUES(7876,'FORD','CLERK',7788,'1983-01-12',1100,NULL,20);
INSERT INTO emp VALUES(7900,'BUFFETT','CLERK',7698,'1981-12-03',950,NULL,30);
INSERT INTO emp VALUES(7934,'ELISON','CLERK',7782,'1982-01-23',1300,NULL,10);
.exit
Copy the resulting demo.sqlite file into applications/welcome/databases/demo.sqlite
Exercises from the SQL course ⌘
- Run an interactive shell (best done with IPython installed):
$ python2 web2py.py -S welcome
- Connect to the database
- Define tables
- Use
migrate=False
argument fordefine_table
to avoid errors due to tables that already exist
- Use
- Go through SQL statements and exercises from the SQL Programming training, produce the same result in DAL
- Consult the next few slides for help with DAL syntax
- Some exercises are not doable with DAL, please identify them and explain what's wrong with them
- It's a good idea to use
as_list()
onselect()
results to see them
db(db.salgrade).select().as_list()
- Use
db._lastsql
to look at the last executed SQL statement - Or, use
_select()
instead ofselect()
to preview the SQL without executing it
SQL features not available natively ⌘
- Typecasting to string
- Concatenation of non-string expressions with strings
- To concatenate strings, use
+
- Anyway you can select parts separately and concatenate them in Python
- To concatenate strings, use
- Concatenation where the first operand is a fixed string
- Subtraction where the first operand is a fixed number
- Date and time manipuation other than
.year()
,.month()
and.day()
- Do it in Python
- BETWEEN
- Reformulate in terms of two comparisons
- Selecting
COUNT(*)
together with something else during aggregation- Use
COUNT(id)
for the same result
- Use
UNION
,INTERSECT
,EXCEPT
- Python-level alternatives exist as
|
,&
operations onRows
- Python-level alternatives exist as
- Subqueries
- The list above is incomplete!
More DAL syntax ⌘
SELECT a, b, c FROM tbl
db(db.tbl).select(db.tbl.a, db.tbl.b, db.tbl.c)
SELECT DISTINCT a, b FROM tbl
db(db.tbl).select(db.tbl.a, db.tbl.b, distinct=True)
SELECT a + b AS x FROM tbl
db(db.tbl).select((db.tbl.a + db.tbl.b).with_alias('x'))
with_alias()
can also be applied to tables - important for self-joins
IFNULL
: rethink in terms ofCASE
orCOALESCE
- E.g. to convert them to 0:
(db.tbl.a != None).case(db.tbl.a, 0)
- Simpler solution:
db.tbl.a.coalesce(0)
- E.g. to convert them to 0:
... WHERE a IN (10, 20)
db.tbl.a.belongs((10, 20))
... WHERE a LIKE '%A%'
db.tbl.a.like('%A%')
, ordb.tbl.a.contains('A')
.startswith(...)
,.endswith(...)
are also available
Sorting and aggregation ⌘
SELECT ... ORDER BY a ASC, b DESC
...select(..., orderby=db.tbl.a | ~db.tbl.b)
AVG(a)
and other aggregate functionsdb.tbl.a.avg()
and similar
COUNT(a)
(counts non-NULL values)db.tbl.a.count()
COUNT(DISTINCT a)
(counts distinct non-NULL values)db.tbl.a.count(distinct=True)
SELECT ... GROUP BY a HAVING a > 100
...select(..., groupby=db.tbl.a, having=db.tbl.a > 100)
- HAVING omits unwanted groups, just like WHERE omits unwanted records
If the whole result is not interesting... ⌘
- Get the first row of the result, or
None
select(...).first()
last()
also exists- Think about sorting
- If there are multiple rows, this retrieves them all and throws away everything except the first
- Other ORMs allow special syntax for you to say "this query must return exactly one result, raise an exception if this is not the case"
- Only count results, don't select
- Use
.count()
instead of.select()
- Use
- Check whether the result would be non-empty
- Use
.exists()
instead of.select()
- Use
- Implement pagination
select(..., limitby=(a, b))
- Asks the database to find b first rows, discard the first a, and send the result to web2py
Selecting from multiple tables ⌘
- It just works
db()._select(db.emp.ename, db.dept.dname)
SELECT "emp"."ename", "dept"."dname" FROM "emp", "dept";
- This yields a cartesian product
- Add conditions to make the result useful
db(db.emp.deptno == db.dept.id)._select(db.emp.ename, db.dept.dname)
SELECT "emp"."ename", "dept"."dname" FROM "dept", "emp" WHERE ("emp"."deptno" = "dept"."id");
Explicit joins ⌘
- A join condition is a parameter to
select
db(where).select(..., join=db.tbl2.on(query))
- This is translated to:
SELECT ... FROM "tbl1" JOIN "tbl2" ON <query> WHERE <where>
- Which is equivalent to:
SELECT ... FROM "tbl1", "tbl2" WHERE <query> AND <where>
- You can provide a list as a
join=...
argument to produce multiple joins - Similar syntax works for left joins:
select(..., left=db.tbl2.on(query))
- Left joins also provide rows from the left table that don't have a pair from the right table (columns from the right table become NULLs)
- Use
WHERE tbl2.id IS NULL
to select only these orphaned rows
- Use
- Left joins also provide rows from the left table that don't have a pair from the right table (columns from the right table become NULLs)
- Table on the left to
JOIN
orLEFT JOIN
is determined by magic code in_build_joins_for_select()
- It is, roughly, a cross join of all tables mentioned anywhere in the query, fields, or join expressions, but not mentioned just before
.on(...)
- It is, roughly, a cross join of all tables mentioned anywhere in the query, fields, or join expressions, but not mentioned just before
Deleting rows ⌘
- If you have a row:
row.delete_record()
- If you have a query:
db(query).delete()
- No tunable parameters
- Cannot delete from join
- There are
._before_delete
and._after_delete
callbacks that can be defined on tables
Delete and references ⌘
- What happens if you delete some row that has rows in other tables that reference it?
- This depends on the
ON DELETE
parameter that you pass in SQL when creating the referencing table- Exposed as the
ondelete=...
keyword argument in theField
constructor
- Exposed as the
- The default in web2py is
CASCADE
, which means "delete every reference that would be broken" - Other options:
SET NULL
(obvious),RESTRICT
(fail immediately),NO ACTION
(fail at the end of the transaction if anything is still broken)- The
_before_delete
and_after_delete
callbacks are not called for rows deleted because of the cascade
- The
- This depends on the
Updating rows ⌘
- If you have a
Row
:
row.foo = 42
row.bar = "what?"
row.update_record()
- Alternatively:
row.update_record(foo=42, bar="what?")
- Bulk or atomic update without
Row
objects:
db(...).update(foo=42, bar="what?")
update_or_insert ⌘
update_or_insert
: update an existing record, or insert a new one if it doesn't existdef update_or_insert(self, _key=DEFAULT, **values): ...
_key
(the first positional parameter) can be adict
or aQuery
, or omitted altogether_key
is used to find the record that has to be updated- If omitted,
**values
are used both for searching and for insertion
- If omitted,
- returns the id of the inserted record, or
None
if there was only an update
- The method is racy
- look at the implementation
- think what happens if a new record is inserted in a different thread at the worst possible time
- also think about unique keys
Running custom SQL ⌘
db.executesql("SELECT foo FROM bar WHERE id = 123")
- Supports placeholders but does not document their exact syntax
- Actually, it depends on the underlying database
- Can return result as rows, as a list of tuples, or as dicts
Transaction isolation levels ⌘
SET TRANSACTION ISOLATION LEVEL
READ UNCOMMITTED
(allows reading of uncommitted values)READ COMMITTED
(still allows rereading the value and finding that it has changed) (default in PostgreSQL)REPEATABLE READ
(but still over a non-repeatable set of rows) (default in MySQL)SERIALIZABLE
(no anomalies) (default in sqlite)
- Because of these anomalies, sometimes it is necessary to tell the database to use a write-lock even if only a
SELECT
is done- SQL:
SELECT ... WHERE ... FOR UPDATE
- DAL:
db(...).select(..., for_update=True)
- SQL:
- Beware: at transaction isolation levels
REPEATABLE READ
andSERIALIZABLE
, the database can throw a "serialization failure" error at any time there is a conflicting write- web2py does not handle this at all
Atomic updates ⌘
- Useful for handling votes
- Watch the generated SQL
- It may be wrong to get the current number of votes, add 1 in Python, and write back
- At which isolation levels is this definitely wrong?
- Read about "write skew" anomaly that happens in PostgreSQL
- Another good reading on the same topic
- This is interesting because it is not one of the anomalies mentioned in the SQL standard
- Best solution:
UPDATE voting SET votes = votes + 1 WHERE ...
- Can this be expressed in DAL?
Automatic referenced attribute access ⌘
- This works on reference fields
dname = db.emp(ename="BUSH").deptno.dname
- Automatically issues a second
SELECT
to fetch the department name - This pattern may be OK if attribute access is not in a loop
- Performance problems otherwise
- Better reformulate in terms of a join
- Other ORMs have things like
select_related
to formulate theJOIN
automatically
Authentication, Authorization & Access Control ⌘
- Based on users and groups
- Users belong to groups
- You can manage them using appadmin
- Groups can have permissions to create, read, update, delete records in tables
- Enforced by the deprecated "CRUD", or by
auth.accessible_query()
- Enforced by the deprecated "CRUD", or by
- You can create custom permissions
- There are decorators to check those permissions
- Users belong to groups
- Registration
- With or without "email as username"
- With various styles of email confirmation
- With or without CAPTCHA (Recaptcha or custom)
- Sending invitations (aka bulk registration) - disabled by default
- Login, logout
- Optinal two-factor authentication
- Editing user profile
- With arbitrary extra fields
- Resetting password
- Retrieving login by email address
- Impersonation of any user by admins
- Actually
has_permission('impersonate', 'auth_user', user_id)
- Actually
Default auth tables schema ⌘
Authentication controller and view ⌘
- In a model:
from gluon.tools import Auth
auth = Auth(db, host_names=myconf.get('host.names'), ... other settings ...)
auth.define_tables(... various settings ...)
auth.settings.foo = 'bar'
auth.settings
defaults come fromgluon.tools.Auth.default_settings
- Disable what's unneeded in
auth.settings.actions_disabled
- You can also customize all messages
- Disable what's unneeded in
- In a controller:
def user():
return dict(form=auth())
- See the comments in the welcome app about available subpaths
- The template lives in
templates/user.html
- Inspect
request.args(0)
to see whether this is registration, login or something else
- Inspect
Adding new fields to auth tables ⌘
- If you want a proper username (instead of "email is username"):
auth.define_tables(username=True)
- If you want completely custom fields:
auth.settings.extra_fields['auth_user']= [
Field(...),
Field(...),
Field(...)]
- New custom fields will also be visible on the Profile page
Tracking who changed the record ⌘
- There is
auth.signature
- Just a canned collection of
is_active
,created_on
,created_by
,modified_on
,modified_by
fields with suitable defaults - Add it to your table as you would add a Field
- No further code is needed due to automatic
default
andupdate
keyword arguments on the added fields
- Just a canned collection of
Tracking old versions of rows ⌘
- Only one of:
form.process(onaccept=auth.archive)
if you want to cover only changes done via one specific formauth.enable_record_versioning(tables)
if you want to cover all changes exceptON DELETE CASCADE
- Only works for tables with
is_active
andmodified_on
- Old versions of
tbl
rows are stored intbl_archive
- If you need to access them, define the model:
db.define_table('tbl_archive',
Field('current_record', 'reference tbl'),
db.tbl)
- Rows are never deleted from
tbl
, but marked asis_active=False
- All queries are transparently modified to exclude such rows
Access Control passive security primitives ⌘
auth.user # a copy of the relevant row drom db.auth_user, or None
auth.is_logged_in() # returns a boolean
auth.is_impersonating() # see if the current user is a mind-controlled zombie
auth.has_membership(group_id=None, user_id=None, role=None) # logs all calls!!!
auth.has_permission(...) # checks record-level permission
auth.accessible_query(name, table) # a DAL query that filters out inaccessible records
- Note:
auth.user.id
will crash for the not-logged-in case. Useauth.user_id
if you want to get aNone
instead
Access Control active security primitives ⌘
auth.logout_bare() # programmatically log out
auth.login_user(user_id) # programmatically log in, pre-authenticated
auth.login_bare(username, password) # check password and maybe login
auth.register_bare(...) # programmatically register a user
auth.add_group(role, description)
auth.del_group(group_id)
auth.add_membership(user_id=..., group_id=...)
auth.del_membership(user_id=..., group_id=...)
auth.add_permission(...)
auth.del_permission(...)
Access Control decorators ⌘
- Available decorators
@auth.requires_login()
@auth.requires_membership('some_group')
@auth.requires_permission('permission', 'subject')
@auth.requires(boolean_expression)
- Put them on your controller functions or on any auxiliary functions that you want to protect
- Don't use more than one decorator per function - use
@auth.requires
instead
- Don't use more than one decorator per function - use
- On failure, they apply the first suitable action from the list:
- respond with
401 unauthorized
to Ajax requests - redirect to URL passed in the
otherwise=...
keyword parameter if the parameter exists - respond with
403 Forbidden
to REST requests - redirect to the login form otherwise
- respond with
Protecting the whole controller ⌘
- Add some code at the top of it, outside of any function
- See a working example in
appadmin.py
- See a working example in
Password security basics ⌘
- Knowledge of a password gives you access to some protected resource
- There are crooks that want others' passwords
- Threat model:
- Online guessing: enforce restrictions (minumum length, ...) in validators
- Network sniffers: enforce https
- Keyloggers: use two-factor authentication
- Stolen database: store passwords securely, to withstand offline attacks
- Not as plain text (obviously)
- Not as mere hashes (due to dictionary attacks)
- Hackers have hashes of all words and will just search for them
- This also gives out whether two passwords are identical
- Not as simple salted hashes (still easy to brute-force on a GPU for human-friendly password lengths)
- Brute-forcing of an 8-character password (48-bit keyspace) hashed via SHA1 with a salt takes 5 days on a single GPU
- The current state of the art is PBKDF2, and web2py uses it
PBKDF2 ⌘
- Password-based key derivation function
- RFC2898
- Irreversible
- Tunable complexity (via the number of hash iterations)
- Makes password checking (and therefore offline cracking) deliberately slow
- Now your login form is a DoS attack vector
- Now any API that accepts authentication via the user's login and password is also a problem!
- Solution: don't expose APIs that authenticate via the user's login and password
- Use JSON Web Tokens instead
- Makes password checking (and therefore offline cracking) deliberately slow
Interoperability with other web frameworks ⌘
- Various web frameworks store PBKDF2 salted hashes differently
- Django:
pbkdf2_sha256$30000$DZRxycy7VktY$V9LL/vMnUyn7OeKU4h4dpvLvWPTNOzl+NsPw95Fm9eQ=
- algorithm $ number of iterations $ salt $ base64(hash)
- The length of the PBKDF2 hash used by Django is the same as the length of the native hash output
- Web2py:
pbkdf2(1000,20,sha512)$bceea2673a6a024e$7a3fe52afaf373bc5a32aa60d38687845e061a18
- pbkdf2(iterations, length, algorithm) $ salt $ hex(hash)
- Django:
- In both cases, the password is
crackme!
Exercise with cryptography ⌘
- Can you take apart the Django-provided PBKDF2 hash and put it into web2py auth_user table so that web2py recognizes it?
pbkdf2_sha256$30000$DZRxycy7VktY$V9LL/vMnUyn7OeKU4h4dpvLvWPTNOzl+NsPw95Fm9eQ=
- Why is the reverse impossible out of the box?
i18n in web2py ⌘
- The
T()
function translates text- The language is taken from HTTP headers sent by the client
- Use
T.force('it')
to force Italian /app/lang/controller/function
also works iflanguages
parameter is set properly inroutes.py
- The return value is not always a string
- By default, it is a lazy translation object
- It remembers "OK, we need to translate this string into this language"
- Translation happens when a string value is needed
- This involves a cache lookup, then a mechanism to locate the appropriate *.py file with translations, and import the dict from it
Translating text with parameters ⌘
- Don't:
T("Increase salary of ") + str(employee)
- Translators may need to insert the employee name somewhere in the middle!
- Do:
T("Increase salary of %(employee)s", symbols=dict(employee=employee))
- The "%s" style is supported, too
- Needs a tuple or a list as
symbols
- If there is only one substitution parameter, provide it directly:
T("Increase salary of %s", employee)
- Needs a tuple or a list as
- The "%s" style is supported, too
Pluralization ⌘
- Just put the word that needs pluralization in a
%%{marker}
- web2py will look for the first parameter in
symbols
and use it to correctly pluralize the word "marker" - Use
%%{marker[1]}
or%%{marker(markers)}
if you need to pluralize according tosymbols[1]
orsymbols['markers']
- web2py will look for the first parameter in
- For languages with regular rules for making plural forms, web2py does it automatically
- For other languages, you have to maintain the explicit dictionary of plural forms in
languages/plural-*.py
Translation of blocks of HTML ⌘
- This happens in views
<strong>{{=T("There are no modules")}}</strong>
- A problem arises if the translatable string includes some HTML tags (e.g. a <span> to highlight something, or a link)
- web2py authors mostly manage to avoid it, but sometimes they don't:
<div class="editor-bar-bottom" style="margin-top:9px;">
<a class="editbutton btn" href="{{=URL(...)}}" id="restore">{{=T('restore')}}</a>
{{=T('currently saved or')}}
<a class="editbutton btn" href="{{=URL(...)}}" id="revert">{{=T('revert')}}</a>
{{=T('to previous version.')}}
</div>
- No good solution here
- Discussion of the same issue in the context of Django
Why an English "translation" is needed ⌘
- web2py does not assume that the source text is in English
- It is in unknown language, so to say
- What if the browser says
Accept-Language: en-US,en;q=0.8,ru;q=0.6
?- If a Russian translation exists but English doesn't, Russian will be returned
Translating identical strings differently ⌘
- Sometimes useful
- Include context into the original string
- Here is how:
T("Phrase ## context")
- Here is how:
i18n in web2py vs other frameworks ⌘
- web2py does not use gettext due to concerns about thread-safety
- "We cannot use gettext because it translates/pluralizes assuming the entire process needs the same language."
- Misguided? Or at least very strange for a modern framework!
- There is also a concern about gettext being too difficult to set up on Windows
- Irrelevant now, because there is Babel
- There are tools to convert between web2py *.py translation files and gettext *.po files
- This is useful if you want to set up something like pootle for your translators or use Transiflex instead of giving them access to the admin interface
Form topics ⌘
- HTML helpers
- Validators
- Forms
- SQLFORM
HTML helper classes ⌘
- Generate tags
- Have the
.xml()
method - Full list is in
gluon.html
module - No HTML5-specific tags, but this is not a big problem
- General pattern:
- Call the constructor
- Pass inner text and nested tags as positional arguments
- Text will be escaped properly
- Pass attributes as keyword arguments with a leading underscore
Special helper classes ⌘
CAT
: not a tag. Just concatenates strings and inner tags.A
: way too smart. Can generate not only regular links, but also AJAX!INPUT
,TEXTAREA
,SELECT
: not only represent themselves, but also can react to user inputFORM
: can validate itself
Exercise ⌘
- Build a helper to display a slider from jQuery UI
- Don't care about validation or getting the data back yet
Special methods of INPUTs ⌘
INPUT
,TEXTAREA
,SELECT
constructors have keyword argumentsvalue
: initial valuerequires
: None, a validator, or a list of validators
Validators ⌘
- Tasks:
- Convert valid input from HTML strings to appropriate Python data types
- In other frameworks, can easily return ORM objects!
- Provide error messages for invalid inputs
- Convert valid input from HTML strings to appropriate Python data types
- Callables that take a string and return a pair:
(converted_value, error)
- If
converted_value
is significantly different from the input (e.g. not of the same type), think about providing theformatter
function that performs the required conversion back to string
- If
- Factories for them are in
gluon.validators
- The appropriate error messages can often be supplied as an argument,
error_message='...'
- The error message should not contain a call to
T()
, web2py will call it for us - Good - you can always override the framework-default error messages, and translate them independently from the framework
- The error message should not contain a call to
- The appropriate error messages can often be supplied as an argument,
- You can write your own
How to use validators ⌘
- When you first call the validator class (e.g.
IS_MATCH
), you create an instance of the validator and thus store all tunable parameters - The result is also callable, and the call performs validation of the argument according to the stored parameters
- With validator instances, you can do this:
- Pass as
requires=...
keyword argument when creating aField
in a table - Pass as
requires=...
keyword argument when creating anINPUT
in a form - Directly assign to
db.tbl.field.requires
- In all cases, a list of validators is acceptable, and each validator transforms the input for the next one
- Pass as
Available validators ⌘
- See
gluon.validators
- The most important ones (that you want to insert manually) are:
IS_NOT_EMPTY()
- self-obviousCLEANUP()
- removes blanks from the edges of the string, removes bad characters (those matched by the supplied regular expression)IS_INT_IN_RANGE
- for enforcing minimum and maximum, and disallowing non-numeric inputIS_MATCH()
- verifies the input according to a regular expressionIS_LENGTH()
- sets minimum and maximum length of a field- Or a size of the uploaded file
IS_IMAGE()
- implements some crude checks on uploaded files, but doesn't do full decodingIS_IN_SET()
- useful when there are several valid options- When used on a field, makes the default widget a select
- There is no
IS_NOT_IN_SET()
IS_IN_DB()
- useful for fields that reference something- Again, changes the default widget to a select
IS_NOT_IN_DB()
- for avoiding duplicates
Validators that result in a select ⌘
IS_IN_SET()
,IS_IN_DB()
- They result in a select only when they are the first in the chain
- HTML looks like this
<select>
<option value="3">Business</option>
<option selected="selected" value="2">Premium</option>
</select>
- The value is sent to the server and validated, but we also need a human-readable representation
- That's why the
labels=...
keyword argument forIS_IN_SET()
- Can be a list of strings
- That's why the
label=...
keyword argument forIS_IN_DB()
- Can be a format string, or a lambda, or a field
- That's why the
- There is also a
zero=''
keyword argument that adds the invalid "nothing selected" option- Pass a
None
to get rid of it
- Pass a
Limiting available choices for IS_IN_DB() ⌘
- Sometimes the value should not just be in DB, but be in certain query output
- Here is what to do:
requires=IS_IN_DB(db(query), db.mytable.myfield)
- Here is what to do:
Exercise ⌘
- There is no
IS_NOT_IN_SET()
- Implement it!
Exercise ⌘
- Write a validator for selecting some pages in the print dialog
- Play with Chrome's dialog to gather the idea what should be valid and what shouldn't
- The returned data should be a list of integers representing the page numbers
- Test it
Special methods of FORMs ⌘
form.accepts(request.vars, ...)
- The underlying primitive for other methods
- Passive by default for
FORM
- Returns False if something is invalid or the form has not been submitted yet
- Modifies the form if there are errors unless you pass
hideerror=True
as a keyword argument - Modifies the form anyway to add hidden
_formkey
and_formname
inputs- That's CSRF and double-submission protection
form.validate()
- Tests whether the form accepts data, then performs actions specified in
onsuccess
,onfailure
oronchange
(SQLFORM only) keyword arguments- Thus active!
- Handy for flashing messages or redirecting
- Tests whether the form accepts data, then performs actions specified in
form.process()
- Same as
form.validate()
but returns the form - Thus, also active
- Same as
Whole-form validation ⌘
form.accepts()
validates fields one-by-one- I.e. no whole-form validation, no validation of inter-field consistency
- Actualy, even no concept of errors that are caused by the form as a whole and not one of its fields
- Solution:
onvalidate
keyword argument- Also abused for external actions like sending email
- Can be either a callable, or a list of callables, or a dict
- The dict has to have only
onsuccess
,onfailure
andonchange
keys- Their values must be callables or lists of callables
- The dict has to have only
- The callables will be called after validation, with the form itself as an argument
- A chance to add to
form.errors
, this will automatically mark form as invalid
- A chance to add to
Special properties of FORMs ⌘
form.vars
- Validated and converted values
form.errors
- Storage of errors: keys are input names, values are error messages
- Empty if there are no errors
form.accepted
- Just a boolean with obvious meaning
CSRF ⌘
- Cross-Site Request Forgery
- A hacker makes a form on his website
- The button says "View my photos"
- The form actually submits to your bank (in hope that you are logged in) and asks it to transfer $100 to the hacker
- web2py protects against CSRF if you use FORMs and call
.accepts()
- The hacker has no access to your bank's session and thus cannot guess the correct
_formkey
- The bank will reject forms with incorrect or missing
_formkey
even if you are logged in
- The hacker has no access to your bank's session and thus cannot guess the correct
Displaying forms ⌘
- Simplest possible method:
{{=form}}
- Requires putting
LABEL
s in the form
- Requires putting
- You can write the HTML yourself, but then don't forget
{{=form.hidden_fields()}}
Displaying errors ⌘
{{=form}}
will do this for you- The error message will be in a
<div class="errorwrapper><div class="error" id="$name__error">...</div></div>
, just after the input - The
invalidinput
class will be added to the input itself
- The error message will be in a
- Or, you can do it yourself as you want, but then pass
hideerror=True
toform.accepts()
- Use
form.errors
- Use
More than one form per page ⌘
- Possible!
- Call
form.accepts()
for each form - Provide a unique
formname='...'
keyword argument in each call - This will add a hidden
_formname
input, and use it to see whether this is the form that the user submitted
- Call
SQLFORM ⌘
- A way to display and validate a model-based form
- To create a form suitable for inserting a new record into table
tbl
:form = SQLFORM(db.tbl)
- To create a form suitable for updating an existing record:
form = SQLFORM(db.tbl, record)
- Add
deletable=True
to show a "Check to delete" checkbox - Add
readonly=True
to make the form read-only
- Validators from the table definition apply
Processing SQLFORMs ⌘
FORM.accepts()
is passive if there is noonvalidate
SQLFORM.accepts()
andSQLFORM.process()
are active: they do database I/O!SQLFORM.validate()
is still passive- That's a good thing if you need some post-processing and don't want to use
onvalidate
- That's a good thing if you need some post-processing and don't want to use
Widgets ⌘
- SQLFORM decides what kind of input tag to produce for each field
- It looks at the field type and validators
- It is possible to override the decision by adding a keyword argument,
widget=...
, to the field definition db.tbl.field.widget = ...
also works
- Available widgets: see
gluon.sqlhtml
- Use as:
gluon.sqlhtml.RadioWidget.widget
- Use as:
Dealing with uploads ⌘
- web2py will display upload fields as links or inline images, depending on the extension
- Links need to point to a controller that does the appropriate security checks and gives the content of the file
@cache.action()
def download():
"""
allows downloading of uploaded files
http://..../[app]/default/download/[filename]
"""
return response.download(request, db)
- By default, expects the filename be of the form:
table.field.XXXXXX.ext
- Parses the filename out from
request.args[-1]
- Creates a ticket if a user tries to download
auth_user.email.bob@example.com.jpg
- BUG
- Parses the filename out from
How widgets work ⌘
- The widget function takes the database field, the value to display, and additional HTML attributes
- Produces some HTML helpers
Limiting the available fields ⌘
- Forgetting to limit the fields of autogenerated forms is a cause of many security vulnerabilities
- Think about someone setting their own
is_manager
status toTrue
- Think about someone setting their own
- In field definition:
writable=False
hides the field from create forms (update forms still show it as read-only)readable=False
hides the field from read-only forms, too- These properties can be assigned post-factum in controllers
- In the
SQLFORM
call:- Use the
fields=[...]
argument
- Use the
Adding extra fields ⌘
- Pass them in the
extra_fields
keyword argument of SQLFORM constructor- They can have validators, as usual
- You can take action upon the values in the
onvalidate
callback
ID tampering ⌘
- Display an SQLFORM
- F12 (developer tools), modify the id, attempt to submit
- Result: ticket!
- From a sysadmin viewpoint, this is spam. There is no fixable bug in the application, yet there is an unactionable ticket!
- Other web frameworks have no protection against id tampering
- Either the modified request is valid, or it will be rejected by some validator
- Result: ticket!
Optimistic offline locking ⌘
- Display an SQLFORM for editing some record
- Call
form.process(detect_record_change=True)
- ...which is not the default
- Load the form in two tabs
- Try to submit both
- The second submission won't work
- Detection is based on MD5 sum of concatenation of all fields of the record
- Message can be customized through
message_onchange
keyword argument - No built-in equivalent in other frameworks
Exercise: a real-world weird registration form ⌘
- This comes from SafeDNS
- They have some "billing plans"
- A billing plan is pre-defined set of available functionality with a known price
- Each user has a billing plan and, for non-free plans, date when the user's subscription expires (unless extended by paying money)
- During registration, you choose your desired plan and get 15 days of subscription for free
- It is possible to apply a promotional code during registration
- In this case, the selected billing plan is ignored
- The set of available promotional codes is managed by site managers
- Each promotional code has an associated billing plan to give to the user, and the number of days to give him/her for free instead of the default 15
- There are some billing plans that are not normally advertised, but available only via a promotional code
- And now you have to implement user registration with all these weird requirements
Some hints ⌘
- Look into
gluon.tools.Auth.register()
- It builds and processes the form
- The exercise is mostly about not fighting too much with its logic
- There is no way to add
extra_fields
to the form- So we need some other solution
- The billing plan should be settable in the registration form but not elsewhere
- How would our application know that it is going to show a registration form?
- Find out how to attach
onvalidation
to the registration form- Set the expiration date there
- Want a simple confirmation form?
- There is
FORM.confirm('Are you sure?')
- There is
- Want some validated inputs not related to your database, but
FORM
is too low-level?- Use
SQLFORM.factory()
, supply fields as arguments
- Use
- Want a table, not a form?
- There is
SQLTABLE
- There is
- Want something smart, with Create, Edit and Delete buttons, and pagination?
- Use
SQLFORM.grid()
- Use
Client-side topics ⌘
- Including CSS and JS
- AJAX in web2py
- Form styling
- Autocomplete
Including CSS and JS ⌘
response.files.append(URL('static', 'something.js'))
- In the template, call
response.include_files()
- So that they can be placed in the appropriate place in the document (usually the head)
Versioning static files ⌘
URL('static', 'something.js')
does not necessarily return"/static/something.js"
- Client-side caching is an important loading-time optimization
- Client is told to cache all JS and CS files by setting the
Expires:
header to point far into the future- Problem: what if the file changes?
- Solution: don't reuse the URL
- Include some version number in it
- web2py does this for you if you set
request.static_version
- Manually in
models/db.py
- Easy to forget
- Solution: don't use web2py's static versioning
- Your frontend web developer probably already has something to offer
- Manually in
Static asset optimization ⌘
- It's inefficient to load many long JS/CSS files
- Especially when
@import
is used in CSS, because it breaks pipelining
- Especially when
- web2py can minify, concatenate and inline your static assets
- But "inline" is used in a different meaning than usual
- Usual meaning: replace
@import
rules in your CSS with actual contents of the imported files- This is usually wanted, so that the browser can load all the CSS in one request, cache it and not request it again anymore
- web2py meaning: put that JS or CSS into the
<script>
tag, so that thesrc
attribute is not needed anymore- Prevents caching, loads tons of assets with each request, don't do it
- Usual meaning: replace
- But the "minification" is quite limited
- Only strips comments and unneeded whitespace
- State-of-the-art minifiers also rename variables to shorter names and rewrite expressions to shorter equivalent forms
- And your frontend developer has some other tricks in the bag, like web sprites, that are natively supported by his toolchain
- Conclusion: trust him. Take minified/concatenated/compressed JS from him, let him alter references to these static assets in your templates.
- Drawback: this adds a build step.
- Conclusion: trust him. Take minified/concatenated/compressed JS from him, let him alter references to these static assets in your templates.
- But "inline" is used in a different meaning than usual
CSS in the welcome app ⌘
- Bootstrap v3.3.5
- Custom CSS that makes the calendar widget work
- Some web2py-specific styles in
web2py.css
- worth looking in detail and customizing- These styles customize various tags, including
<h1>
- Kind of stuff that you expect to see afer a CSS reset, or similar in purpose to normalize.css
- Many other web2py-specific styles
- These styles customize various tags, including
- A more modern version of
web2py.css
is also available asweb2py-bootstrap3.css
- All of them are minified and come without sources (possible license issue)
JS in the welcome app ⌘
- analytics.js - for Google and not only!
- Bootstrap has some JS components
- Calendar
- jQuery v1.12.2
- modernizr, respond
- They implement new tags and CSS attributes in old browsers, and also fix their bugs
- share.js for sharing on social networks
- web2py.js - API to be used by applications (to be discussed later in more detail)
- web2py-bootstrap3.js - some web2py-specific behaviour
Purpose of JS and CSS frameworks ⌘
- Solve common problems regarding site layout and client-side functionality
- Multi-column responsive layouts
- Modern look and feel
- Hiding differences between browsers' default rendering of HTML
- Simplifying tasks such as DOM manipulation and AJAX
- Providing new kinds of UI controls that don't exist in plain HTML5
How web2py defaults compare with the existing best practice ⌘
- Both Bootstrap and jQuery are industry standards
- Other industry standard CSS and JS frameworks exist
- web2py uses some jQuery APIs internally, but can be decoupled from Bootstrap
- It is sometimes possible to use multiple JS libraries in the same project (e.g. jQuery + React works as long as they don't modify the same DOM)
- It's fine to use React for UI and jQuery for AJAX - that's what the official React tutorial did in the past
- Still, some web developers may object
- There is a lightweight alternative CSS framework by the same author as web2py, stupid.css, also available as a replacement welcome app
- It is sometimes possible to use multiple JS libraries in the same project (e.g. jQuery + React works as long as they don't modify the same DOM)
- You should be aware of HTML5 boilerplate
CSS classes that must be defined ⌘
- That's an exercise for the reader
- web2py sometimes adds various classes to elements by itself
- Example: the
invalidinput
class for invalid inputs, anderrorwrapper
anderror
for surrounding divs - Find them all in the source code
grep
is your friend- But note that some classes appear only in circumstances where the developer asks for them
- And some are mentioned only in examples
- E.g. the Bootstrap-specific
btn-default
class is used only if the form rendering style is set tobootstrap3_stacked
, and avoidable by setting it to anything else
AJAX ⌘
- Asynchronous JS and XML
- The underlying JS technology is
XMLHttpRequest
- jQuery contains a number of convenience wrappers
Doing AJAX with jQuery ⌘
- It still works, and is not specific to web2py
- $.ajax() is a general-purpose method, to be used when others are not suitable
- $.get() and $.post() perform GET and POST requests asynchronously
- They have various callbacks that are called when the request completes
- Don't forget to call ajaxError():
$(document).ajaxError(function() { ... })
to be notified about errors - Alternatively,
$.get(...).done(function() { ... }).fail(function() { ... })
Loading HTML fragments with AJAX ⌘
- An important use case is when an AJAX request is used for loading some HTML fragment
- The helper to use here is $.load()
$( "#result" ).load( "ajax/test.html" );
- It's a good idea to add error handling, and maybe do something else (like removing the "load..." link) on success
web2py ajax function ⌘
- Defined in
web2py.js
ajax(url, [name1, name2, ...], target)
- Passes the contents of input fields name1, name2, ... as parameters to a given URL via a POST request
- Loads the result as HTML, replaces everything inside #target with it
- For such HTML fragments, web2py uses views with the
.load
extension - Or, if the traget is ":eval", evals the returned javascript
- For such HTML fragments, web2py uses views with the
Magic attributes of the A tag ⌘
- To create links that do AJAX, use
{{=A("...")}}
with magic keyword argumentscid="asd"
will load the fragment linked by the_href
into#asd
using a GET requesttarget="asd"
will also load the fragment linked by thecomponent
into#asd
using a GET requesttarget="asd"
will also load the fragment linked by thecallback
into#asd
using a POST request, with no POST variablesdelete="#qwe" noconfirm=True
will also delete#qwe
from the DOM once the callback completes successfully
- To create links that look like buttons, add
_class="btn"
(Bootstrap-specific)
LOAD and components ⌘
- Components are self-contained pieces of an application that can be placed into a DIV
- Components can contain submittable forms
- To create a component, use views with the
.load
extension
- Load components like this:
{{=LOAD(controller, function, ajax=True, ajax_trap=True)}}
ajax_trap=True
traps all links and forms inside the component, so that they only reload the component and not the whole pageajax=False
is the default, will run that other controller and paste the resulting HTML inlineajax=False
will request that other controller using AJAX. Impliesajax_trap=True
Deployment ⌘
- Proxying
- Apache with mod_wsgi
- nginx with uwsgi
- Not covered here - evolves too quickly
Deployment via proxying ⌘
- Create the
parameters_8000.py
file with the hashed admin password - Create a systemd service file for web2py.py
- Find out the correct flags for non-interactive use
- Most likely,
python2 web2py.py --nogui -p 8000 -a '<recycle>' -e
- Web2py will run on 127.0.0.1:8000
- Set up your web server to proxy the requests
- Let it set
X-Forwarded-For
andX-Forwarded-Proto
headers- Unset them in incoming requests!
Deployment via mod_wsgi ⌘
- Create the
parameters_8000.py
file with the hashed admin password - Copy
handlers/wsgihandler.py
to the same directory asweb2py.py
- Minimal example configuration:
# On Debian/Ubuntu, use "a2enmod wsgi" instead
LoadModule wsgi_module modules/mod_wsgi.so
WSGIDaemonProcess default processes=2 threads=15 user=... group=...
WSGIScriptAlias / /path/to/web2py/wsgihandler.py
WSGIProcessGroup default
<Directory /path/to/web2py>
Require all granted
</Directory>
Possible improvements ⌘
- More restrictive grant (so that there is no chance that the password, sqlite database, or private files are served)
<Directory /path/to/web2py>
Require all denied
<Files wsgihandler.py>
Require all granted
</Files>
</Directory>
- Serve static files via Apache
# On Debian/Ubuntu use "a2enmod expires" instead
LoadModule expires_module modules/mod_expires.so
AliasMatch ^/([^/]+)/static/(?:_[\d]+.[\d]+.[\d]+/)?(.*) /path/to/web2py/applications/$1/static/$2
<Directory /path/to/web2py/applications/*/static/>
Require all granted
# ONLY if you have enabled static versioning!
ExpiresActive On
ExpiresDefault
</Directory>