sections_fts
609 rows
Link | rowid | title | content | sections_fts | rank |
---|---|---|---|---|---|
1 | 1 | Pages and API endpoints | The Datasette web application offers a number of different pages that can be accessed to explore the data in question, each of which is accompanied by an equivalent JSON API. | 7 | |
2 | 2 | Top-level index | The root page of any Datasette installation is an index page that lists all of the currently attached databases. Some examples: fivethirtyeight.datasettes.com global-power-plants.datasettes.com register-of-members-interests.datasettes.com Add /.json to the end of the URL for the JSON version of the underlying data: fivethirtyeight.datasettes.com/.json global-power-plants.datasettes.com/.json register-of-members-interests.datasettes.com/.json The index page can also be accessed at /-/ , useful if the default index page has been replaced using an index.html custom template . The /-/ page will always render the default Datasette index.html template. | 7 | |
3 | 3 | Database | Each database has a page listing the tables, views and canned queries available for that database. If the execute-sql permission is enabled (it's on by default) there will also be an interface for executing arbitrary SQL select queries against the data. Examples: fivethirtyeight.datasettes.com/fivethirtyeight global-power-plants.datasettes.com/global-power-plants The JSON version of this page provides programmatic access to the underlying data: fivethirtyeight.datasettes.com/fivethirtyeight.json global-power-plants.datasettes.com/global-power-plants.json | 7 | |
4 | 4 | Hidden tables | Some tables listed on the database page are treated as hidden. Hidden tables are not completely invisible - they can be accessed through the "hidden tables" link at the bottom of the page. They are hidden because they represent low-level implementation details which are generally not useful to end-users of Datasette. The following tables are hidden by default: Any table with a name that starts with an underscore - this is a Datasette convention to help plugins easily hide their own internal tables. Tables that have been configured as "hidden": true using Hiding tables . *_fts tables that implement SQLite full-text search indexes. Tables relating to the inner workings of the SpatiaLite SQLite extension. sqlite_stat tables used to store statistics used by the query optimizer. | 7 | |
5 | 5 | Queries | The /database-name/-/query page can be used to execute an arbitrary SQL query against that database, if the execute-sql permission is enabled. This query is passed as the ?sql= query string parameter. This means you can link directly to a query by constructing the following URL: /database-name/-/query?sql=SELECT+*+FROM+table_name Each configured canned query has its own page, at /database-name/query-name . Viewing this page will execute the query and display the results. In both cases adding a .json extension to the URL will return the results as JSON. | 7 | |
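To make the URL construction above concrete, here is a minimal sketch that builds both the HTML and JSON query URLs for an arbitrary SQL string. The base URL and the fixtures database name are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumed local instance and database name - adjust for your own setup
base = "http://127.0.0.1:8001/fixtures"
sql = "select * from facetable limit 5"

query_page_url = f"{base}/-/query?{urlencode({'sql': sql})}"       # HTML results page
query_json_url = f"{base}/-/query.json?{urlencode({'sql': sql})}"  # same results as JSON

print(query_page_url)
print(query_json_url)
```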
6 | 6 | Table | The table page is the heart of Datasette: it allows users to interactively explore the contents of a database table, including sorting, filtering, Full-text search and applying Facets . The HTML interface is worth spending some time exploring. As with other pages, you can return the JSON data by appending .json to the URL path, before any ? query string arguments. The query string arguments are described in more detail here: Table arguments You can also use the table page to interactively construct a SQL query - by applying different filters and a sort order for example - and then click the "View and edit SQL" link to see the SQL query that was used for the page and edit and re-submit it. Some examples: ../items lists all of the line-items registered by UK MPs as potential conflicts of interest. It demonstrates Datasette's support for Full-text search . ../antiquities-act%2Factions_under_antiquities_act is an interface for exploring the "actions under the antiquities act" data table published by FiveThirtyEight. ../global-power-plants?country_long=United+Kingdom&primary_fuel=Gas is a filtered table page showing every Gas power plant in the United Kingdom. It includes some default facets (configured using its metadata.json ) and uses the datasette-cluster-map plugin to show a map of the results. | 7 | |
7 | 7 | Row | Every row in every Datasette table has its own URL. This means individual records can be linked to directly. Table cells with extremely long text contents are truncated on the table view according to the truncate_cells_html setting. If a cell has been truncated the full length version of that cell will be available on the row page. Rows which are the targets of foreign key references from other tables will show a link to a filtered search for all records that reference that row. Here's an example from the Registers of Members Interests database: ../people/uk~2Eorg~2Epublicwhip~2Fperson~2F10001 Note that this URL includes the encoded primary key of the record. Here's that same page as JSON: ../people/uk~2Eorg~2Epublicwhip~2Fperson~2F10001.json | 7 | |
8 | 8 | Contributing | Datasette is an open source project. We welcome contributions! This document describes how to contribute to Datasette core. You can also contribute to the wider Datasette ecosystem by creating new Plugins . | 7 | |
9 | 9 | General guidelines | main should always be releasable . Incomplete features should live in branches. This ensures that any small bug fixes can be quickly released. The ideal commit should bundle together the implementation, unit tests and associated documentation updates. The commit message should link to an associated issue. New plugin hooks should only be shipped if accompanied by a separate release of a non-demo plugin that uses them. | 7 | |
10 | 10 | Setting up a development environment | If you have Python 3.8 or higher installed on your computer (on OS X the quickest way to do this is using homebrew ) you can install an editable copy of Datasette using the following steps. If you want to use GitHub to publish your changes, first create a fork of datasette under your own GitHub account. Now clone that repository somewhere on your computer: git clone git@github.com:YOURNAME/datasette If you want to get started without creating your own fork, you can do this instead: git clone git@github.com:simonw/datasette The next step is to create a virtual environment for your project and use it to install Datasette's dependencies: cd datasette # Create a virtual environment in ./venv python3 -m venv ./venv # Now activate the virtual environment, so pip can install into it source venv/bin/activate # Install Datasette and its testing dependencies python3 -m pip install -e '.[test]' That last line does most of the work: pip install -e means "install this package in a way that allows me to edit the source code in place". The .[test] option means "use the setup.py in this directory and install the optional testing dependencies as well". | 7 | |
11 | 11 | Running the tests | Once you have done this, you can run the Datasette unit tests from inside your datasette/ directory using pytest like so: pytest You can run the tests faster using multiple CPU cores with pytest-xdist like this: pytest -n auto -m "not serial" -n auto detects the number of available cores automatically. The -m "not serial" skips tests that don't work well in a parallel test environment. You can run those tests separately like so: pytest -m "serial" | 7 | |
12 | 12 | Using fixtures | To run Datasette itself, type datasette . You're going to need at least one SQLite database. A quick way to get started is to use the fixtures database that Datasette uses for its own tests. You can create a copy of that database by running this command: python tests/fixtures.py fixtures.db Now you can run Datasette against the new fixtures database like so: datasette fixtures.db This will start a server at http://127.0.0.1:8001/ . Any changes you make in the datasette/templates or datasette/static folder will be picked up immediately (though you may need to do a force-refresh in your browser to see changes to CSS or JavaScript). If you want to change Datasette's Python code you can use the --reload option to cause Datasette to automatically reload any time the underlying code changes: datasette --reload fixtures.db You can also use the fixtures.py script to recreate the testing version of metadata.json used by the unit tests. To do that: python tests/fixtures.py fixtures.db fixtures-metadata.json Or to output the plugins used by the tests, run this: python tests/fixtures.py fixtures.db fixtures-metadata.json fixtures-plugins Test tables written to fixtures.db - metadata written to fixtures-metadata.json Wrote plugin: fixtures-plugins/register_output_renderer.py Wrote plugin: fixtures-plugins/view_name.py Wrote plugin: fixtures-plugins/my_plugin.py Wrote plugin: fixtures-plugins/messages_output_renderer.py Wrote plugin: fixtures-plugins/my_plugin_2.py Then run Datasette like this: datasette fixtures.db -m fixtures-metadata.json --plugins-dir=fixtures-plugins/ | 7 | |
13 | 13 | Debugging | Any errors that occur while Datasette is running will display a stack trace on the console. You can tell Datasette to open an interactive pdb (or ipdb , if present) debugger session if an error occurs using the --pdb option: datasette --pdb fixtures.db For ipdb , first run this: datasette install ipdb | 7 | |
14 | 14 | Code formatting | Datasette uses opinionated code formatters: Black for Python and Prettier for JavaScript. These formatters are enforced by Datasette's continuous integration: if a commit includes Python or JavaScript code that does not match the style enforced by those tools, the tests will fail. When developing locally, you can verify and correct the formatting of your code using these tools. | 7 | |
15 | 15 | Running Black | Black will be installed when you run pip install -e '.[test]' . To test that your code complies with Black, run the following in your root datasette repository checkout: black . --check All done! ✨ 🍰 ✨ 95 files would be left unchanged. If any of your code does not conform to Black you can run this to automatically fix those problems: black . reformatted ../datasette/setup.py All done! ✨ 🍰 ✨ 1 file reformatted, 94 files left unchanged. | 7 | |
16 | 16 | blacken-docs | The blacken-docs command applies Black formatting rules to code examples in the documentation. Run it like this: blacken-docs -l 60 docs/*.rst | 7 | |
17 | 17 | Prettier | To install Prettier, install Node.js and then run the following in the root of your datasette repository checkout: npm install This will install Prettier in a node_modules directory. You can then check that your code matches the coding style like so: npm run prettier -- --check > prettier > prettier 'datasette/static/*[!.min].js' "--check" Checking formatting... [warn] datasette/static/plugins.js [warn] Code style issues found in the above file(s). Forgot to run Prettier? You can fix any problems by running: npm run fix | 7 | |
18 | 18 | Editing and building the documentation | Datasette's documentation lives in the docs/ directory and is deployed automatically using Read The Docs . The documentation is written using reStructuredText. You may find this article on The subset of reStructuredText worth committing to memory useful. You can build it locally by installing sphinx and sphinx_rtd_theme in your Datasette development environment and then running make html directly in the docs/ directory: # You may first need to activate your virtual environment: source venv/bin/activate # Install the dependencies needed to build the docs pip install -e .[docs] # Now build the docs cd docs/ make html This will create the HTML version of the documentation in docs/_build/html . You can open it in your browser like so: open _build/html/index.html Any time you make changes to a .rst file you can re-run make html to update the built documents, then refresh them in your browser. For added productivity, you can use sphinx-autobuild to run Sphinx in auto-build mode. This will run a local webserver serving the docs that automatically rebuilds them and refreshes the page any time you hit save in your editor. sphinx-autobuild will have been installed when you ran pip install -e .[docs] . In your docs/ directory you can start the server by running the following: make livehtml Now browse to http://localhost:8000/ to view the documentation. Any edits you make should be instantly reflected in your browser. | 7 | |
19 | 19 | Running Cog | Some pages of documentation (in particular the CLI reference ) are automatically updated using Cog . To update these pages, run the following command: cog -r docs/*.rst | 7 | |
20 | 20 | Continuously deployed demo instances | The demo instance at latest.datasette.io is re-deployed automatically to Google Cloud Run for every push to main that passes the test suite. This is implemented by the GitHub Actions workflow at .github/workflows/deploy-latest.yml . Specific branches can also be set to automatically deploy by adding them to the on: push: branches block at the top of the workflow YAML file. Branches configured in this way will be deployed to a new Cloud Run service whether or not their tests pass. The Cloud Run URL for a branch demo can be found in the GitHub Actions logs. | 7 | |
21 | 21 | Release process | Datasette releases are performed using tags. When a new release is published on GitHub, a GitHub Action workflow will perform the following: Run the unit tests against all supported Python versions. If the tests pass... Build a Docker image of the release and push a tag to https://hub.docker.com/r/datasetteproject/datasette Re-point the "latest" tag on Docker Hub to the new image Build a wheel bundle of the underlying Python source code Push that new wheel up to PyPI: https://pypi.org/project/datasette/ If the release is an alpha, navigate to https://readthedocs.org/projects/datasette/versions/ and search for the tag name in the "Activate a version" filter, then mark that version as "active" to ensure it will appear on the public ReadTheDocs documentation site. To deploy new releases you will need to have push access to the main Datasette GitHub repository. Datasette follows Semantic Versioning : major.minor.patch We increment major for backwards-incompatible releases. Datasette is currently pre-1.0 so the major version is always 0 . We increment minor for new features. We increment patch for bugfix releases. Alpha and beta releases may have an additional a0 or b0 suffix - the integer component will be incremented with each subsequent alpha or beta. To release a new version, first create a commit that updates the version number in datasette/version.py and the changelog with highlights of the new version. An example commit can be seen here : # Update changelog git commit -m " Release 0.51a1 Ref… | 7 | |
22 | 22 | Alpha and beta releases | Alpha and beta releases are published to preview upcoming features that may not yet be stable - in particular to preview new plugin hooks. You are welcome to try these out, but please be aware that details may change before the final release. Please join discussions on the issue tracker to share your thoughts and experiences with alpha and beta features that you try out. | 7 | |
23 | 23 | Releasing bug fixes from a branch | If it's necessary to publish a bug fix release without shipping new features that have landed on main a release branch can be used. Create it from the relevant last tagged release like so: git branch 0.52.x 0.52.4 git checkout 0.52.x Next cherry-pick the commits containing the bug fixes: git cherry-pick COMMIT Write the release notes in the branch, and update the version number in version.py . Then push the branch: git push -u origin 0.52.x Once the tests have completed, publish the release from that branch target using the GitHub Draft a new release form. Finally, cherry-pick the commit with the release notes and version number bump across to main : git checkout main git cherry-pick COMMIT git push | 7 | |
24 | 24 | Upgrading CodeMirror | Datasette bundles CodeMirror for the SQL editing interface, e.g. on this page . Here are the steps for upgrading to a new version of CodeMirror: Install the packages with: npm i codemirror @codemirror/lang-sql Build the bundle using the version number from package.json with: node_modules/.bin/rollup datasette/static/cm-editor-6.0.1.js \ -f iife \ -n cm \ -o datasette/static/cm-editor-6.0.1.bundle.js \ -p @rollup/plugin-node-resolve \ -p @rollup/plugin-terser Update the version reference in the codemirror.html template. | 7 | |
25 | 25 | Internals for plugins | Many Plugin hooks are passed objects that provide access to internal Datasette functionality. The interface to these objects should not be considered stable with the exception of methods that are documented here. | 7 | |
26 | 26 | Request object | The request object is passed to various plugin hooks. It represents an incoming HTTP request. It has the following properties: .scope - dictionary The ASGI scope that was used to construct this request, described in the ASGI HTTP connection scope specification. .method - string The HTTP method for this request, usually GET or POST . .url - string The full URL for this request, e.g. https://latest.datasette.io/fixtures . .scheme - string The request scheme - usually https or http . .headers - dictionary (str -> str) A dictionary of incoming HTTP request headers. Header names have been converted to lowercase. .cookies - dictionary (str -> str) A dictionary of incoming cookies .host - string The host header from the incoming request, e.g. latest.datasette.io or localhost . .path - string The path of the request excluding the query string, e.g. /fixtures . .full_path - string The path of the… | 7 | |
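As a quick illustration, here is a minimal sketch of a view registered via the register_routes() plugin hook that echoes back a few of the documented request properties. The route path is a hypothetical choice for this example.

```python
from datasette import hookimpl
from datasette.utils.asgi import Response


async def debug_request(request):
    # Echo a few of the documented request properties as JSON
    return Response.json(
        {
            "method": request.method,    # e.g. "GET"
            "path": request.path,        # path without the query string
            "host": request.host,        # from the incoming Host header
            "scheme": request.scheme,    # "http" or "https"
            "headers": request.headers,  # lowercase header names
        }
    )


@hookimpl
def register_routes():
    # Hypothetical route for this example
    return [(r"^/-/request-debug$", debug_request)]
```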
27 | 27 | The MultiParams class | request.args is a MultiParams object - a dictionary-like object which provides access to query string parameters that may have multiple values. Consider the query string ?foo=1&foo=2&bar=3 - with two values for foo and one value for bar . request.args[key] - string Returns the first value for that key, or raises a KeyError if the key is missing. For the above example request.args["foo"] would return "1" . request.args.get(key) - string or None Returns the first value for that key, or None if the key is missing. Pass a second argument to specify a different default, e.g. q = request.args.get("q", "") . request.args.getlist(key) - list of strings Returns the list of strings for that key. request.args.getlist("foo") would return ["1", "2"] in the above example. request.args.getlist("bar") would return ["3"] . If the key is missing an empty list will be returned. request.args.keys() - list of strings Returns the list of available keys - for the example this would be ["foo", "bar"] . key in request.args - True or False You can use if key in request.args to check if a key is present. for key in request.args - iterator This lets you loop through every available key. le… | 7 | |
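A minimal sketch of those access patterns, assuming a request object in scope whose query string is ?foo=1&foo=2&bar=3:

```python
first_foo = request.args["foo"]         # "1" - raises KeyError if missing
q = request.args.get("q", "")           # default value when the key is absent
foos = request.args.getlist("foo")      # ["1", "2"]
missing = request.args.getlist("nope")  # [] - missing keys return an empty list

if "bar" in request.args:
    for key in request.args:            # iterates "foo", "bar"
        print(key, request.args.getlist(key))
```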
28 | 28 | Response class | The Response class can be returned from view functions that have been registered using the register_routes(datasette) hook. The Response() constructor takes the following arguments: body - string The body of the response. status - integer (optional) The HTTP status - defaults to 200. headers - dictionary (optional) A dictionary of extra HTTP headers, e.g. {"x-hello": "world"} . content_type - string (optional) The content-type for the response. Defaults to text/plain . For example: from datasette.utils.asgi import Response response = Response( "<xml>This is XML</xml>", content_type="application/xml; charset=utf-8", ) The quickest way to create responses is using the Response.text(...) , Response.html(...) , Response.json(...) or Response.redirect(...) helper methods: from datasette.utils.asgi import Response html_response = Response.html("This is HTML") json_response = Response.json({"this_is": "json"}) text_response = Response.text( "This will become utf-8 encoded text" ) # Redirects are served as 302, unless you pass status=301: redirect_response = Response.redirect( "https://latest.datasette.io/" ) Each of these responses will use the correct corresponding content-type - text/html; charset=utf-8 , application/json; charset=utf-8 or text/plain; charset=utf-8 respectively. Each of the helper methods take optional status= and headers= argument… | 7 | |
29 | 29 | Returning a response with .asgi_send(send) | In most cases you will return Response objects from your own view functions. You can also use a Response instance to respond at a lower level via ASGI, for example if you are writing code that uses the asgi_wrapper(datasette) hook. Create a Response object and then use await response.asgi_send(send) , passing the ASGI send function. For example: async def require_authorization(scope, receive, send): response = Response.text( "401 Authorization Required", headers={ "www-authenticate": 'Basic realm="Datasette", charset="UTF-8"' }, status=401, ) await response.asgi_send(send) | 7 | |
30 | 30 | Setting cookies with response.set_cookie() | To set cookies on the response, use the response.set_cookie(...) method. The method signature looks like this: def set_cookie( self, key, value="", max_age=None, expires=None, path="/", domain=None, secure=False, httponly=False, samesite="lax", ): ... You can use this with datasette.sign() to set signed cookies. Here's how you would set the ds_actor cookie for use with Datasette authentication : response = Response.redirect("/") response.set_cookie( "ds_actor", datasette.sign({"a": {"id": "cleopaws"}}, "actor"), ) return response | 7 | |
31 | 31 | Datasette class | This object is an instance of the Datasette class, passed to many plugin hooks as an argument called datasette . You can create your own instance of this - for example to help write tests for a plugin - like so: from datasette.app import Datasette # With no arguments a single in-memory database will be attached datasette = Datasette() # The files= argument can load files from disk datasette = Datasette(files=["/path/to/my-database.db"]) # Pass metadata as a JSON dictionary like this datasette = Datasette( files=["/path/to/my-database.db"], metadata={ "databases": { "my-database": { "description": "This is my database" } } }, ) Constructor parameters include: files=[...] - a list of database files to open immutables=[...] - a list of database files to open in immutable mode metadata={...} - a dictionary of Metadata config_dir=... - the configuration directory to use, stored in datasette.config_dir | 7 | |
32 | 32 | .databases | Property exposing a collections.OrderedDict of databases currently connected to Datasette. The dictionary keys are the name of the database that is used in the URL - e.g. /fixtures would have a key of "fixtures" . The values are Database class instances. All databases are listed, irrespective of user permissions. | 7 | |
33 | 33 | .permissions | Property exposing a dictionary of permissions that have been registered using the register_permissions(datasette) plugin hook. The dictionary keys are the permission names - e.g. view-instance - and the values are Permission() objects describing the permission. Here is a description of that object . | 7 | |
34 | 34 | .plugin_config(plugin_name, database=None, table=None) | plugin_name - string The name of the plugin to look up configuration for. Usually this is something similar to datasette-cluster-map . database - None or string The database the user is interacting with. table - None or string The table the user is interacting with. This method lets you read plugin configuration values that were set in datasette.yaml . See Writing plugins that accept configuration for full details of how this method should be used. The return value will be the value from the configuration file - usually a dictionary. If the plugin is not configured the return value will be None . | 7 | |
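Here is a minimal sketch of reading that configuration from inside a plugin hook. The plugin name datasette-my-plugin and the setting key are assumptions for illustration.

```python
from datasette import hookimpl


@hookimpl
def extra_template_vars(datasette, database, table):
    # plugin_config() returns None when the plugin is not configured, so fall back to {}
    config = (
        datasette.plugin_config(
            "datasette-my-plugin", database=database, table=table
        )
        or {}
    )
    return {"my_plugin_colour": config.get("colour", "blue")}
```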
35 | 35 | await .render_template(template, context=None, request=None) | template - string, list of strings or jinja2.Template The template file to be rendered, e.g. my_plugin.html . Datasette will search for this file first in the --template-dir= location, if it was specified - then in the plugin's bundled templates and finally in Datasette's set of default templates. If this is a list of template file names then the first one that exists will be loaded and rendered. If this is a Jinja Template object it will be used directly. context - None or a Python dictionary The context variables to pass to the template. request - request object or None If you pass a Datasette request object here it will be made available to the template. Renders a Jinja template using Datasette's preconfigured instance of Jinja and returns the resulting string. The template will have access to Datasette's default template functions and any functions that have been made available by other plugins. | 7 | |
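A minimal sketch of rendering a plugin template from a custom view; my_plugin.html and the context variables are assumed names.

```python
from datasette.utils.asgi import Response


async def my_page(datasette, request):
    html = await datasette.render_template(
        "my_plugin.html",                     # assumed template filename
        {"greeting": "Hello from a plugin"},  # context variables
        request=request,                      # makes request available to the template
    )
    return Response.html(html)
```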
36 | 36 | await .actors_from_ids(actor_ids) | actor_ids - list of strings or integers A list of actor IDs to look up. Returns a dictionary, where the keys are the IDs passed to it and the values are the corresponding actor dictionaries. This method is mainly designed to be used with plugins. See the actors_from_ids(datasette, actor_ids) documentation for details. If no plugins that implement that hook are installed, the default return value looks like this: { "1": {"id": "1"}, "2": {"id": "2"} } | 7 | |
37 | 37 | await .permission_allowed(actor, action, resource=None, default=...) | actor - dictionary The authenticated actor. This is usually request.actor . action - string The name of the action that is being permission checked. resource - string or tuple, optional The resource, e.g. the name of the database, or a tuple of two strings containing the name of the database and the name of the table. Only some permissions apply to a resource. default - optional: True, False or None What value should be returned by default if nothing provides an opinion on this permission check. Set to True for default allow or False for default deny. If not specified the default from the Permission() tuple that was registered using register_permissions(datasette) will be used. Check if the given actor has permission to perform the given action on the given resource. Some permission checks are carried out against rules defined in datasette.yaml , while other custom permissions may be decided by plugins that implement the permission_allowed(datasette, actor, action, resource) plugin hook. If neither metadata.json nor any of the plugins provide an answer to the permission query the default argument will be returned. See Built-in permissions for a full list of permission actions included in Datasette core. | 7 | |
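A minimal sketch of a permission check inside a custom view, using an assumed (database, table) pair:

```python
from datasette.utils.asgi import Response


async def maybe_show_table(datasette, request):
    allowed = await datasette.permission_allowed(
        request.actor,
        "view-table",
        resource=("fixtures", "facetable"),  # assumed (database, table) names
        default=True,
    )
    if not allowed:
        return Response.text("Permission denied", status=403)
    return Response.text("You can see this table")
```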
38 | 38 | await .ensure_permissions(actor, permissions) | actor - dictionary The authenticated actor. This is usually request.actor . permissions - list A list of permissions to check. Each permission in that list can be a string action name or a 2-tuple of (action, resource) . This method allows multiple permissions to be checked at once. It raises a datasette.Forbidden exception if any of the checks are denied before one of them is explicitly granted. This is useful when you need to check multiple permissions at once. For example, an actor should be able to view a table if either one of the following checks returns True or not a single one of them returns False : await datasette.ensure_permissions( request.actor, [ ("view-table", (database, table)), ("view-database", database), "view-instance", ], ) | 7 | |
39 | 39 | await .check_visibility(actor, action=None, resource=None, permissions=None) | actor - dictionary The authenticated actor. This is usually request.actor . action - string, optional The name of the action that is being permission checked. resource - string or tuple, optional The resource, e.g. the name of the database, or a tuple of two strings containing the name of the database and the name of the table. Only some permissions apply to a resource. permissions - list of action strings or (action, resource) tuples, optional Provide this instead of action and resource to check multiple permissions at once. This convenience method can be used to answer the question "should this item be considered private, in that it is visible to me but it is not visible to anonymous users?" It returns a tuple of two booleans, (visible, private) . visible indicates if the actor can see this resource. private will be True if an anonymous user would not be able to view the resource. This example checks if the user can access a specific table, and sets private so that a padlock icon can later be displayed: visible, private = await datasette.check_visibility( request.actor, action="view-table", resource=(database, table), ) The following example runs three checks in a row, similar to await .ensure_permissions(actor, permissions) . If any of the checks are denied before one of them is explicitly granted then visible will be … | 7 | |
40 | 40 | .create_token(actor_id, expires_after=None, restrict_all=None, restrict_database=None, restrict_resource=None) | actor_id - string The ID of the actor to create a token for. expires_after - int, optional The number of seconds after which the token should expire. restrict_all - iterable, optional A list of actions that this token should be restricted to across all databases and resources. restrict_database - dict, optional For restricting actions within specific databases, e.g. {"mydb": ["view-table", "view-query"]} . restrict_resource - dict, optional For restricting actions to specific resources (tables, SQL views and Canned queries ) within a database. For example: {"mydb": {"mytable": ["insert-row", "update-row"]}} . This method returns a signed API token of the format dstok_... which can be used to authenticate requests to the Datasette API. All tokens must have an actor_id string indicating the ID of the actor which the token will act on behalf of. Tokens default to lasting forever, but can be set to expire after a given number of seconds using the expires_after argument. The following code creates a token for user1 that will expire after an hour: token = datasette.create_token( actor_id="user1", expires_after=3600, ) The three restrict_* arguments can be used to create a token that has … | 7 | |
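Building on the restrict_* formats described above, here is a minimal sketch of a restricted token, assuming datasette is a Datasette instance in scope; the database and table names are hypothetical.

```python
token = datasette.create_token(
    actor_id="user1",
    expires_after=3600,  # one hour
    # Limit the token to read actions within the "mydb" database only
    restrict_database={"mydb": ["view-table", "view-query"]},
    # And allow write actions against a single table
    restrict_resource={"mydb": {"mytable": ["insert-row", "update-row"]}},
)
```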
41 | 41 | .get_permission(name_or_abbr) | name_or_abbr - string The name or abbreviation of the permission to look up, e.g. view-table or vt . Returns a Permission object representing the permission, or raises a KeyError if one is not found. | 7 | |
42 | 42 | .get_database(name) | name - string, optional The name of the database - optional. Returns the specified database object. Raises a KeyError if the database does not exist. Call this method without an argument to return the first connected database. | 7 | |
43 | 43 | .get_internal_database() | Returns a database object for reading and writing to the private internal database . | 7 | |
44 | 44 | Getting and setting metadata | Metadata about the instance, databases, tables and columns is stored in tables in Datasette's internal database . The following methods are the supported API for plugins to read and update that stored metadata. | 7 | |
45 | 45 | await .get_instance_metadata(self) | Returns metadata keys and values for the entire Datasette instance as a dictionary. Internally queries the metadata_instance table inside the internal database . | 7 | |
46 | 46 | await .get_database_metadata(self, database_name) | database_name - string The name of the database to query. Returns metadata keys and values for the specified database as a dictionary. Internally queries the metadata_databases table inside the internal database . | 7 | |
47 | 47 | await .get_resource_metadata(self, database_name, resource_name) | database_name - string The name of the database to query. resource_name - string The name of the resource (table, view, or canned query) inside database_name to query. Returns metadata keys and values for the specified "resource" as a dictionary. A "resource" in this context can be a table, view, or canned query. Internally queries the metadata_resources table inside the internal database . | 7 | |
48 | 48 | await .get_column_metadata(self, database_name, resource_name, column_name) | database_name - string The name of the database to query. resource_name - string The name of the resource (table, view, or canned query) inside database_name to query. column_name - string The name of the column inside resource_name to query. Returns metadata keys and values for the specified column, resource, and table as a dictionary. Internally queries the metadata_columns table inside the internal database . | 7 | |
49 | 49 | await .set_instance_metadata(self, key, value) | key - string The metadata entry key to insert (e.g. title , description , etc.) value - string The value of the metadata entry to insert. Adds a new metadata entry for the entire Datasette instance. Any previous instance-level metadata entry with the same key will be overwritten. Internally upserts the value into the metadata_instance table inside the internal database . | 7 | |
50 | 50 | await .set_database_metadata(self, database_name, key, value) | database_name - string The database the metadata entry belongs to. key - string The metadata entry key to insert (e.g. title , description , etc.) value - string The value of the metadata entry to insert. Adds a new metadata entry for the specified database. Any previous database-level metadata entry with the same key will be overwritten. Internally upserts the value into the metadata_databases table inside the internal database . | 7 | |
51 | 51 | await .set_resource_metadata(self, database_name, resource_name, key, value) | database_name - string The database the metadata entry belongs to. resource_name - string The resource (table, view, or canned query) the metadata entry belongs to. key - string The metadata entry key to insert (e.g. title , description , etc.) value - string The value of the metadata entry to insert. Adds a new metadata entry for the specified "resource". Any previous resource-level metadata entry with the same key will be overwritten. Internally upserts the value into the metadata_resources table inside the internal database . | 7 | |
52 | 52 | await .set_column_metadata(self, database_name, resource_name, column_name, key, value) | database_name - string The database the metadata entry belongs to. resource_name - string The resource (table, view, or canned query) the metadata entry belongs to. column_name - string The column the metadata entry belongs to. key - string The metadata entry key to insert (e.g. title , description , etc.) value - string The value of the metadata entry to insert. Adds a new metadata entry for the specified column. Any previous column-level metadata entry with the same key will be overwritten. Internally upserts the value into the metadata_columns table inside the internal database . | 7 | |
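Taken together, a minimal sketch of the metadata get/set methods might look like this, assuming datasette is a Datasette instance and using hypothetical database and table names:

```python
async def annotate(datasette):
    await datasette.set_instance_metadata("title", "My Datasette")
    await datasette.set_resource_metadata(
        "mydb", "mytable", "description", "A table of interesting things"
    )

    instance_meta = await datasette.get_instance_metadata()
    table_meta = await datasette.get_resource_metadata("mydb", "mytable")
    return instance_meta.get("title"), table_meta.get("description")
```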
53 | 53 | .add_database(db, name=None, route=None) | db - datasette.database.Database instance The database to be attached. name - string, optional The name to be used for this database . If not specified Datasette will pick one based on the filename or memory name. route - string, optional This will be used in the URL path. If not specified, it will default to the same thing as the name . The datasette.add_database(db) method lets you add a new database to the current Datasette instance. The db parameter should be an instance of the datasette.database.Database class. For example: from datasette.database import Database datasette.add_database( Database( datasette, path="path/to/my-new-database.db", ) ) This will add a mutable database and serve it at /my-new-database . Use is_mutable=False to add an immutable database. .add_database() returns the Database instance, with its name set as the database.name attribute. Any time you are working with a newly added database you should use the return value of .add_database() , for example: db = datasette.add_database( Database(datasette, memory_name="statistics") ) await db.execute_write( "CREATE TABLE foo(id integer primary key)" ) | 7 | |
54 | 54 | .add_memory_database(name) | Adds a shared in-memory database with the specified name: datasette.add_memory_database("statistics") This is a shortcut for the following: from datasette.database import Database datasette.add_database( Database(datasette, memory_name="statistics") ) Using either of these patterns will result in the in-memory database being served at /statistics . | 7 | |
55 | 55 | .remove_database(name) | name - string The name of the database to be removed. This removes a database that has been previously added. name= is the unique name of that database. | 7 | |
56 | 56 | await .track_event(event) | event - Event An instance of a subclass of datasette.events.Event . Plugins can call this to track events, using classes they have previously registered. See Event tracking for details. The event will then be passed to all plugins that have registered to receive events using the track_event(datasette, event) hook. Example usage, assuming the plugin has previously registered the BanUserEvent class: await datasette.track_event( BanUserEvent(user={"id": 1, "username": "cleverbot"}) ) | 7 | |
57 | 57 | .sign(value, namespace="default") | value - any serializable type The value to be signed. namespace - string, optional An alternative namespace, see the itsdangerous salt documentation . Utility method for signing values, such that you can safely pass data to and from an untrusted environment. This is a wrapper around the itsdangerous library. This method returns a signed string, which can be decoded and verified using .unsign(value, namespace="default") . | 7 | |
58 | 58 | .unsign(value, namespace="default") | signed - any serializable type The signed string that was created using .sign(value, namespace="default") . namespace - string, optional The alternative namespace, if one was used. Returns the original, decoded object that was passed to .sign(value, namespace="default") . If the signature is not valid this raises a itsdangerous.BadSignature exception. | 7 | |
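A minimal sketch of a sign/unsign round trip, assuming datasette is a Datasette instance and using a hypothetical namespace:

```python
from itsdangerous import BadSignature

signed = datasette.sign({"user_id": 42}, namespace="my-plugin")

try:
    value = datasette.unsign(signed, namespace="my-plugin")  # {"user_id": 42}
except BadSignature:
    value = None  # signature was tampered with, or the wrong namespace was used
```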
59 | 59 | .add_message(request, message, type=datasette.INFO) | request - Request The current Request object message - string The message string type - constant, optional The message type - datasette.INFO , datasette.WARNING or datasette.ERROR Datasette's flash messaging mechanism allows you to add a message that will be displayed to the user on the next page that they visit. Messages are persisted in a ds_messages cookie. This method adds a message to that cookie. You can try out these messages (including the different visual styling of the three message types) using the /-/messages debugging tool. | 7 | |
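A minimal sketch of adding a flash message from a hypothetical custom view before redirecting:

```python
from datasette.utils.asgi import Response


async def save_settings(datasette, request):
    # ... perform whatever action this view is responsible for ...
    datasette.add_message(request, "Settings saved", datasette.INFO)
    return Response.redirect("/")
```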
60 | 60 | .absolute_url(request, path) | request - Request The current Request object path - string A path, for example /dbname/table.json Returns the absolute URL for the given path, including the protocol and host. For example: absolute_url = datasette.absolute_url( request, "/dbname/table.json" ) # Would return "http://localhost:8001/dbname/table.json" The current request object is used to determine the hostname and protocol that should be used for the returned URL. The force_https_urls configuration setting is taken into account. | 7 | |
61 | 61 | .setting(key) | key - string The name of the setting, e.g. base_url . Returns the configured value for the specified setting . This can be a string, boolean or integer depending on the requested setting. For example: downloads_are_allowed = datasette.setting("allow_download") | 7 | |
62 | 62 | .resolve_database(request) | request - Request object A request object If you are implementing your own custom views, you may need to resolve the database that the user is requesting based on a URL path. If the regular expression for your route declares a database named group, you can use this method to resolve the database object. This returns a Database instance. If the database cannot be found, it raises a datasette.utils.asgi.DatabaseNotFound exception - which is a subclass of datasette.utils.asgi.NotFound with a .database_name attribute set to the name of the database that was requested. | 7 | |
63 | 63 | .resolve_table(request) | request - Request object A request object This assumes that the regular expression for your route declares both a database and a table named group. It returns a ResolvedTable named tuple instance with the following fields: db - Database The database object table - string The name of the table (or view) is_view - boolean True if this is a view, False if it is a table If the database or table cannot be found it raises a datasette.utils.asgi.DatabaseNotFound exception. If the table does not exist it raises a datasette.utils.asgi.TableNotFound exception - a subclass of datasette.utils.asgi.NotFound with .database_name and .table attributes. | 7 | |
64 | 64 | .resolve_row(request) | request - Request object A request object This method assumes your route declares named groups for database , table and pks . It returns a ResolvedRow named tuple instance with the following fields: db - Database The database object table - string The name of the table sql - string SQL snippet that can be used in a WHERE clause to select the row params - dict Parameters that should be passed to the SQL query pks - list List of primary key column names pk_values - list List of primary key values decoded from the URL row - sqlite3.Row The row itself If the database or table cannot be found it raises a datasette.utils.asgi.DatabaseNotFound exception. If the table does not exist it raises a datasette.utils.asgi.TableNotFound … | 7 | |
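A minimal sketch of a custom route whose view uses .resolve_table(). The route regex is an assumption for this example, and the method is awaited here on the assumption that it behaves like the other async Datasette internals.

```python
from datasette import hookimpl
from datasette.utils.asgi import Response


async def table_info(datasette, request):
    resolved = await datasette.resolve_table(request)
    return Response.json(
        {
            "database": resolved.db.name,
            "table": resolved.table,
            "is_view": resolved.is_view,
        }
    )


@hookimpl
def register_routes():
    # Named groups "database" and "table" are what .resolve_table() expects
    return [
        (r"^/-/table-info/(?P<database>[^/]+)/(?P<table>[^/]+)$", table_info),
    ]
```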
65 | 65 | datasette.client | Plugins can make internal simulated HTTP requests to the Datasette instance within which they are running. This ensures that all of Datasette's external JSON APIs are also available to plugins, while avoiding the overhead of making an external HTTP call to access those APIs. The datasette.client object is a wrapper around the HTTPX Python library , providing an async-friendly API that is similar to the widely used Requests library . It offers the following methods: await datasette.client.get(path, **kwargs) - returns HTTPX Response Execute an internal GET request against that path. await datasette.client.post(path, **kwargs) - returns HTTPX Response Execute an internal POST request. Use data={"name": "value"} to pass form parameters. await datasette.client.options(path, **kwargs) - returns HTTPX Response Execute an internal OPTIONS request. await datasette.client.head(path, **kwargs) - returns HTTPX Response Execute an internal HEAD request. await datasette.client.put(path, **kwargs) - returns HTTPX Response Execute an internal PUT request. await datasette.client.patch(path, **kwargs) - returns HTTPX Response … | 7 | |
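A minimal sketch of an internal API call from plugin code; /-/versions.json is one of Datasette's built-in introspection endpoints.

```python
async def get_datasette_versions(datasette):
    response = await datasette.client.get("/-/versions.json")
    response.raise_for_status()  # standard HTTPX Response API
    return response.json()
```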
66 | 66 | datasette.urls | The datasette.urls object contains methods for building URLs to pages within Datasette. Plugins should use this to link to pages, since these methods take into account any base_url configuration setting that might be in effect. datasette.urls.instance(format=None) Returns the URL to the Datasette instance root page. This is usually "/" . datasette.urls.path(path, format=None) Takes a path and returns the full path, taking base_url into account. For example, datasette.urls.path("-/logout") will return the path to the logout page, which will be "/-/logout" by default or /prefix-path/-/logout if base_url is set to /prefix-path/ datasette.urls.logout() Returns the URL to the logout page, usually "/-/logout" datasette.urls.static(path) Returns the URL of one of Datasette's default static assets, for example "/-/static/app.css" datasette.urls.static_plugins(plugin_name, path) Returns the URL of one of the static assets belonging to a plugin. datasette.urls.static_plugins("datasette_cluster_map", "datasette-cluster-map.js") would return "/-/static-plugins/datasette_cluster_map/datasette-cluster-map.js" datasette.urls.static(path) … | 7 | |
67 | 67 | Database class | Instances of the Database class can be used to execute queries against attached SQLite databases, and to run introspection against their schemas. | 7 | |
68 | 68 | Database(ds, path=None, is_mutable=True, is_memory=False, memory_name=None) | The Database() constructor can be used by plugins, in conjunction with .add_database(db, name=None, route=None) , to create and register new databases. The arguments are as follows: ds - Datasette class (required) The Datasette instance you are attaching this database to. path - string Path to a SQLite database file on disk. is_mutable - boolean Set this to False to cause Datasette to open the file in immutable mode. is_memory - boolean Use this to create non-shared memory connections. memory_name - string or None Use this to create a named in-memory database. Unlike regular memory databases these can be accessed by multiple threads and will persist any changes made to them for the lifetime of the Datasette server process. The first argument is the datasette instance you are attaching to, the second is a path= , then is_mutable and is_memory are both optional arguments. | 7 | |
69 | 69 | db.hash | If the database was opened in immutable mode, this property returns the 64 character SHA-256 hash of the database contents as a string. Otherwise it returns None . | 7 | |
70 | 70 | await db.execute(sql, ...) | Executes a SQL query against the database and returns the resulting rows (see Results ). sql - string (required) The SQL query to execute. This can include ? or :named parameters. params - list or dict A list or dictionary of values to use for the parameters. List for ? , dictionary for :named . truncate - boolean Should the rows returned by the query be truncated at the maximum page size? Defaults to True , set this to False to disable truncation. custom_time_limit - integer ms A custom time limit for this query. This can be set to a lower value than the Datasette configured default. If a query takes longer than this it will be terminated early and raise a datasette.database.QueryInterrupted exception. page_size - integer Set a custom page size for truncation, overriding the configured Datasette default. log_sql_errors - boolean Should any SQL errors be logged to the console in addition to being raised as an error? Defaults to True . | 7 | |
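A minimal sketch of a parameterized read query, assuming db is a Database instance (for example from datasette.get_database()) and a hypothetical table name:

```python
async def search_names(db, q):
    results = await db.execute(
        "select id, name from my_table where name like :q limit 10",
        {"q": f"%{q}%"},
    )
    return [dict(row) for row in results.rows]
```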
71 | 71 | Results | The db.execute() method returns a single Results object. This can be used to access the rows returned by the query. Iterating over a Results object will yield SQLite Row objects . Each of these can be treated as a tuple or can be accessed using row["column"] syntax: info = [] results = await db.execute("select name from sqlite_master") for row in results: info.append(row["name"]) The Results object also has the following properties and methods: .truncated - boolean Indicates if this query was truncated - if it returned more results than the specified page_size . If this is true then the results object will only provide access to the first page_size rows in the query result. You can disable truncation by passing truncate=False to the db.execute() method. .columns - list of strings A list of column names returned by the query. .rows - list of sqlite3.Row This property provides direct access to the list of rows returned by the database. You can access specific rows by index using results.rows[0] . .dicts() - list of dict This method returns a list of Python dictionaries, one for each row. .first() - row or None Returns the first row in the results, or None if no rows were returned. … | 7 | |
72 | 72 | await db.execute_fn(fn) | Executes a given callback function against a read-only database connection running in a thread. The function will be passed a SQLite connection, and the return value from the function will be returned by the await . Example usage: def get_version(conn): return conn.execute( "select sqlite_version()" ).fetchall()[0][0] version = await db.execute_fn(get_version) | 7 | |
73 | 73 | await db.execute_write(sql, params=None, block=True) | SQLite only allows one database connection to write at a time. Datasette handles this for you by maintaining a queue of writes to be executed against a given database. Plugins can submit write operations to this queue and they will be executed in the order in which they are received. This method can be used to queue up a non-SELECT SQL query to be executed against a single write connection to the database. You can pass additional SQL parameters as a tuple or dictionary. The method will block until the operation is completed, and the return value will be the return from calling conn.execute(...) using the underlying sqlite3 Python library. If you pass block=False this behavior changes to "fire and forget" - queries will be added to the write queue and executed in a separate thread while your code can continue to do other things. The method will return a UUID representing the queued task. Each call to execute_write() will be executed inside a transaction. | 7 | |
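A minimal sketch of queueing writes, assuming db is a Database instance and using a hypothetical table:

```python
async def record_visit(db, path):
    # Hypothetical table; created on demand, then the row is inserted.
    # Both calls block until the write connection has executed them.
    await db.execute_write(
        "create table if not exists visits (path text, visited_at text)"
    )
    await db.execute_write(
        "insert into visits (path, visited_at) values (?, datetime('now'))",
        (path,),
    )
```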
74 | 74 | await db.execute_write_script(sql, block=True) | Like execute_write() but can be used to send multiple SQL statements in a single string separated by semicolons, using the sqlite3 conn.executescript() method. Each call to execute_write_script() will be executed inside a transaction. | 7 | |
75 | 75 | await db.execute_write_many(sql, params_seq, block=True) | Like execute_write() but uses the sqlite3 conn.executemany() method. This will efficiently execute the same SQL statement against each of the parameters in the params_seq iterator, for example: await db.execute_write_many( "insert into characters (id, name) values (?, ?)", [(1, "Melanie"), (2, "Selma"), (3, "Viktor")], ) Each call to execute_write_many() will be executed inside a transaction. | 7 | |
76 | 76 | await db.execute_write_fn(fn, block=True, transaction=True) | This method works like .execute_write() , but instead of a SQL statement you give it a callable Python function. Your function will be queued up and then called when the write connection is available, passing that connection as the argument to the function. The function can then perform multiple actions, safe in the knowledge that it has exclusive access to the single writable connection for as long as it is executing. fn needs to be a regular function, not an async def function. For example: def delete_and_return_count(conn): conn.execute("delete from some_table where id > 5") return conn.execute( "select count(*) from some_table" ).fetchone()[0] try: num_rows_left = await database.execute_write_fn( delete_and_return_count ) except Exception as e: print("An error occurred:", e) The value returned from await database.execute_write_fn(...) will be the return value from your function. If your function raises an exception that exception will be propagated up to the await line. By default your function will be executed inside a transaction. You can pass transaction=False to disable this behavior, though if you do that you should be careful to manually apply transactions - ideally using the with conn: pattern, or you may see OperationalError: database table is locked errors. If you specify block=False the method becomes fire-and-forget, queueing your function to be executed and then allowing your code after the call to .execute_write_fn() to continue running while the underlying thread waits for an opportunity to run your function. A UUID representing the queued task will be returned. Any exceptions in your code will be silently swallowed. | 7 | |
77 | 77 | await db.execute_isolated_fn(fn) | This method is similar to execute_write_fn() but executes the provided function in an entirely isolated SQLite connection, which is opened, used and then closed again in a single call to this method. The prepare_connection() plugin hook is not executed against this connection. This allows plugins to execute database operations that might conflict with how database connections are usually configured. For example, running a VACUUM operation while bypassing any restrictions placed by the datasette-sqlite-authorizer plugin. Plugins can also use this method to load potentially dangerous SQLite extensions, use them to perform an operation and then have them safely unloaded at the end of the call, without risk of exposing them to other connections. Functions run using execute_isolated_fn() share the same queue as execute_write_fn() , which guarantees that no writes can be executed at the same time as the isolated function is executing. The return value of the function will be returned by this method. Any exceptions raised by the function will be raised out of the await line as well. | 7 | |
78 | 78 | db.close() | Closes all of the open connections to file-backed databases. This is mainly intended to be used by large test suites, to avoid hitting limits on the number of open files. | 7 | |
79 | 79 | Database introspection | The Database class also provides properties and methods for introspecting the database. db.name - string The name of the database - usually the filename without the .db extension. db.size - integer The size of the database file in bytes. 0 for :memory: databases. db.mtime_ns - integer or None The last modification time of the database file in nanoseconds since the epoch. None for :memory: databases. db.is_mutable - boolean Is this database mutable, and allowed to accept writes? db.is_memory - boolean Is this database an in-memory database? await db.attached_databases() - list of named tuples Returns a list of additional databases that have been connected to this database using the SQLite ATTACH command. Each named tuple has fields seq , name and file . await db.table_exists(table) - boolean Check if a table called table exists. await db.view_exists(view) - boolean … | 7 | |
80 | 80 | CSRF protection | Datasette uses asgi-csrf to guard against CSRF attacks on form POST submissions. Users receive a ds_csrftoken cookie which is compared against the csrftoken form field (or x-csrftoken HTTP header) for every incoming request. If your plugin implements a <form method="POST"> anywhere you will need to include that token. You can do so with the following template snippet: <input type="hidden" name="csrftoken" value="{{ csrftoken() }}"> If you are rendering templates using the await .render_template(template, context=None, request=None) method the csrftoken() helper will only work if you provide the request= argument to that method. If you forget to do this you will see the following error: form-urlencoded POST field did not match cookie You can selectively disable CSRF protection using the skip_csrf(datasette, scope) hook. | 7 | |
81 | 81 | Datasette's internal database | Datasette maintains an "internal" SQLite database used for configuration, caching, and storage. Plugins can store configuration, settings, and other data inside this database. By default, Datasette will use a temporary in-memory SQLite database as the internal database, which is created at startup and destroyed at shutdown. Users of Datasette can optionally pass in a --internal flag to specify the path to a SQLite database to use as the internal database, which will persist internal data across Datasette instances. Datasette maintains tables called catalog_databases , catalog_tables , catalog_columns , catalog_indexes , catalog_foreign_keys with details of the attached databases and their schemas. These tables should not be considered a stable API - they may change between Datasette releases. Metadata is stored in tables metadata_instance , metadata_databases , metadata_resources and metadata_columns . Plugins can interact with these tables via the get_*_metadata() and set_*_metadata() methods . The internal database is not exposed in the Datasette application by default, which means private data can safely be stored without worry of accidentally leaking information through the default Datasette interface and API. However, other plugins do have full read and write access to the internal database. Plugins can access this database by calling internal_db = datasette.get_internal_database() and then executing queries using the Database API . Plugin authors are asked to practice good etiquette when using the internal database, as all plugins use the same database to store data. For example: Use a unique prefix when creating tables, indices, and triggers in the internal database. If your plugin is called datasette-xyz , then prefix names with datasette_xyz_* . Avoid long-running write statements that may stall or block ot… | 7 | |
82 | 82 | Internal database schema | The internal database schema is as follows: [[[cog from metadata_doc import internal_schema internal_schema(cog) ]]] CREATE TABLE catalog_databases ( database_name TEXT PRIMARY KEY, path TEXT, is_memory INTEGER, schema_version INTEGER ); CREATE TABLE catalog_tables ( database_name TEXT, table_name TEXT, rootpage INTEGER, sql TEXT, PRIMARY KEY (database_name, table_name), FOREIGN KEY (database_name) REFERENCES databases(database_name) ); CREATE TABLE catalog_columns ( database_name TEXT, table_name TEXT, cid INTEGER, name TEXT, type TEXT, "notnull" INTEGER, default_value TEXT, -- renamed from dflt_value is_pk INTEGER, -- renamed from pk hidden INTEGER, PRIMARY KEY (database_name, table_name, name), FOREIGN KEY (database_name) REFERENCES databases(database_name), FOREIGN KEY (database_name, table_name) REFERENCES tables(database_name, table_name) ); CREATE TABLE catalog_indexes ( database_name TEXT, table_name TEXT, seq INTEGER, name TEXT, "unique" INTEGER, origin TEXT, partial INTEGER, PRIMARY KEY (database_name, table_name, name), FOREIGN KEY (database_name) REFERENCES databases(database_name), FOREIGN KEY (database_name, table_name) REFERENCES tables(database_name, table_name) ); CREATE TABLE catalog_foreign_keys ( database_name TEXT, table_name TEXT, id INTEGER, seq INTEGER, "table" TEXT, "from" TEXT, "to" TEXT, on_update TEXT, on_delete TEXT, match TEXT, PRIMARY KEY (database_name, table_name, id, seq), FOREIGN KEY (database_name) REFERENCES databases(database_name), FOREIGN KEY (database_name, table_name) REFERENCES tables(database_name, table_name) ); CREATE TABLE metadata_instance ( key text, value text, unique(key) ); CREATE TABLE metadata_databases ( database_name text, key text, value text, unique(database_name, key) ); CREATE TABLE metadata_resou… | 7 | |
83 | 83 | The datasette.utils module | The datasette.utils module contains various utility functions used by Datasette. As a general rule you should consider anything in this module to be unstable - functions and classes here could change without warning or be removed entirely between Datasette releases, without being mentioned in the release notes. The exception to this rule is anything that is documented here. If you find a need for an undocumented utility function in your own work, consider opening an issue requesting that the function you are using be upgraded to documented and supported status. | 7 | |
84 | 84 | parse_metadata(content) | This function accepts a string containing either JSON or YAML, expected to be of the format described in Metadata . It returns a nested Python dictionary representing the parsed data from that string. If the metadata cannot be parsed as either JSON or YAML the function will raise a utils.BadMetadataError exception. datasette.utils. parse_metadata content : str dict Detects if content is JSON or YAML and parses it appropriately. | 7 | |
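A short usage sketch, assuming a small YAML document (the keys shown are just example metadata):

```python
from datasette.utils import parse_metadata

metadata = parse_metadata(
    """
title: My Datasette instance
databases:
  fixtures:
    source: Example data
"""
)
print(metadata["title"])  # My Datasette instance
```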
85 | 85 | await_me_maybe(value) | Utility function for calling await on a return value if it is awaitable, otherwise returning the value. This is used by Datasette to support plugin hooks that can optionally return awaitable functions. Read more about this function in The “await me maybe” pattern for Python asyncio . async datasette.utils. await_me_maybe value : Any Any If value is callable, call it. If awaitable, await it. Otherwise return it. | 7 | |
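A sketch showing the three cases the helper handles (plain value, callable, awaitable):

```python
import asyncio

from datasette.utils import await_me_maybe


async def demo():
    print(await await_me_maybe(42))            # plain value -> 42
    print(await await_me_maybe(lambda: "hi"))  # callable -> "hi"

    async def later():
        return "done"

    print(await await_me_maybe(later()))       # awaitable -> "done"


asyncio.run(demo())
```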
86 | 86 | named_parameters(sql) | Derive the list of :named parameters referenced in a SQL query. datasette.utils. named_parameters sql : str List [ str ] Given a SQL statement, return a list of named parameters that are used in the statement e.g. for select * from foo where id=:id this would return ["id"] | 7 | |
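A usage sketch with two named parameters (the query itself is invented for illustration):

```python
from datasette.utils import named_parameters

params = named_parameters(
    "select * from events where country = :country and year = :year"
)
print(params)  # ['country', 'year']
```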
87 | 87 | Tilde encoding | Datasette uses a custom encoding scheme in some places, called tilde encoding . This is primarily used for table names and row primary keys, to avoid any confusion between / characters in those values and the Datasette URLs that reference them. Tilde encoding uses the same algorithm as URL percent-encoding , but with the ~ tilde character used in place of % . Any character other than ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz0123456789_- will be replaced by the numeric equivalent preceded by a tilde. For example: / becomes ~2F . becomes ~2E % becomes ~25 ~ becomes ~7E Space becomes + polls/2022.primary becomes polls~2F2022~2Eprimary Note that the space character is a special case: it will be replaced with a + symbol. datasette.utils. tilde_encode s : str str Returns tilde-encoded string - for example /foo/bar -> ~2Ffoo~2Fbar datasette.utils. tilde_decode s : str str Decodes a tilde-encoded string, so ~2Ffoo~2Fbar -> /foo/bar | 7 | |
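A round-trip sketch using the polls/2022.primary example from above:

```python
from datasette.utils import tilde_encode, tilde_decode

encoded = tilde_encode("polls/2022.primary")
print(encoded)                # polls~2F2022~2Eprimary
print(tilde_decode(encoded))  # polls/2022.primary
```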
88 | 88 | datasette.tracer | Running Datasette with --setting trace_debug 1 enables trace debug output, which can then be viewed by adding ?_trace=1 to the query string for any page. You can see an example of this at the bottom of latest.datasette.io/fixtures/facetable?_trace=1 . The JSON output shows full details of every SQL query that was executed to generate the page. The datasette-pretty-traces plugin can be installed to provide a more readable display of this information. You can see a demo of that here . You can add your own custom traces to the JSON output using the trace() context manager. This takes a string that identifies the type of trace being recorded, and records any keyword arguments as additional JSON keys on the resulting trace object. The start and end time, duration and a traceback of where the trace was executed will be automatically attached to the JSON object. This example uses trace to record the start, end and duration of any HTTP GET requests made using the function: from datasette.tracer import trace import httpx async def fetch_url(url): with trace("fetch-url", url=url): async with httpx.AsyncClient() as client: return await client.get(url) | 7 | |
89 | 89 | Tracing child tasks | If your code uses a mechanism such as asyncio.gather() to execute code in additional tasks you may find that some of the traces are missing from the display. You can use the trace_child_tasks() context manager to ensure these child tasks are correctly handled. from datasette import tracer with tracer.trace_child_tasks(): results = await asyncio.gather( # ... async tasks here ) This example uses the register_routes() plugin hook to add a page at /parallel-queries which executes two SQL queries in parallel using asyncio.gather() and returns their results. from datasette import hookimpl from datasette import tracer @hookimpl def register_routes(): async def parallel_queries(datasette): db = datasette.get_database() with tracer.trace_child_tasks(): one, two = await asyncio.gather( db.execute("select 1"), db.execute("select 2"), ) return Response.json( { "one": one.single_value(), "two": two.single_value(), } ) return [ (r"/parallel-queries$", parallel_queries), ] Note that running parallel SQL queries in this way has been known to cause problems in the past , so treat this example with caution. Adding ?_trace=1 will show that the trace covers both of those child tasks. | 7 | |
90 | 90 | Import shortcuts | The following commonly used symbols can be imported directly from the datasette module: from datasette import Response from datasette import Forbidden from datasette import NotFound from datasette import hookimpl from datasette import actor_matches_allow | 7 | |
91 | 91 | Deploying Datasette | The quickest way to deploy a Datasette instance on the internet is to use the datasette publish command, described in Publishing data . This can be used to quickly deploy Datasette to a number of hosting providers including Heroku, Google Cloud Run and Vercel. You can deploy Datasette to other hosting providers using the instructions on this page. | 7 | |
92 | 92 | Deployment fundamentals | Datasette can be deployed as a single datasette process that listens on a port. Datasette is not designed to be run as root, so that process should listen on a higher port such as port 8000. If you want to serve Datasette on port 80 (the HTTP default port) or port 443 (for HTTPS) you should run it behind a proxy server, such as nginx, Apache or HAProxy. The proxy server can listen on port 80/443 and forward traffic on to Datasette. | 7 | |
93 | 93 | Running Datasette using systemd | You can run Datasette on Ubuntu or Debian systems using systemd . First, ensure you have Python 3 and pip installed. On Ubuntu you can use sudo apt-get install python3 python3-pip . You can install Datasette into a virtual environment, or you can install it system-wide. To install system-wide, use sudo pip3 install datasette . Now create a folder for your Datasette databases, for example using mkdir /home/ubuntu/datasette-root . You can copy a test database into that folder like so: cd /home/ubuntu/datasette-root curl -O https://latest.datasette.io/fixtures.db Create a file at /etc/systemd/system/datasette.service with the following contents: [Unit] Description=Datasette After=network.target [Service] Type=simple User=ubuntu Environment=DATASETTE_SECRET= WorkingDirectory=/home/ubuntu/datasette-root ExecStart=datasette serve . -h 127.0.0.1 -p 8000 Restart=on-failure [Install] WantedBy=multi-user.target Add a random value for the DATASETTE_SECRET - this will be used to sign Datasette cookies such as the CSRF token cookie. You can generate a suitable value like so: python3 -c 'import secrets; print(secrets.token_hex(32))' This configuration will run Datasette against all database files contained in the /home/ubuntu/datasette-root directory. If that directory contains a metadata.yml (or .json ) file or a templates/ or plugins/ sub-directory those will automatically be loaded by Datasette - see Configuration directory mode for details. You can start the Datasette process running using the following: sudo systemctl daemon-reload sudo systemctl start datasette.service You will need to restart the Datasette service after making changes to its metadata.json configuration or adding a new database file to that directory. You can do that using: sudo systemctl restart datasette.service Once the … | 7 | |
94 | 94 | Running Datasette using OpenRC | OpenRC is the service manager on non-systemd Linux distributions like Alpine Linux and Gentoo . Create an init script at /etc/init.d/datasette with the following contents: #!/sbin/openrc-run name="datasette" command="datasette" command_args="serve -h 0.0.0.0 /path/to/db.db" command_background=true pidfile="/run/${RC_SVCNAME}.pid" You then need to configure the service to run at boot and start it: rc-update add datasette rc-service datasette start | 7 | |
95 | 95 | Deploying using buildpacks | Some hosting providers such as Heroku , DigitalOcean App Platform and Scalingo support the Buildpacks standard for deploying Python web applications. Deploying Datasette on these platforms requires two files: requirements.txt and Procfile . The requirements.txt file lets the platform know which Python packages should be installed. It should contain datasette at a minimum, but can also list any Datasette plugins you wish to install - for example: datasette datasette-vega The Procfile lets the hosting platform know how to run the command that serves web traffic. It should look like this: web: datasette . -h 0.0.0.0 -p $PORT --cors The $PORT environment variable is provided by the hosting platform. --cors enables CORS requests from JavaScript running on other websites to your domain - omit this if you don't want to allow CORS. You can add additional Datasette Settings options here too. These two files should be enough to deploy Datasette on any host that supports buildpacks. Datasette will serve any SQLite files that are included in the root directory of the application. If you want to build SQLite files or download them as part of the deployment process you can do so using a bin/post_compile file. For example, the following bin/post_compile will download an example database that will then be served by Datasette: wget https://fivethirtyeight.datasettes.com/fivethirtyeight.db simonw/buildpack-datasette-demo is an example GitHub repository showing a Datasette configuration that can be deployed to a buildpack-supporting host. | 7 | |
96 | 96 | Running Datasette behind a proxy | You may wish to run Datasette behind an Apache or nginx proxy, using a path within your existing site. You can use the base_url configuration setting to tell Datasette to serve traffic with a specific URL prefix. For example, you could run Datasette like this: datasette my-database.db --setting base_url /my-datasette/ -p 8009 This will run Datasette with the following URLs: http://127.0.0.1:8009/my-datasette/ - the Datasette homepage http://127.0.0.1:8009/my-datasette/my-database - the page for the my-database.db database http://127.0.0.1:8009/my-datasette/my-database/some_table - the page for the some_table table You can now set your nginx or Apache server to proxy the /my-datasette/ path to this Datasette instance. | 7 | |
97 | 97 | Nginx proxy configuration | Here is an example of an nginx configuration file that will proxy traffic to Datasette: daemon off; events { worker_connections 1024; } http { server { listen 80; location /my-datasette { proxy_pass http://127.0.0.1:8009/my-datasette; proxy_set_header Host $host; } } } You can also use the --uds option to Datasette to listen on a Unix domain socket instead of a port, configuring the nginx upstream proxy like this: daemon off; events { worker_connections 1024; } http { server { listen 80; location /my-datasette { proxy_pass http://datasette/my-datasette; proxy_set_header Host $host; } } upstream datasette { server unix:/tmp/datasette.sock; } } Then run Datasette with datasette --uds /tmp/datasette.sock path/to/database.db --setting base_url /my-datasette/ . | 7 | |
98 | 98 | Apache proxy configuration | For Apache , you can use the ProxyPass directive. First make sure the following lines are uncommented: LoadModule proxy_module lib/httpd/modules/mod_proxy.so LoadModule proxy_http_module lib/httpd/modules/mod_proxy_http.so Then add these directives to proxy traffic: ProxyPass /my-datasette/ http://127.0.0.1:8009/my-datasette/ ProxyPreserveHost On A live demo of Datasette running behind Apache using this proxy setup can be seen at datasette-apache-proxy-demo.datasette.io/prefix/ . The code for that demo can be found in the demos/apache-proxy directory. Using --uds you can use Unix domain sockets similar to the nginx example: ProxyPass /my-datasette/ unix:/tmp/datasette.sock|http://localhost/my-datasette/ The ProxyPreserveHost On directive ensures that the original Host: header from the incoming request is passed through to Datasette. Datasette needs this to correctly assemble links to other pages using the .absolute_url(request, path) method. | 7 | |
99 | 99 | Settings | | 7 | |
100 | 100 | Using --setting | Datasette supports a number of settings. These can be set using the --setting name value option to datasette serve . You can set multiple settings at once like this: datasette mydatabase.db \ --setting default_page_size 50 \ --setting sql_time_limit_ms 3500 \ --setting max_returned_rows 2000 Settings can also be specified in the database.yaml configuration file . | 7 |