rowid,title,content,sections_fts,rank 667,Registering a plugin for the duration of a test,"When writing tests for plugins you may find it useful to register a test plugin just for the duration of a single test. You can do this using datasette.pm.register() and datasette.pm.unregister() like this: from datasette import hookimpl from datasette.app import Datasette import pytest @pytest.mark.asyncio async def test_using_test_plugin(): class TestPlugin: __name__ = ""TestPlugin"" # Use hookimpl and method names to register hooks @hookimpl def register_routes(self): return [ (r""^/error$"", lambda: 1 / 0), ] datasette = Datasette() try: # The test implementation goes here datasette.pm.register(TestPlugin(), name=""undo"") response = await datasette.client.get(""/error"") assert response.status_code == 500 finally: datasette.pm.unregister(name=""undo"") To reuse the same temporary plugin in multiple tests, you can register it inside a fixture in your conftest.py file like this: from datasette import hookimpl from datasette.app import Datasette import pytest import pytest_asyncio @pytest_asyncio.fixture async def datasette_with_plugin(): class TestPlugin: __name__ = ""TestPlugin"" @hookimpl def register_routes(self): return [ (r""^/error$"", lambda: 1 / 0), ] datasette = Datasette() datasette.pm.register(TestPlugin(), name=""undo"") try: yield datasette finally: datasette.pm.unregister(name=""undo"") Note the yield statement here - this ensures that the finally: block that unregisters the plugin is executed only after the test function itself has completed. Then in a test: @pytest.mark.asyncio async def test_error(datasette_with_plugin): response = await datasette_with_plugin.client.get(""/error"") assert response.status_code == 500",441, 666,Testing outbound HTTP calls with pytest-httpx,"If your plugin makes outbound HTTP calls - for example datasette-auth-github or datasette-import-table - you may need to mock those HTTP requests in your tests. The pytest-httpx package is a useful library for mocking calls. It can be tricky to use with Datasette though since it mocks all HTTPX requests, and Datasette's own testing mechanism uses HTTPX internally. To avoid breaking your tests, you can return [""localhost""] from the non_mocked_hosts() fixture. As an example, here's a very simple plugin which fetches a provided URL and returns the resulting content: from datasette import hookimpl from datasette.utils.asgi import Response import httpx @hookimpl def register_routes(): return [ (r""^/-/fetch-url$"", fetch_url), ] async def fetch_url(datasette, request): if request.method == ""GET"": return Response.html( """"""
"""""".format( request.scope[""csrftoken""]() ) ) vars = await request.post_vars() url = vars[""url""] return Response.text(httpx.get(url).text) Here's a test for that plugin that mocks the HTTPX outbound request: from datasette.app import Datasette import pytest @pytest.fixture def non_mocked_hosts(): # This ensures httpx-mock will not affect Datasette's own # httpx calls made in the tests by datasette.client: return [""localhost""] async def test_outbound_http_call(httpx_mock): httpx_mock.add_response( url=""https://www.example.com/"", text=""Hello world"", ) datasette = Datasette([], memory=True) response = await datasette.client.post( ""/-/fetch-url"", data={""url"": ""https://www.example.com/""}, ) assert response.text == ""Hello world"" outbound_request = httpx_mock.get_request() assert ( outbound_request.url == ""https://www.example.com/"" )",441, 665,Using pytest fixtures,"Pytest fixtures can be used to create initial testable objects which can then be used by multiple tests. A common pattern for Datasette plugins is to create a fixture which sets up a temporary test database and wraps it in a Datasette instance. Here's an example that uses the sqlite-utils library to populate a temporary test database. It also sets the title of that table using a simulated metadata.json configuration: from datasette.app import Datasette import pytest import sqlite_utils @pytest.fixture(scope=""session"") def datasette(tmp_path_factory): db_directory = tmp_path_factory.mktemp(""dbs"") db_path = db_directory / ""test.db"" db = sqlite_utils.Database(db_path) db[""dogs""].insert_all( [ {""id"": 1, ""name"": ""Cleo"", ""age"": 5}, {""id"": 2, ""name"": ""Pancakes"", ""age"": 4}, ], pk=""id"", ) datasette = Datasette( [db_path], metadata={ ""databases"": { ""test"": { ""tables"": { ""dogs"": {""title"": ""Some dogs""} } } } }, ) return datasette @pytest.mark.asyncio async def test_example_table_json(datasette): response = await datasette.client.get( ""/test/dogs.json?_shape=array"" ) assert response.status_code == 200 assert response.json() == [ {""id"": 1, ""name"": ""Cleo"", ""age"": 5}, {""id"": 2, ""name"": ""Pancakes"", ""age"": 4}, ] @pytest.mark.asyncio async def test_example_table_html(datasette): response = await datasette.client.get(""/test/dogs"") assert "">Some dogs"" in response.text Here the datasette() function defines the fixture, which is than automatically passed to the two test functions based on pytest automatically matching their datasette function parameters. The @pytest.fixture(scope=""session"") line here ensures the fixture is reused for the full pytest execution session. This means that the temporary database file will be created once and reused for each test. If you want to create that test database repeatedly for every individual test function, write the fixture function like this instead. You may want to do this if your plugin modifies the database contents in some way: @pytest.fixture def datasette(tmp_path_factory): # This fixture will be executed repeatedly for every test ...",441, 664,Using pdb for errors thrown inside Datasette,"If an exception occurs within Datasette itself during a test, the response returned to your plugin will have a response.status_code value of 500. You can add pdb=True to the Datasette constructor to drop into a Python debugger session inside your test run instead of getting back a 500 response code. This is equivalent to running the datasette command-line tool with the --pdb option. 
Here's what that looks like in a test function: @pytest.mark.asyncio async def test_that_opens_the_debugger_or_errors(): ds = Datasette([db_path], pdb=True) response = await ds.client.get(""/"") If you use this pattern you will need to run pytest with the -s option to avoid capturing stdin/stdout in order to interact with the debugger prompt.",441, 663,Using datasette.client in tests,"The datasette.client mechanism is designed for use in tests. It provides access to a pre-configured HTTPX async client instance that can make GET, POST and other HTTP requests against a Datasette instance from inside a test. A simple test looks like this: @pytest.mark.asyncio async def test_homepage(): ds = Datasette(memory=True) response = await ds.client.get(""/"") html = response.text assert ""
<h1>
"" in html Or for a JSON API: @pytest.mark.asyncio async def test_actor_is_null(): ds = Datasette(memory=True) response = await ds.client.get(""/-/actor.json"") assert response.json() == {""actor"": None} To make requests as an authenticated actor, create a signed ds_cookie using the datasette.client.actor_cookie() helper function and pass it in cookies= like this: @pytest.mark.asyncio async def test_signed_cookie_actor(): ds = Datasette(memory=True) cookies = {""ds_actor"": ds.client.actor_cookie({""id"": ""root""})} response = await ds.client.get(""/-/actor.json"", cookies=cookies) assert response.json() == {""actor"": {""id"": ""root""}}",441, 662,Setting up a Datasette test instance,"The above example shows the easiest way to start writing tests against a Datasette instance: from datasette.app import Datasette import pytest @pytest.mark.asyncio async def test_plugin_is_installed(): datasette = Datasette(memory=True) response = await datasette.client.get(""/-/plugins.json"") assert response.status_code == 200 Creating a Datasette() instance like this as useful shortcut in tests, but there is one detail you need to be aware of. It's important to ensure that the async method .invoke_startup() is called on that instance. You can do that like this: datasette = Datasette(memory=True) await datasette.invoke_startup() This method registers any startup(datasette) or prepare_jinja2_environment(env, datasette) plugins that might themselves need to make async calls. If you are using await datasette.client.get() and similar methods then you don't need to worry about this - Datasette automatically calls invoke_startup() the first time it handles a request.",441, 661,Testing plugins,"We recommend using pytest to write automated tests for your plugins. If you use the template described in Starting an installable plugin using cookiecutter your plugin will start with a single test in your tests/ directory that looks like this: from datasette.app import Datasette import pytest @pytest.mark.asyncio async def test_plugin_is_installed(): datasette = Datasette(memory=True) response = await datasette.client.get(""/-/plugins.json"") assert response.status_code == 200 installed_plugins = {p[""name""] for p in response.json()} assert ( ""datasette-plugin-template-demo"" in installed_plugins ) This test uses the datasette.client object to exercise a test instance of Datasette. datasette.client is a wrapper around the HTTPX Python library which can imitate HTTP requests using ASGI. This is the recommended way to write tests against a Datasette instance. This test also uses the pytest-asyncio package to add support for async def test functions running under pytest. You can install these packages like so: pip install pytest pytest-asyncio If you are building an installable package you can add them as test dependencies to your pyproject.toml file like this: [project] name = ""datasette-my-plugin"" # ... [project.optional-dependencies] test = [""pytest"", ""pytest-asyncio""] You can then install the test dependencies like so: pip install -e '.[test]' Then run the tests using pytest like so: pytest",441, 660,Binary plugins,"Several Datasette plugins are available that change the way Datasette treats binary data. datasette-render-binary modifies Datasette's default interface to show an automatic guess at what type of binary data is being stored, along with a visual representation of the binary value that displays ASCII strings directly in the interface. 
datasette-render-images detects common image formats and renders them as images directly in the Datasette interface. datasette-media allows Datasette interfaces to be configured to serve binary files from configured SQL queries, and includes the ability to resize images directly before serving them.",441, 659,Linking to binary downloads,"The .blob output format is used to return binary data. It requires a _blob_column= query string argument specifying which BLOB column should be downloaded, for example: https://latest.datasette.io/fixtures/binary_data/1.blob?_blob_column=data This output format can also be used to return binary data from an arbitrary SQL query. Since such queries do not specify an exact row, an additional ?_blob_hash= parameter can be used to specify the SHA-256 hash of the value that is being linked to. Consider the query select data from binary_data - demonstrated here . That page links to the binary value downloads. Those links look like this: https://latest.datasette.io/fixtures.blob?sql=select+data+from+binary_data&_blob_column=data&_blob_hash=f3088978da8f9aea479ffc7f631370b968d2e855eeb172bea7f6c7a04262bb6d These .blob links are also returned in the .csv exports Datasette provides for binary tables and queries, since the CSV format does not have a mechanism for representing binary data.",441, 658,Binary data,"SQLite tables can contain binary data in BLOB columns. Datasette includes special handling for these binary values. The Datasette interface detects binary values and provides a link to download their content, for example on https://latest.datasette.io/fixtures/binary_data Binary data is represented in .json exports using Base64 encoding. https://latest.datasette.io/fixtures/binary_data.json?_shape=array [ { ""rowid"": 1, ""data"": { ""$base64"": true, ""encoded"": ""FRwCx60F/g=="" } }, { ""rowid"": 2, ""data"": { ""$base64"": true, ""encoded"": ""FRwDx60F/g=="" } }, { ""rowid"": 3, ""data"": null } ]",441, 657,register_events(datasette),"datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) . This hook should return a list of Event subclasses that represent custom events that the plugin might send to the datasette.track_event() method. This example registers event subclasses for ban-user and unban-user events: from dataclasses import dataclass from datasette import hookimpl, Event @dataclass class BanUserEvent(Event): name = ""ban-user"" user: dict @dataclass class UnbanUserEvent(Event): name = ""unban-user"" user: dict @hookimpl def register_events(): return [BanUserEvent, UnbanUserEvent] The plugin can then call datasette.track_event(...) to send a ban-user event: await datasette.track_event( BanUserEvent(user={""id"": 1, ""username"": ""cleverbot""}) )",441, 656,"track_event(datasette, event)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) . event - Event Information about the event, represented as an instance of a subclass of the Event base class. This hook will be called any time an event is tracked by code that calls the datasette.track_event(...) internal method. The event object will always have the following properties: name : a string representing the name of the event, for example logout or create-table . actor : a dictionary representing the actor that triggered the event, or None if the event was not triggered by an actor. 
created : a datetime.datetime object in the timezone.utc timezone representing the time the event object was created. Other properties on the event will be available depending on the type of event. You can also access those as a dictionary using event.properties() . The events fired by Datasette core are documented here . This example plugin logs details of all events to standard error: from datasette import hookimpl import json import sys @hookimpl def track_event(event): name = event.name actor = event.actor properties = event.properties() msg = json.dumps( { ""name"": name, ""actor"": actor, ""properties"": properties, } ) print(msg, file=sys.stderr, flush=True) The function can also return an async function which will be awaited. This is useful for writing to a database. This example logs events to a datasette_events table in a database called events . It uses the startup(datasette) hook to create that table if it does not exist. from datasette import hookimpl import json @hookimpl def startup(datasette): async def inner(): db = datasette.get_database(""events"") await db.execute_write( """""" create table if not exists datasette_events ( id integer primary key, event_type text, created text, actor text, properties text ) """""" ) return inner @hookimpl def track_event(datasette, event): async def inner(): db = datasette.get_database(""events"") properties = event.properties() await db.execute_write( """""" insert into datasette_events (event_type, created, actor, properties) values (?, strftime('%Y-%m-%d %H:%M:%S', 'now'), ?, ?) """""", ( event.name, json.dumps(event.actor), json.dumps(properties), ), ) return inner Example: datasette-events-db",441, 655,Event tracking,"Datasette includes an internal mechanism for tracking notable events. This can be used for analytics, but can also be used by plugins that want to listen out for when key events occur (such as a table being created) and take action in response. Plugins can register to receive events using the track_event plugin hook. They can also define their own events for other plugins to receive using the register_events() plugin hook , combined with calls to the datasette.track_event() internal method .",441, 654,"top_canned_query(datasette, request, database, query_name)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) . request - Request object The current HTTP request. database - string The name of the database. query_name - string The name of the canned query. Returns HTML to be displayed at the top of the canned query page.",441, 653,"top_query(datasette, request, database, sql)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) . request - Request object The current HTTP request. database - string The name of the database. sql - string The SQL query. Returns HTML to be displayed at the top of the query results page.",441, 652,"top_row(datasette, request, database, table, row)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) . request - Request object The current HTTP request. database - string The name of the database. table - string The name of the table. row - sqlite.Row The SQLite row object being displayed. 
Returns HTML to be displayed at the top of the row page.",441, 651,"top_table(datasette, request, database, table)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) . request - Request object The current HTTP request. database - string The name of the database. table - string The name of the table. Returns HTML to be displayed at the top of the table page.",441, 650,"top_database(datasette, request, database)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) . request - Request object The current HTTP request. database - string The name of the database. Returns HTML to be displayed at the top of the database page.",441, 649,"top_homepage(datasette, request)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) . request - Request object The current HTTP request. Returns HTML to be displayed at the top of the Datasette homepage.",441, 648,Template slots,"The following set of plugin hooks can be used to return extra HTML content that will be inserted into the corresponding page, directly below the
<h1>
heading. Multiple plugins can contribute content here. The order in which it is displayed can be controlled using Pluggy's call time order options . Each of these plugin hooks can return either a string or an awaitable function that returns a string.",441, 647,"homepage_actions(datasette, actor, request)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. actor - dictionary or None The currently authenticated actor . request - Request object The current HTTP request. Populates an actions menu on the top-level index homepage of the Datasette instance. This example adds a link to an imagined tool for editing the homepage, only for signed in users: from datasette import hookimpl @hookimpl def homepage_actions(datasette, actor): if actor: return [ { ""href"": datasette.urls.path( ""/-/customize-homepage"" ), ""label"": ""Customize homepage"", } ]",441, 646,"database_actions(datasette, actor, database, request)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. actor - dictionary or None The currently authenticated actor . database - string The name of the database. request - Request object The current HTTP request. Populates an actions menu on the database page. This example adds a new database action for creating a table, if the user has the edit-schema permission: from datasette import hookimpl from datasette.resources import DatabaseResource @hookimpl def database_actions(datasette, actor, database): async def inner(): if not await datasette.allowed( actor, ""edit-schema"", resource=DatabaseResource(database), ): return [] return [ { ""href"": datasette.urls.path( ""/-/edit-schema/{}/-/create"".format( database ) ), ""label"": ""Create a table"", } ] return inner Example: datasette-graphql , datasette-edit-schema",441, 645,"row_actions(datasette, actor, request, database, table, row)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. actor - dictionary or None The currently authenticated actor . request - Request object or None The current HTTP request. database - string The name of the database. table - string The name of the table. row - sqlite.Row The SQLite row object being displayed on the page. Return links for the ""Row actions"" menu shown at the top of the row page. This example displays the row in JSON plus some additional debug information if the user is signed in: from datasette import hookimpl import json @hookimpl def row_actions(datasette, database, table, actor, row): if actor: return [ { ""href"": datasette.urls.instance(), ""label"": f""Row details for {actor['id']}"", ""description"": json.dumps( dict(row), default=repr ), }, ] Example: datasette-enrichments",441, 644,"query_actions(datasette, actor, database, query_name, request, sql, params)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. actor - dictionary or None The currently authenticated actor . database - string The name of the database. query_name - string or None The name of the canned query, or None if this is an arbitrary SQL query. request - Request object The current HTTP request. 
sql - string The SQL query being executed. params - dictionary The parameters passed to the SQL query, if any. Populates a ""Query actions"" menu on the canned query and arbitrary SQL query pages. This example adds a new query action linking to a page for explaining a query: from datasette import hookimpl import urllib.parse @hookimpl def query_actions(datasette, database, query_name, sql): # Don't explain an explain if sql.lower().startswith(""explain""): return return [ { ""href"": datasette.urls.database(database) + ""?"" + urllib.parse.urlencode( { ""sql"": ""explain "" + sql, } ), ""label"": ""Explain this query"", ""description"": ""Get a summary of how SQLite executes the query"", }, ] Example: datasette-create-view",441, 643,"view_actions(datasette, actor, database, view, request)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. actor - dictionary or None The currently authenticated actor . database - string The name of the database. view - string The name of the SQL view. request - Request object or None The current HTTP request. This can be None if the request object is not available. Like table_actions(datasette, actor, database, table, request) but for SQL views.",441, 642,"table_actions(datasette, actor, database, table, request)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. actor - dictionary or None The currently authenticated actor . database - string The name of the database. table - string The name of the table. request - Request object or None The current HTTP request. This can be None if the request object is not available. This example adds a new table action if the signed in user is ""root"" : from datasette import hookimpl @hookimpl def table_actions(datasette, actor, database, table): if actor and actor.get(""id"") == ""root"": return [ { ""href"": datasette.urls.path( ""/-/edit-schema/{}/{}"".format( database, table ) ), ""label"": ""Edit schema for this table"", ""description"": ""Add, remove, rename or alter columns for this table."", } ] Example: datasette-graphql",441, 641,Action hooks,"Action hooks can be used to add items to the action menus that appear at the top of different pages within Datasette. Unlike menu_links() , which is displayed on every page, actions should only be relevant to the page the user is currently viewing. Each of these hooks should return a list of {""href"": ""..."", ""label"": ""...""} menu items, with optional ""description"": ""..."" keys describing each action in more detail. They can alternatively return an async def awaitable function which, when called, returns a list of those menu items.",441, 640,"menu_links(datasette, actor, request)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. actor - dictionary or None The currently authenticated actor . request - Request object or None The current HTTP request. This can be None if the request object is not available. This hook allows additional items to be included in the menu displayed by Datasette's top right menu icon. The hook should return a list of {""href"": ""..."", ""label"": ""...""} menu items. These will be added to the menu. It can alternatively return an async def awaitable function which returns a list of menu items. 
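As a minimal sketch of that awaitable form (the /-/example path and the events table check here are invented for illustration, not taken from the documentation), the hook can return an inner async function that consults the database before building the menu: from datasette import hookimpl @hookimpl def menu_links(datasette, actor): async def inner(): # Hypothetical check: only show the link if an ""events"" table exists db = datasette.get_database() if ""events"" in await db.table_names(): return [ { ""href"": datasette.urls.path(""/-/example""), ""label"": ""Example page"", } ] return [] return inner 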
This example adds a new menu item but only if the signed in user is ""root"" : from datasette import hookimpl @hookimpl def menu_links(datasette, actor): if actor and actor.get(""id"") == ""root"": return [ { ""href"": datasette.urls.path( ""/-/edit-schema"" ), ""label"": ""Edit schema"", }, ] Using datasette.urls here ensures that links in the menu will take the base_url setting into account. Examples: datasette-search-all , datasette-graphql",441, 639,"skip_csrf(datasette, scope)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. scope - dictionary The ASGI scope for the incoming HTTP request. This hook can be used to skip CSRF protection for a specific incoming request. For example, you might have a custom path at /submit-comment which is designed to accept comments from anywhere, whether or not the incoming request originated on the site and has an accompanying CSRF token. This example will disable CSRF protection for that specific URL path: from datasette import hookimpl @hookimpl def skip_csrf(scope): return scope[""path""] == ""/submit-comment"" If any of the currently active skip_csrf() plugin hooks return True , CSRF protection will be skipped for the request.",441, 638,"handle_exception(datasette, request, exception)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to render templates or execute SQL queries. request - Request object The current HTTP request. exception - Exception The exception that was raised. This hook is called any time an unexpected exception is raised. You can use it to record the exception. If your handler returns a Response object it will be returned to the client in place of the default Datasette error page. The handler can return a response directly, or it can return an awaitable function that returns a response. This example logs an error to Sentry and then renders a custom error page: from datasette import hookimpl, Response import sentry_sdk @hookimpl def handle_exception(datasette, request, exception): sentry_sdk.capture_exception(exception) async def inner(): return Response.html( await datasette.render_template( ""custom_error.html"", request=request ) ) return inner Example: datasette-sentry",441, 637,"forbidden(datasette, request, message)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to render templates or execute SQL queries. request - Request object The current HTTP request. message - string A message hinting at why the request was forbidden. Plugins can use this to customize how Datasette responds when a 403 Forbidden error occurs - usually because a page failed a permission check, see Permissions . If a plugin hook wishes to react to the error, it should return a Response object . This example returns a redirect to a /-/login page: from datasette import hookimpl, Response from urllib.parse import urlencode @hookimpl def forbidden(request, message): return Response.redirect( ""/-/login?"" + urlencode({""message"": message}) ) The function can alternatively return an awaitable function if it needs to make any asynchronous method calls. 
This example renders a template: from datasette import hookimpl, Response @hookimpl def forbidden(datasette, request): async def inner(): return Response.html( await datasette.render_template( ""render_message.html"", request=request ) ) return inner",441, 636,register_magic_parameters(datasette),"datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) . Magic parameters can be used to add automatic parameters to canned queries . This plugin hook allows additional magic parameters to be defined by plugins. Magic parameters all take this format: _prefix_rest_of_parameter . The prefix indicates which magic parameter function should be called - the rest of the parameter is passed as an argument to that function. To register a new function, return it as a tuple of (string prefix, function) from this hook. The function you register should take two arguments: key and request , where key is the rest_of_parameter portion of the parameter and request is the current Request object . This example registers two new magic parameters: :_request_http_version returning the HTTP version of the current request, and :_uuid_new which returns a new UUID. It also registers an :_asynclookup_key parameter, demonstrating that these functions can be asynchronous: from datasette import hookimpl from uuid import uuid4 def uuid(key, request): if key == ""new"": return str(uuid4()) else: raise KeyError def request(key, request): if key == ""http_version"": return request.scope[""http_version""] else: raise KeyError async def asynclookup(key, request): return await do_something_async(key) @hookimpl def register_magic_parameters(datasette): return [ (""request"", request), (""uuid"", uuid), (""asynclookup"", asynclookup), ]",441, 635,Default deny with an exception,"Combine a root-level deny with a specific table allow for trusted users. The resolver will automatically apply the most specific rule. from datasette import hookimpl from datasette.permissions import PermissionSQL TRUSTED = {""alice"", ""bob""} @hookimpl def permission_resources_sql(datasette, actor, action): if action != ""view-table"": return None actor_id = (actor or {}).get(""id"") if actor_id not in TRUSTED: return PermissionSQL( sql="""""" SELECT NULL AS parent, NULL AS child, 0 AS allow, 'default deny view-table' AS reason """""", ) return PermissionSQL( sql="""""" SELECT NULL AS parent, NULL AS child, 0 AS allow, 'default deny view-table' AS reason UNION ALL SELECT 'reports' AS parent, 'daily_metrics' AS child, 1 AS allow, 'trusted user access' AS reason """""", ) The UNION ALL ensures the deny rule is always present, while the second row adds the exception for trusted users.",441, 634,Read permissions from a custom table,"This example stores grants in an internal table called permission_grants with columns (actor_id, action, parent, child, allow, reason) . from datasette import hookimpl from datasette.permissions import PermissionSQL @hookimpl def permission_resources_sql(datasette, actor, action): if not actor: return None return PermissionSQL( sql="""""" SELECT parent, child, allow, COALESCE(reason, 'permission_grants table') AS reason FROM permission_grants WHERE actor_id = :grants_actor_id AND action = :grants_action """""", params={ ""grants_actor_id"": actor.get(""id""), ""grants_action"": action, }, )",441, 633,Restrict execute-sql to a database prefix,"Only allow execute-sql against databases whose name begins with analytics_ . 
This shows how to use parameters that the permission resolver will pass through to the SQL snippet. from datasette import hookimpl from datasette.permissions import PermissionSQL @hookimpl def permission_resources_sql(datasette, actor, action): if action != ""execute-sql"": return None return PermissionSQL( sql="""""" SELECT database_name AS parent, NULL AS child, 1 AS allow, 'execute-sql allowed for analytics_*' AS reason FROM catalog_databases WHERE database_name LIKE :analytics_prefix """""", params={ ""analytics_prefix"": ""analytics_%"", }, )",441, 632,Allow Alice to view a specific table,"This plugin grants the actor with id == ""alice"" permission to perform the view-table action against the sales table inside the accounting database. from datasette import hookimpl from datasette.permissions import PermissionSQL @hookimpl def permission_resources_sql(datasette, actor, action): if action != ""view-table"": return None if not actor or actor.get(""id"") != ""alice"": return None return PermissionSQL( sql="""""" SELECT 'accounting' AS parent, 'sales' AS child, 1 AS allow, 'alice can view accounting/sales' AS reason """""", )",441, 631,Permission plugin examples,"These snippets show how to use the new permission_resources_sql hook to contribute rows to the action-based permission resolver. Each hook receives the current actor dictionary (or None ) and must return None or an instance or list of datasette.permissions.PermissionSQL (or a coroutine that resolves to that).",441, 630,"permission_resources_sql(datasette, actor, action)","datasette - Datasette class Access to the Datasette instance. actor - dictionary or None The current actor dictionary. None for anonymous requests. action - string The permission action being evaluated. Examples include ""view-table"" or ""insert-row"" . Return value A datasette.permissions.PermissionSQL object, None or an iterable of PermissionSQL objects. Datasette's action-based permission resolver calls this hook to gather SQL rows describing which resources an actor may access ( allow = 1 ) or should be denied ( allow = 0 ) for a specific action. Each SQL snippet should return parent , child , allow and reason columns. Parameter naming convention: Plugin parameters in PermissionSQL.params should use unique names to avoid conflicts with other plugins. The recommended convention is to prefix parameters with your plugin's source name (e.g., myplugin_user_id ). The system reserves these parameter names: :actor , :actor_id , :action , and :filter_parent . You can also use return PermissionSQL.allow(reason=""reason goes here"") or PermissionSQL.deny(reason=""reason goes here"") as shortcuts for simple root-level allow or deny rules. These will create SQL snippets that look like this: SELECT NULL AS parent, NULL AS child, 1 AS allow, 'reason goes here' AS reason Or 0 AS allow for denies.",441, 629,"filters_from_request(request, database, table, datasette)","request - Request object The current HTTP request. database - string The name of the database. table - string The name of the table. datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. This hook runs on the table page, and can influence the where clause of the SQL query used to populate that page, based on query string arguments on the incoming request. 
The hook should return an instance of datasette.filters.FilterArguments which has one required and three optional arguments: return FilterArguments( where_clauses=[""id > :max_id""], params={""max_id"": 5}, human_descriptions=[""max_id is greater than 5""], extra_context={}, ) The arguments to the FilterArguments class constructor are as follows: where_clauses - list of strings, required A list of SQL fragments that will be inserted into the SQL query, joined by the and operator. These can include :named parameters which will be populated using data in params . params - dictionary, optional Additional keyword arguments to be used when the query is executed. These should match any :arguments in the where clauses. human_descriptions - list of strings, optional These strings will be included in the human-readable description at the top of the page and in the page <title> . extra_context - dictionary, optional Additional context variables that should be made available to the table.html template when it is rendered. This example plugin causes 0 results to be returned if ?_nothing=1 is added to the URL: from datasette import hookimpl from datasette.filters import FilterArguments @hookimpl def filters_from_request(request): if request.args.get(""_nothing""): return FilterArguments( [""1 = 0""], human_descriptions=[""NOTHING""] ) Example: datasette-leaflet-freedraw",441, 628,"jinja2_environment_from_request(datasette, request, env)","datasette - Datasette class A Datasette instance. request - Request object or None The current HTTP request, if one is available. env - Environment The Jinja2 environment that will be used to render the current page. This hook can be used to return a customized Jinja environment based on the incoming request. If you want to run a single Datasette instance that serves different content for different domains, you can do so like this: from datasette import hookimpl from jinja2 import ChoiceLoader, FileSystemLoader @hookimpl def jinja2_environment_from_request(request, env): if request and request.host == ""www.niche-museums.com"": return env.overlay( loader=ChoiceLoader( [ FileSystemLoader( ""/mnt/niche-museums/templates"" ), env.loader, ] ), enable_async=True, ) return env This uses the Jinja overlay() method to create a new environment identical to the default environment except for having a different template loader, which first looks in the /mnt/niche-museums/templates directory before falling back on the default loader.",441, 627,"actors_from_ids(datasette, actor_ids)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. actor_ids - list of strings or integers The actor IDs to look up. The hook must return a dictionary that maps the incoming actor IDs to their full dictionary representation. Some plugins that implement social features may store the ID of the actor that performed an action - added a comment, bookmarked a table or similar - and then need a way to resolve those IDs into display-friendly actor dictionaries later on. The await datasette.actors_from_ids(actor_ids) internal method can be used to look up actors from their IDs. It will dispatch to the first plugin that implements this hook. Unlike other plugin hooks, this only uses the first implementation of the hook to return a result. You can expect users to only have a single plugin installed that implements this hook. 
If no plugin is installed, Datasette defaults to returning actors that are just {""id"": actor_id} . The hook can return a dictionary or an awaitable function that then returns a dictionary. This example implementation returns actors from a database table: from datasette import hookimpl @hookimpl def actors_from_ids(datasette, actor_ids): db = datasette.get_database(""actors"") async def inner(): sql = ""select id, name from actors where id in ({})"".format( "", "".join(""?"" for _ in actor_ids) ) actors = {} for row in (await db.execute(sql, actor_ids)).rows: actor = dict(row) actors[actor[""id""]] = actor return actors return inner The returned dictionary from this example looks like this: { ""1"": {""id"": ""1"", ""name"": ""Tony""}, ""2"": {""id"": ""2"", ""name"": ""Tina""}, } These IDs could be integers or strings, depending on how the actors used by the Datasette instance are configured. Example: datasette-remote-actors",441, 626,"actor_from_request(datasette, request)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. request - Request object The current HTTP request. This is part of Datasette's authentication and permissions system . The function should attempt to authenticate an actor (either a user or an API actor of some sort) based on information in the request. If it cannot authenticate an actor, it should return None , otherwise it should return a dictionary representing that actor. Once a plugin has returned an actor from this hook other plugins will be ignored. Here's an example that authenticates the actor based on an incoming API key: from datasette import hookimpl import secrets SECRET_KEY = ""this-is-a-secret"" @hookimpl def actor_from_request(datasette, request): authorization = ( request.headers.get(""authorization"") or """" ) expected = ""Bearer {}"".format(SECRET_KEY) if secrets.compare_digest(authorization, expected): return {""id"": ""bot""} If you install this in your plugins directory you can test it like this: curl -H 'Authorization: Bearer this-is-a-secret' http://localhost:8003/-/actor.json Instead of returning a dictionary, this function can return an awaitable function which itself returns either None or a dictionary. This is useful for authentication functions that need to make a database query - for example: from datasette import hookimpl @hookimpl def actor_from_request(datasette, request): async def inner(): token = request.args.get(""_token"") if not token: return None # Look up ?_token=xxx in sessions table result = await datasette.get_database().execute( ""select count(*) from sessions where token = ?"", [token], ) if result.first()[0]: return {""token"": token} else: return None return inner Examples: datasette-auth-tokens , datasette-auth-passwords",441, 625,"canned_queries(datasette, database, actor)","datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. database - string The name of the database. actor - dictionary or None The currently authenticated actor . Use this hook to return a dictionary of additional canned query definitions for the specified database. The return value should be the same shape as the JSON described in the canned query documentation. 
from datasette import hookimpl @hookimpl def canned_queries(datasette, database): if database == ""mydb"": return { ""my_query"": { ""sql"": ""select * from my_table where id > :min_id"" } } The hook can alternatively return an awaitable function that returns a dictionary. Here's an example that returns queries that have been stored in the saved_queries database table, if one exists: from datasette import hookimpl @hookimpl def canned_queries(datasette, database): async def inner(): db = datasette.get_database(database) if await db.table_exists(""saved_queries""): results = await db.execute( ""select name, sql from saved_queries"" ) return { result[""name""]: {""sql"": result[""sql""]} for result in results } return inner The actor parameter can be used to include the currently authenticated actor in your decision. Here's an example that returns saved queries that were saved by that actor: from datasette import hookimpl @hookimpl def canned_queries(datasette, database, actor): async def inner(): db = datasette.get_database(database) if actor is not None and await db.table_exists( ""saved_queries"" ): results = await db.execute( ""select name, sql from saved_queries where actor_id = :id"", {""id"": actor[""id""]}, ) return { result[""name""]: {""sql"": result[""sql""]} for result in results } return inner Example: datasette-saved-queries",441, 624,startup(datasette),"This hook fires when the Datasette application server first starts up. Here is an example that validates required plugin configuration. The server will fail to start and show an error if the validation check fails: @hookimpl def startup(datasette): config = datasette.plugin_config(""my-plugin"") or {} assert ( ""required-setting"" in config ), ""my-plugin requires setting required-setting"" You can also return an async function, which will be awaited on startup. Use this option if you need to execute any database queries, for example this function which creates the my_table database table if it does not yet exist: @hookimpl def startup(datasette): async def inner(): db = datasette.get_database() if ""my_table"" not in await db.table_names(): await db.execute_write( """""" create table my_table (mycol text) """""" ) return inner Potential use-cases: Run some initialization code for the plugin Create database tables that a plugin needs on startup Validate the configuration for a plugin on startup, and raise an error if it is invalid If you are writing unit tests for a plugin that uses this hook and doesn't exercise Datasette by sending any simulated requests through it, you will need to explicitly call await ds.invoke_startup() in your tests. An example: @pytest.mark.asyncio async def test_my_plugin(): ds = Datasette() await ds.invoke_startup() # Rest of test goes here Examples: datasette-saved-queries , datasette-init",441, 623,asgi_wrapper(datasette),"Return an ASGI middleware wrapper function that will be applied to the Datasette ASGI application. This is a very powerful hook. You can use it to manipulate the entire Datasette response, or even to configure new URL routes that will be handled by your own custom code. You can write your ASGI code directly against the low-level specification, or you can use the middleware utilities provided by an ASGI framework such as Starlette . 
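As a minimal sketch of the required shape (a do-nothing pass-through wrapper, shown for illustration rather than taken from the documentation), the hook returns a function that accepts the ASGI application and returns a replacement for it: from datasette import hookimpl from functools import wraps @hookimpl def asgi_wrapper(datasette): def wrap(app): @wraps(app) async def wrapped_app(scope, receive, send): # A real middleware would inspect or modify scope, receive or send here await app(scope, receive, send) return wrapped_app return wrap 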
This example plugin adds an x-databases HTTP header listing the currently attached databases: from datasette import hookimpl from functools import wraps @hookimpl def asgi_wrapper(datasette): def wrap_with_databases_header(app): @wraps(app) async def add_x_databases_header( scope, receive, send ): async def wrapped_send(event): if event[""type""] == ""http.response.start"": original_headers = ( event.get(""headers"") or [] ) event = { ""type"": event[""type""], ""status"": event[""status""], ""headers"": original_headers + [ [ b""x-databases"", "", "".join( datasette.databases.keys() ).encode(""utf-8""), ] ], } await send(event) await app(scope, receive, wrapped_send) return add_x_databases_header return wrap_with_databases_header Examples: datasette-cors , datasette-pyinstrument , datasette-total-page-time",441, 622,The resources_sql() classmethod,"The resources_sql() classmethod returns a SQL query that lists all resources of that type that exist in the system. This query is used by Datasette to efficiently check permissions across multiple resources at once. When a user requests a list of resources (like tables, documents, or other entities), Datasette uses this SQL to: Get all resources of this type from your data catalog Combine it with permission rules from the permission_resources_sql hook Use SQL joins and filtering to determine which resources the actor can access Return only the permitted resources The SQL query must return exactly two columns: parent - The parent identifier (e.g., database name, collection name), or NULL for top-level resources child - The child identifier (e.g., table name, document ID), or NULL for parent-only resources For example, if you're building a document management plugin with collections and documents stored in a documents table, your resources_sql() might look like: @classmethod def resources_sql(cls) -> str: return """""" SELECT collection_name AS parent, document_id AS child FROM documents """""" This tells Datasette ""here's how to find all documents in the system - look in the documents table and get the collection name and document ID for each one."" The permission system then uses this query along with rules from plugins to determine which documents each user can access, all efficiently in SQL rather than loading everything into Python.",441, 621,register_actions(datasette),"If your plugin needs to register actions that can be checked with Datasette's new resource-based permission system, return a list of those actions from this hook. Actions define what operations can be performed on resources (like viewing a table, executing SQL, or custom plugin actions). 
from datasette import hookimpl from datasette.permissions import Action, Resource class DocumentCollectionResource(Resource): """"""A collection of documents."""""" name = ""document-collection"" parent_name = None def __init__(self, collection: str): super().__init__(parent=collection, child=None) @classmethod def resources_sql(cls) -> str: return """""" SELECT collection_name AS parent, NULL AS child FROM document_collections """""" class DocumentResource(Resource): """"""A document in a collection."""""" name = ""document"" parent_name = ""document-collection"" def __init__(self, collection: str, document: str): super().__init__(parent=collection, child=document) @classmethod def resources_sql(cls) -> str: return """""" SELECT collection_name AS parent, document_id AS child FROM documents """""" @hookimpl def register_actions(datasette): return [ Action( name=""list-documents"", abbr=""ld"", description=""List documents in a collection"", resource_class=DocumentCollectionResource, ), Action( name=""view-document"", abbr=""vdoc"", description=""View document"", resource_class=DocumentResource, ), Action( name=""edit-document"", abbr=""edoc"", description=""Edit document"", resource_class=DocumentResource, ), ] The fields of the Action dataclass are as follows: name - string The name of the action, e.g. view-document . This should be unique across all plugins. abbr - string or None An abbreviation of the action, e.g. vdoc . This is optional. Since this needs to be unique across all installed plugins it's best to choose carefully or omit it entirely (the same as setting it to None ). description - string or None A human-readable description of what the action allows you to do. resource_class - type[Resource] or None The Resource subclass that defines what kind of resource this action applies to. Omit this (or set to None ) for global actions that apply only at the instance level with no associated resources (like debug-menu or permissions-debug ). Your Resource subclass must: Define a name class attribute (e.g., ""document"" ) Define a parent_name class attribute ( None for top-level resources like databases, or the name of the parent resource type for child resources, as in the example above) Implement a resources_sql() classmethod that returns SQL listing all resources as (parent, child) columns Have an __init__ method that accepts appropriate parameters and calls super().__init__(parent=..., child=...)
facet_results_values.append( { ""value"": value, ""label"": label, ""count"": count, ""toggle_url"": self.ds.absolute_url( self.request, toggle_path ), ""selected"": selected, } ) facet_results.append( { ""name"": column, ""results"": facet_results_values, ""truncated"": len(facet_rows_results) > facet_size, } ) except QueryInterrupted: facets_timed_out.append(column) return facet_results, facets_timed_out See datasette/facets.py for examples of how these classes can work. The plugin hook can then be used to register the new facet class like this: @hookimpl def register_facet_classes(): return [SpecialFacet]",441, 619,register_commands(cli),"cli - the root Datasette Click command group Use this to register additional CLI commands Register additional CLI commands that can be run using datasette yourcommand ... . This provides a mechanism by which plugins can add new CLI commands to Datasette. This example registers a new datasette verify file1.db file2.db command that checks if the provided file paths are valid SQLite databases: from datasette import hookimpl import click import sqlite3 @hookimpl def register_commands(cli): @cli.command() @click.argument( ""files"", type=click.Path(exists=True), nargs=-1 ) def verify(files): ""Verify that files can be opened by Datasette"" for file in files: conn = sqlite3.connect(str(file)) try: conn.execute(""select * from sqlite_master"") except sqlite3.DatabaseError: raise click.ClickException( ""Invalid database: {}"".format(file) ) The new command can then be executed like so: datasette verify fixtures.db Help text (from the docstring for the function plus any defined Click arguments or options) will become available using: datasette verify --help Plugins can register multiple commands by making multiple calls to the @cli.command() decorator. Consult the Click documentation for full details on how to build a CLI command, including how to define arguments and options. Note that register_commands() plugins cannot be used with the --plugins-dir mechanism - they need to be installed into the same virtual environment as Datasette using pip install . Provided it has a pyproject.toml file (see Packaging a plugin ) you can run pip install directly against the directory in which you are developing your plugin like so: pip install -e path/to/my/datasette-plugin Examples: datasette-auth-passwords , datasette-verify",441, 618,register_routes(datasette),"datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) Register additional view functions to execute for specified URL routes. Return a list of (regex, view_function) pairs, something like this: from datasette import hookimpl, Response import html async def hello_from(request): name = request.url_vars[""name""] return Response.html( ""Hello from {}"".format(html.escape(name)) ) @hookimpl def register_routes(): return [(r""^/hello-from/(?P<name>.*)$"", hello_from)] The view functions can take a number of different optional arguments. The corresponding argument will be passed to your function depending on its named parameters - a form of dependency injection. The optional view function arguments are as follows: datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. request - Request object The current HTTP request. scope - dictionary The incoming ASGI scope dictionary. send - function The ASGI send function. 
receive - function The ASGI receive function. The view function can be a regular function or an async def function, depending on whether it needs to use any await APIs. The function can either return a Response class or it can return nothing and instead respond directly to the request using the ASGI send function (for advanced uses only). It can also raise the datasette.NotFound exception to return a 404 not found error, or the datasette.Forbidden exception for a 403 forbidden. See Designing URLs for your plugin for tips on designing the URL routes used by your plugin. Examples: datasette-auth-github , datasette-psutil",441, 617,register_output_renderer(datasette),"datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) Registers a new output renderer, to output data in a custom format. The hook function should return a dictionary, or a list of dictionaries, of the following shape: @hookimpl def register_output_renderer(datasette): return { ""extension"": ""test"", ""render"": render_demo, ""can_render"": can_render_demo, # Optional } This will register render_demo to be called when paths with the extension .test (for example /database.test , /database/table.test , or /database/table/row.test ) are requested. render_demo is a Python function. It can be a regular function or an async def render_demo() awaitable function, depending on whether it needs to make any asynchronous calls. can_render_demo is a Python function (or async def function) which accepts the same arguments as render_demo but just returns True or False . It lets Datasette know if the current SQL query can be represented by the plugin - and hence influences whether a link to this output format is displayed in the user interface. If you omit the ""can_render"" key from the dictionary, every query will be treated as being supported by the plugin. When a request is received, the ""render"" callback function is called with zero or more of the following arguments. Datasette will inspect your callback function and pass arguments that match its function signature. datasette - Datasette class For accessing plugin configuration and executing queries. columns - list of strings The names of the columns returned by this query. rows - list of sqlite3.Row objects The rows returned by the query. sql - string The SQL query that was executed. query_name - string or None If this was the execution of a canned query , the name of that query. database - string The name of the database. table - string or None The table or view, if one is being rendered. request - Request object The current HTTP request. error - string or None If an error occurred this string will contain the error message. truncated - bool or None If the query response was truncated - for example a SQL query returning more than 1,000 results where pagination is not available - this will be True . view_name - string The name of the current view being called. index , database , table , and row are the most important ones. The callback function can return None , if it is unable to render the data, or a Response class that will be returned to the caller. It can also return a dictionary with the following keys. This format is deprecated as-of Datasette 0.49 and will be removed by Datasette 1.0. 
body - string or bytes, optional The response body, default empty content_type - string, optional The Content-Type header, default text/plain status_code - integer, optional The HTTP status code, default 200 headers - dictionary, optional Extra HTTP headers to be returned in the response. An example of an output renderer callback function: def render_demo(): return Response.text(""Hello World"") Here is a more complex example: async def render_demo(datasette, columns, rows): db = datasette.get_database() result = await db.execute(""select sqlite_version()"") first_row = "" | "".join(columns) lines = [first_row] lines.append(""="" * len(first_row)) for row in rows: lines.append("" | "".join(row)) return Response( ""\n"".join(lines), content_type=""text/plain; charset=utf-8"", headers={""x-sqlite-version"": result.first()[0]}, ) And here is an example can_render function which returns True only if the query results contain the columns atom_id , atom_title and atom_updated : def can_render_demo(columns): return { ""atom_id"", ""atom_title"", ""atom_updated"", }.issubset(columns) Examples: datasette-atom , datasette-ics , datasette-geojson , datasette-copyable",441, 616,"render_cell(row, value, column, table, database, datasette, request)","Lets you customize the display of values within table cells in the HTML table view. row - sqlite.Row The SQLite row object that the value being rendered is part of value - string, integer, float, bytes or None The value that was loaded from the database column - string The name of the column being rendered table - string or None The name of the table - or None if this is a custom SQL query database - string The name of the database datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. request - Request object The current request object If your hook returns None , it will be ignored. Use this to indicate that your hook is not able to custom render this particular value. If the hook returns a string, that string will be rendered in the table cell. If you want to return HTML markup you can do so by returning a jinja2.Markup object. You can also return an awaitable function which returns a value. Datasette will loop through all available render_cell hooks and display the value returned by the first one that does not return None . 
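To illustrate the awaitable form, here is a minimal sketch that renders a hypothetical owner_id column by looking up a label in a hypothetical owners table - both names are illustrative, not part of Datasette:

from datasette import hookimpl


@hookimpl
def render_cell(value, column, datasette):
    # Only handle the hypothetical owner_id column; returning None
    # lets Datasette fall through to the next render_cell hook
    if column != "owner_id":
        return None

    async def inner():
        db = datasette.get_database()
        result = await db.execute(
            "select name from owners where id = ?", [value]
        )
        row = result.first()
        return row[0] if row else None

    return inner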
Here is an example of a custom render_cell() plugin which looks for values that are a JSON string matching the following format: {""href"": ""https://www.example.com/"", ""label"": ""Name""} If the value matches that pattern, the plugin returns an HTML link element: from datasette import hookimpl import markupsafe import json @hookimpl def render_cell(value): # Render {""href"": ""..."", ""label"": ""...""} as link if not isinstance(value, str): return None stripped = value.strip() if not ( stripped.startswith(""{"") and stripped.endswith(""}"") ): return None try: data = json.loads(value) except ValueError: return None if not isinstance(data, dict): return None if set(data.keys()) != {""href"", ""label""}: return None href = data[""href""] if not ( href.startswith(""/"") or href.startswith(""http://"") or href.startswith(""https://"") ): return None return markupsafe.Markup( '<a href=""{href}"">{label}</a>'.format( href=markupsafe.escape(data[""href""]), label=markupsafe.escape(data[""label""] or """") or "" "", ) ) Examples: datasette-render-binary , datasette-render-markdown , datasette-json-html",441, 615,publish_subcommand(publish),"publish - Click publish command group The Click command group for the datasette publish subcommand This hook allows you to create new providers for the datasette publish command. Datasette uses this hook internally to implement the default cloudrun and heroku subcommands, so you can read their source to see examples of this hook in action. Let's say you want to build a plugin that adds a datasette publish my_hosting_provider --api_key=xxx mydatabase.db publish command. Your implementation would start like this: from datasette import hookimpl from datasette.publish.common import ( add_common_publish_arguments_and_options, ) import click @hookimpl def publish_subcommand(publish): @publish.command() @add_common_publish_arguments_and_options @click.option( ""-k"", ""--api_key"", help=""API key for talking to my hosting provider"", ) def my_hosting_provider( files, metadata, extra_options, branch, template_dir, plugins_dir, static, install, plugin_secret, version_note, secret, title, license, license_url, source, source_url, about, about_url, api_key, ): ... Examples: datasette-publish-fly , datasette-publish-vercel",441, 614,"extra_body_script(template, database, table, columns, view_name, request, datasette)","Extra JavaScript to be added to a <script> block at the end of the <body> element on the page. This takes the same arguments as extra_template_vars(...) The template , database , table and view_name options can be used to return different code depending on which template is being rendered and which database or table are being processed. The datasette instance is provided primarily so that you can consult any plugin configuration options that may have been set, using the datasette.plugin_config(plugin_name) method documented above. This function can return a string containing JavaScript, or a dictionary as described below, or a function or awaitable function that returns a string or dictionary. 
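The string form is the simplest. A minimal sketch, assuming you only want the script on table pages (the view_name check is illustrative):

from datasette import hookimpl


@hookimpl
def extra_body_script(view_name):
    # Only add this JavaScript to table pages; returning None
    # contributes nothing for other pages
    if view_name == "table":
        return "console.log('table page loaded');"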
Use a dictionary if you want to specify that the code should be placed in a <script type=""module"">...</script> element: @hookimpl def extra_body_script(): return { ""module"": True, ""script"": ""console.log('Your JavaScript goes here...')"", } This will add the following to the end of your page: <script type=""module"">console.log('Your JavaScript goes here...')</script> Example: datasette-cluster-map",441, 613,"extra_js_urls(template, database, table, columns, view_name, request, datasette)","This takes the same arguments as extra_template_vars(...) This works in the same way as extra_css_urls() but for JavaScript. You can return a list of URLs, a list of dictionaries or an awaitable function that returns those things: from datasette import hookimpl @hookimpl def extra_js_urls(): return [ { ""url"": ""https://code.jquery.com/jquery-3.3.1.slim.min.js"", ""sri"": ""sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo"", } ] You can also return URLs to files from your plugin's static/ directory, if you have one: @hookimpl def extra_js_urls(): return [""/-/static-plugins/your-plugin/app.js""] Note that your-plugin here should be the hyphenated plugin name - the name that is displayed in the list on the /-/plugins debug page. If your code uses JavaScript modules you should include the ""module"": True key. See Custom CSS and JavaScript for more details. @hookimpl def extra_js_urls(): return [ { ""url"": ""/-/static-plugins/your-plugin/app.js"", ""module"": True, } ] Examples: datasette-cluster-map , datasette-vega",441, 612,"extra_css_urls(template, database, table, columns, view_name, request, datasette)","This takes the same arguments as extra_template_vars(...) Return a list of extra CSS URLs that should be included on the page. These can take advantage of the CSS class hooks described in Custom pages and templates . This can be a list of URLs: from datasette import hookimpl @hookimpl def extra_css_urls(): return [ ""https://stackpath.bootstrapcdn.com/bootstrap/4.1.0/css/bootstrap.min.css"" ] Or a list of dictionaries defining both a URL and an SRI hash : @hookimpl def extra_css_urls(): return [ { ""url"": ""https://stackpath.bootstrapcdn.com/bootstrap/4.1.0/css/bootstrap.min.css"", ""sri"": ""sha384-9gVQ4dYFwwWSjIDZnLEWnxCjeSWFphJiwGPXr1jddIhOegiu1FwO5qRGvFXOdJZ4"", } ] This function can also return an awaitable function, useful if it needs to run any async code: @hookimpl def extra_css_urls(datasette): async def inner(): db = datasette.get_database() results = await db.execute( ""select url from css_files"" ) return [r[0] for r in results] return inner Examples: datasette-cluster-map , datasette-vega",441, 611,"extra_template_vars(template, database, table, columns, view_name, request, datasette)","Extra template variables that should be made available in the rendered template context. template - string The template that is being rendered, e.g. database.html database - string or None The name of the database, or None if the page does not correspond to a database (e.g. the root page) table - string or None The name of the table, or None if the page does not correspond to a table columns - list of strings or None The names of the database columns that will be displayed on this page. None if the page does not contain a table. view_name - string The name of the view being displayed. ( index , database , table , and row are the most important ones.) request - Request object or None The current HTTP request. This can be None if the request object is not available.
datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) This hook can return one of three different types: Dictionary If you return a dictionary its keys and values will be merged into the template context. Function that returns a dictionary If you return a function it will be executed. If it returns a dictionary those values will be merged into the template context. Function that returns an awaitable function that returns a dictionary You can also return a function which returns an awaitable function which returns a dictionary. Datasette runs Jinja2 in async mode , which means you can add awaitable functions to the template scope and they will be automatically awaited when they are rendered by the template. Here's an example plugin that adds a ""user_agent"" variable to the template context containing the current request's User-Agent header: @hookimpl def extra_template_vars(request): return {""user_agent"": request.headers.get(""user-agent"")} This example returns an awaitable function which adds a list of hidden_table_names to the context: @hookimpl def extra_template_vars(datasette, database): async def hidden_table_names(): if database: db = datasette.databases[database] return { ""hidden_table_names"": await db.hidden_table_names() } else: return {} return hidden_table_names And here's an example which adds a sql_first(sql_query) function which executes a SQL statement and returns the first column of the first row of results: @hookimpl def extra_template_vars(datasette, database): async def sql_first(sql, dbname=None): dbname = ( dbname or database or next(iter(datasette.databases.keys())) ) result = await datasette.execute(dbname, sql) return result.rows[0][0] return {""sql_first"": sql_first} You can then use the new function in a template like so: SQLite version: {{ sql_first(""select sqlite_version()"") }} Examples: datasette-search-all , datasette-template-sql",441, 610,Page extras,These plugin hooks can be used to affect the way HTML pages for different Datasette interfaces are rendered.,441, 609,"prepare_jinja2_environment(env, datasette)","env - jinja2 Environment The template environment that is being prepared datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) This hook is called with the Jinja2 environment that is used to evaluate Datasette HTML templates. You can use it to do things like register custom template filters , for example: from datasette import hookimpl @hookimpl def prepare_jinja2_environment(env): env.filters[""uppercase""] = lambda u: u.upper() You can now use this filter in your custom templates like so: Table name: {{ table|uppercase }} This function can return an awaitable function if it needs to run any async code. Examples: datasette-edit-templates",441, 608,"prepare_connection(conn, database, datasette)","conn - sqlite3 connection object The connection that is being opened database - string The name of the database datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) This hook is called when a new SQLite database connection is created. You can use it to register custom SQL functions , aggregates and collations.
For example: from datasette import hookimpl import random @hookimpl def prepare_connection(conn): conn.create_function( ""random_integer"", 2, random.randint ) This registers a SQL function called random_integer which takes two arguments and can be called like this: select random_integer(1, 10); prepare_connection() hooks are not called for Datasette's internal database . Examples: datasette-jellyfish , datasette-jq , datasette-haversine , datasette-rure",441, 607,Plugin hooks,"Datasette plugins use plugin hooks to customize Datasette's behavior. These hooks are powered by the pluggy plugin system. Each plugin can implement one or more hooks using the @hookimpl decorator against a function whose name matches one of the hooks documented on this page. When you implement a plugin hook you can accept any or all of the parameters that are documented as being passed to that hook. For example, you can implement the render_cell plugin hook like this even though the full documented hook signature is render_cell(row, value, column, table, database, datasette) : @hookimpl def render_cell(value, column): if column == ""stars"": return ""*"" * int(value) List of plugin hooks prepare_connection(conn, database, datasette) prepare_jinja2_environment(env, datasette) Page extras extra_template_vars(template, database, table, columns, view_name, request, datasette) extra_css_urls(template, database, table, columns, view_name, request, datasette) extra_js_urls(template, database, table, columns, view_name, request, datasette) extra_body_script(template, database, table, columns, view_name, request, datasette) publish_subcommand(publish) render_cell(row, value, column, table, database, datasette, request) register_output_renderer(datasette) register_routes(datasette) register_commands(cli) register_facet_classes() register_actions(datasette) The resources_sql() method asgi_wrapper(datasette) startup(datasette) canned_queries(datasette, database, actor) actor_from_request(datasette, request) actors_from_ids(datasette, actor_ids) jinja2_environment_from_request(datasette, request, env) filters_from_request(request, database, table, datasette) permission_resources_sql(datasette, actor, action) Permission plugin examples Allow Alice to view a specific table Restrict execute-sql to a database prefix Read permissions from a custom table Default deny with an exception register_magic_parameters(datasette) forbidden(datasette, request, message) handle_exception(datasette, request, exception) skip_csrf(datasette, scope) menu_links(datasette, actor, request) Action hooks table_actions(datasette, actor, database, table, request) view_actions(datasette, actor, database, view, request) query_actions(datasette, actor, database, query_name, request, sql, params) row_actions(datasette, actor, request, database, table, row) database_actions(datasette, actor, database, request) homepage_actions(datasette, actor, request) Template slots top_homepage(datasette, request) top_database(datasette, request, database) top_table(datasette, request, database, table) top_row(datasette, request, database, table, row) top_query(datasette, request, database, sql) top_canned_query(datasette, request, database, query_name) Event tracking track_event(datasette, event) register_events(datasette)",441, 606,Streaming all records,"The stream all rows option is designed to be as efficient as possible - under the hood it takes advantage of Python 3 asyncio capabilities and Datasette's efficient pagination to stream back the full CSV file.
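For example, with a hypothetical local database and table, streaming a complete export to disk looks like this: curl 'http://localhost:8001/mydatabase/mytable.csv?_stream=on' > mytable.csv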
Since databases can get pretty large, by default this option is capped at 100MB - if a table returns more than 100MB of data the last line of the CSV will be a truncation error message. You can increase or remove this limit using the max_csv_mb config setting. You can also disable the CSV export feature entirely using allow_csv_stream .",441, 605,URL parameters,"The following options can be used to customize the CSVs returned by Datasette. ?_header=off This removes the first row of the CSV file specifying the headings - only the row data will be returned. ?_stream=on Stream all matching records, not just the first page of results. See below. ?_dl=on Causes Datasette to return a content-disposition: attachment; filename=""filename.csv"" header.",441, 604,CSV export,"Any Datasette table, view or custom SQL query can be exported as CSV. To obtain the CSV representation of the table you are looking at, click the ""this data as CSV"" link. You can also use the advanced export form for more control over the resulting file, which looks like this and has the following options: download file - instead of displaying CSV in your browser, this forces your browser to download the CSV to your downloads directory. expand labels - if your table has any foreign key references this option will cause the CSV to gain additional COLUMN_NAME_label columns with a label for each foreign key derived from the linked table. In this example the city_id column is accompanied by a city_id_label column. stream all rows - by default CSV files only contain the first max_returned_rows records. This option will cause Datasette to loop through every matching record and return them as a single CSV file. You can try that out on https://latest.datasette.io/fixtures/facetable?_size=4",441, 603,Facet by date,"If Datasette finds any columns that contain dates in the first 100 values, it will offer a faceting interface against the dates of those values. This works especially well against timestamp values such as 2019-03-01 12:44:00 . Example here: latest.datasette.io/fixtures/facetable?_facet_date=created",441, 602,Facet by JSON array,"If your SQLite installation provides the json1 extension (you can check using /-/versions ) Datasette will automatically detect columns that contain JSON arrays of values and offer a faceting interface against those columns. This is useful for modelling things like tags without needing to break them out into a new table. Example here: latest.datasette.io/fixtures/facetable?_facet_array=tags",441, 601,Speeding up facets with indexes,"The performance of facets can be greatly improved by adding indexes on the columns you wish to facet by. Adding indexes can be performed using the sqlite3 command-line utility. Here's how to add an index on the state column in a table called Food_Trucks : sqlite3 mydatabase.db SQLite version 3.19.3 2017-06-27 16:48:08 Enter "".help"" for usage hints. sqlite> CREATE INDEX Food_Trucks_state ON Food_Trucks(""state""); Or using the sqlite-utils command-line utility: sqlite-utils create-index mydatabase.db Food_Trucks state",441, 600,Suggested facets,"Datasette's table UI will suggest facets for the user to apply, based on the following criteria: For the currently filtered data are there any columns which, if applied as a facet...
Will return 30 or fewer unique options Will return more than one unique option Will return fewer unique options than the total number of filtered rows And the query used to evaluate these criteria can be completed in under 50ms That last point is particularly important: Datasette runs a query for every column that is displayed on a page, which could get expensive - so to avoid slow load times it sets a time limit of just 50ms for each of those queries. This means suggested facets are unlikely to appear for tables with millions of records in them.",441, 599,Facets in metadata,"You can turn facets on by default for specific tables by adding them to a ""facets"" key in a Datasette Metadata file. Here's an example that turns on faceting by default for the qLegalStatus column in the Street_Tree_List table in the sf-trees database: [[[cog from metadata_doc import metadata_example metadata_example(cog, { ""databases"": { ""sf-trees"": { ""tables"": { ""Street_Tree_List"": { ""facets"": [""qLegalStatus""] } } } } }) ]]] [[[end]]] Facets defined in this way will always be shown in the interface and returned in the API, regardless of the _facet arguments passed to the view. You can specify array or date facets in metadata using JSON objects with a single key of array or date and a value specifying the column, like this: [[[cog metadata_example(cog, { ""facets"": [ {""array"": ""tags""}, {""date"": ""created""} ] }) ]]] [[[end]]] You can change the default facet size (the number of results shown for each facet) for a table using facet_size : [[[cog metadata_example(cog, { ""databases"": { ""sf-trees"": { ""tables"": { ""Street_Tree_List"": { ""facets"": [""qLegalStatus""], ""facet_size"": 10 } } } } }) ]]] [[[end]]]",441, 598,Facets in query strings,"To turn on faceting for specific columns on a Datasette table view, add one or more _facet=COLUMN parameters to the URL. For example, if you want to turn on facets for the city_id and state columns, construct a URL that looks like this: /dbname/tablename?_facet=state&_facet=city_id This works for both the HTML interface and the .json view.
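For example, here is a sketch that reads those facet counts programmatically using httpx, assuming a local Datasette instance and the same hypothetical database and table (the facet_results block it reads is described next):

import httpx

response = httpx.get(
    "http://localhost:8001/dbname/tablename.json",
    params=[("_facet", "state"), ("_facet", "city_id")],
)
# Each faceted column appears as a key in the facet_results block
for name, facet in response.json()["facet_results"].items():
    for result in facet["results"]:
        print(name, result["value"], result["count"])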
When enabled, facets will cause a facet_results block to be added to the JSON output, looking something like this: { ""state"": { ""name"": ""state"", ""results"": [ { ""value"": ""CA"", ""label"": ""CA"", ""count"": 10, ""toggle_url"": ""http://...?_facet=city_id&_facet=state&state=CA"", ""selected"": false }, { ""value"": ""MI"", ""label"": ""MI"", ""count"": 4, ""toggle_url"": ""http://...?_facet=city_id&_facet=state&state=MI"", ""selected"": false }, { ""value"": ""MC"", ""label"": ""MC"", ""count"": 1, ""toggle_url"": ""http://...?_facet=city_id&_facet=state&state=MC"", ""selected"": false } ], ""truncated"": false }, ""city_id"": { ""name"": ""city_id"", ""results"": [ { ""value"": 1, ""label"": ""San Francisco"", ""count"": 6, ""toggle_url"": ""http://...?_facet=city_id&_facet=state&city_id=1"", ""selected"": false }, { ""value"": 2, ""label"": ""Los Angeles"", ""count"": 4, ""toggle_url"": ""http://...?_facet=city_id&_facet=state&city_id=2"", ""selected"": false }, { ""value"": 3, ""label"": ""Detroit"", ""count"": 4, ""toggle_url"": ""http://...?_facet=city_id&_facet=state&city_id=3"", ""selected"": false }, { ""value"": 4, ""label"": ""Memnonia"", ""count"": 1, ""toggle_url"": ""http://...?_facet=city_id&_facet=state&city_id=4"", ""selected"": false } ], ""truncated"": false } } If Datasette detects that a column is a foreign key, the ""label"" property will be automatically derived from the detected label column on the referenced table. The default number of facet results returned is 30, controlled by the default_facet_size setting. You can increase this on an individual page by adding ?_facet_size=100 to the query string, up to a maximum of max_returned_rows (which defaults to 1000).",441, 597,Facets,"Datasette facets can be used to add a faceted browse interface to any database table. With facets, tables are displayed along with a summary showing the most common values in specified columns. These values can be selected to further filter the table. Here's an example : Facets can be specified in two ways: using query string parameters, or in metadata.json configuration for the table.",441, 596,Table schema,"Use /database-name/table-name/-/schema to see the schema for a specific table. The .md and .json extensions work here too. The JSON returns an object with ""database"" , ""table"" , and ""schema"" keys.",441, 595,Database schema,"Use /database-name/-/schema to see the complete schema for a specific database. The .md and .json extensions work here too. The JSON returns an object with ""database"" and ""schema"" keys.",441, 594,Instance schema,"Access /-/schema to see the complete schema for all attached databases in the Datasette instance. Use /-/schema.md to get the same information as Markdown. Use /-/schema.json to get the same information as JSON, which looks like this: { ""schemas"": [ { ""database"": ""content"", ""schema"": ""create table posts ..."" } ] }",441, 593,Schemas,Datasette offers /-/schema endpoints to expose the SQL schema for databases and tables.,441, 592,Row,"Every row in every Datasette table has its own URL. This means individual records can be linked to directly. Table cells with extremely long text contents are truncated on the table view according to the truncate_cells_html setting. If a cell has been truncated the full length version of that cell will be available on the row page. Rows which are the targets of foreign key references from other tables will show a link to a filtered search for all records that reference that row.
Here's an example from the Registers of Members Interests database: ../people/uk~2Eorg~2Epublicwhip~2Fperson~2F10001 Note that this URL includes the encoded primary key of the record. Here's that same page as JSON: ../people/uk~2Eorg~2Epublicwhip~2Fperson~2F10001.json",441, 591,Table,"The table page is the heart of Datasette: it allows users to interactively explore the contents of a database table, including sorting, filtering, Full-text search and applying Facets . The HTML interface is worth spending some time exploring. As with other pages, you can return the JSON data by appending .json to the URL path, before any ? query string arguments. The query string arguments are described in more detail here: Table arguments You can also use the table page to interactively construct a SQL query - by applying different filters and a sort order for example - and then click the ""View and edit SQL"" link to see the SQL query that was used for the page and edit and re-submit it. Some examples: ../items lists all of the line-items registered by UK MPs as potential conflicts of interest. It demonstrates Datasette's support for Full-text search . ../antiquities-act%2Factions_under_antiquities_act is an interface for exploring the ""actions under the antiquities act"" data table published by FiveThirtyEight. ../global-power-plants?country_long=United+Kingdom&primary_fuel=Gas is a filtered table page showing every Gas power plant in the United Kingdom. It includes some default facets (configured using its metadata.json ) and uses the datasette-cluster-map plugin to show a map of the results.",441, 590,Queries,"The /database-name/-/query page can be used to execute an arbitrary SQL query against that database, if the execute-sql permission is enabled. This query is passed as the ?sql= query string parameter. This means you can link directly to a query by constructing the following URL: /database-name/-/query?sql=SELECT+*+FROM+table_name Each configured canned query has its own page, at /database-name/query-name . Viewing this page will execute the query and display the results. In both cases adding a .json extension to the URL will return the results as JSON.",441, 589,Hidden tables,"Some tables listed on the database page are treated as hidden. Hidden tables are not completely invisible - they can be accessed through the ""hidden tables"" link at the bottom of the page. They are hidden because they represent low-level implementation details which are generally not useful to end-users of Datasette. The following tables are hidden by default: Any table with a name that starts with an underscore - this is a Datasette convention to help plugins easily hide their own internal tables. Tables that have been configured as ""hidden"": true using Hiding tables . *_fts tables that implement SQLite full-text search indexes. Tables relating to the inner workings of the SpatiaLite SQLite extension. sqlite_stat tables used to store statistics used by the query optimizer.",441, 588,Database,"Each database has a page listing the tables, views and canned queries available for that database. If the execute-sql permission is enabled (it's on by default) there will also be an interface for executing arbitrary SQL select queries against the data. 
Examples: fivethirtyeight.datasettes.com/fivethirtyeight datasette.io/global-power-plants The JSON version of this page provides programmatic access to the underlying data: fivethirtyeight.datasettes.com/fivethirtyeight.json datasette.io/global-power-plants.json",441, 587,Top-level index,"The root page of any Datasette installation is an index page that lists all of the currently attached databases. Some examples: fivethirtyeight.datasettes.com register-of-members-interests.datasettes.com Add /.json to the end of the URL for the JSON version of the underlying data: fivethirtyeight.datasettes.com/.json register-of-members-interests.datasettes.com/.json The index page can also be accessed at /-/ , useful if the default index page has been replaced using an index.html custom template . The /-/ page will always render the default Datasette index.html template.",441, 586,Pages and API endpoints,"The Datasette web application offers a number of different pages that can be accessed to explore the data in question, each of which is accompanied by an equivalent JSON API.",441, 585,datasette-hashed-urls,"If you open a database file in immutable mode using the -i option, you can be assured that the content of that database will not change for the lifetime of the Datasette server. The datasette-hashed-urls plugin implements an optimization where your database is served with part of the SHA-256 hash of the database contents baked into the URL. A database at /fixtures will instead be served at /fixtures-aa7318b , and a year-long cache expiry header will be returned with those pages. This will then be cached by both browsers and caching proxies such as Cloudflare or Fastly, providing a potentially significant performance boost. To install the plugin, run the following: datasette install datasette-hashed-urls Prior to Datasette 0.61 hashed URL mode was a core Datasette feature, enabled using the hash_urls setting. This implementation has now been removed in favor of the datasette-hashed-urls plugin. Prior to Datasette 0.28 hashed URL mode was the default behaviour for Datasette, since all database files were assumed to be immutable and unchanging. From 0.28 onwards the default has been to treat database files as mutable unless explicitly configured otherwise.",441, 584,HTTP caching,"If your database is immutable and guaranteed not to change, you can gain major performance improvements from Datasette by enabling HTTP caching. This can work at two different levels. First, it can tell browsers to cache the results of queries and serve future requests from the browser cache. More significantly, it allows you to run Datasette behind a caching proxy such as Varnish or use a cache provided by a hosted service such as Fastly or Cloudflare . This can provide incredible speed-ups since a query only needs to be executed by Datasette the first time it is accessed - all subsequent hits can then be served by the cache. Using a caching proxy in this way could enable a Datasette-backed visualization to serve thousands of hits a second while running Datasette itself on extremely inexpensive hosting. Datasette's integration with HTTP caches can be enabled using a combination of configuration options and query string arguments. The default_cache_ttl setting sets the default HTTP cache TTL for all Datasette pages. This is 5 seconds unless you change it - you can set it to 0 if you wish to disable HTTP caching entirely. You can also change the cache timeout on a per-request basis using the ?_ttl=10 query string parameter.
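For example, a hypothetical request that opts out of caching entirely: curl 'http://localhost:8001/mydatabase/mytable.json?_ttl=0' - while ?_ttl=3600 would instead ask caches to hold the response for an hour.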
This can be useful when you are working with the Datasette JSON API - you may decide that a specific query can be cached for a longer time, or maybe you need to set ?_ttl=0 for some requests, for example if you are running a SQL order by random() query.",441, 583,"Using ""datasette inspect""","Counting the rows in a table can be a very expensive operation on larger databases. In immutable mode Datasette performs this count only once and caches the results, but this can still cause server startup time to increase by several seconds or more. If you know that a database is never going to change you can precalculate the table row counts once and store them in a JSON file, then use that file when you later start the server. To create a JSON file containing the calculated row counts for a database, use the following: datasette inspect data.db --inspect-file=counts.json Then later you can start Datasette against the counts.json file and use it to skip the row counting step and speed up server startup: datasette -i data.db --inspect-file=counts.json You need to use the -i immutable mode against the database file here or the counts from the JSON file will be ignored. You will rarely need to use this optimization in everyday use, but several of the datasette publish commands described in Publishing data use this optimization for better performance when deploying a database file to a hosting provider.",441, 582,Immutable mode,"If you can be certain that a SQLite database file will not be changed by another process you can tell Datasette to open that file in immutable mode . Doing so will disable all locking and change detection, which can result in improved query performance. This also enables further optimizations relating to HTTP caching, described below. To open a file in immutable mode pass it to the datasette command using the -i option: datasette -i data.db When you open a file in immutable mode like this Datasette will also calculate and cache the row counts for each table in that database when it first starts up, further improving performance.",441, 581,Performance and caching,"Datasette runs on top of SQLite, and SQLite has excellent performance. For small databases almost any query should return in just a few milliseconds, and larger databases (100s of MBs or even GBs of data) should perform extremely well provided your queries make sensible use of database indexes. That said, there are a number of tricks you can use to improve Datasette's performance.",441, 580,Querying polygons using within(),"The within() SQL function can be used to check if a point is within a geometry: select name from places where within(GeomFromText('POINT(-3.1724366 51.4704448)'), places.geom); The GeomFromText() function takes a string of well-known text. Note that the order used here is longitude then latitude . To run that same within() query in a way that benefits from the spatial index, use the following: select name from places where within(GeomFromText('POINT(-3.1724366 51.4704448)'), places.geom) and rowid in ( SELECT pkid FROM idx_places_geom where xmin < -3.1724366 and xmax > -3.1724366 and ymin < 51.4704448 and ymax > 51.4704448 );",441, 579,Importing GeoJSON polygons using Shapely,"Another common form of polygon data is the GeoJSON format. This can be imported into SpatiaLite directly, or by using the Shapely Python library. Who's On First is an excellent source of openly licensed GeoJSON polygons. Let's import the geographical polygon for Wales.
First, we can use the Who's On First Spelunker tool to find the record for Wales: spelunker.whosonfirst.org/id/404227475 That page includes a link to the GeoJSON record, which can be accessed here: data.whosonfirst.org/404/227/475/404227475.geojson Here's Python code to create a SQLite database, enable SpatiaLite, create a places table and then add a record for Wales: import sqlite3 conn = sqlite3.connect(""places.db"") # Enable the SpatiaLite extension conn.enable_load_extension(True) conn.load_extension(""/usr/local/lib/mod_spatialite.dylib"") # Initialize spatial metadata, then create the places table conn.execute(""select InitSpatialMetadata(1)"") conn.execute( ""create table places (id integer primary key, name text);"" ) # Add a MULTIPOLYGON Geometry column conn.execute( ""SELECT AddGeometryColumn('places', 'geom', 4326, 'MULTIPOLYGON', 2);"" ) # Add a spatial index against the new column conn.execute(""SELECT CreateSpatialIndex('places', 'geom');"") # Now populate the table from shapely.geometry.multipolygon import MultiPolygon from shapely.geometry import shape import requests geojson = requests.get( ""https://data.whosonfirst.org/404/227/475/404227475.geojson"" ).json() # Convert to ""Well Known Text"" format wkt = shape(geojson[""geometry""]).wkt # Insert and commit the record conn.execute( ""INSERT INTO places (id, name, geom) VALUES(null, ?, GeomFromText(?, 4326))"", (""Wales"", wkt), ) conn.commit()",441, 578,Importing shapefiles into SpatiaLite,"The shapefile format is a common format for distributing geospatial data. You can use the spatialite command-line tool to create a new database table from a shapefile. Try it now with the North America shapefile available from the University of North Carolina Global River Database project. Download the file and unzip it (this will create files called narivs.dbf , narivs.prj , narivs.shp and narivs.shx in the current directory), then run the following: spatialite rivers-database.db SpatiaLite version ..: 4.3.0a Supported Extensions: ... spatialite> .loadshp narivs rivers CP1252 23032 ======== Loading shapefile at 'narivs' into SQLite table 'rivers' ... Inserted 467973 rows into 'rivers' from SHAPEFILE This will load the data from the narivs shapefile into a new database table called rivers . Exit out of spatialite (using Ctrl+D ) and run Datasette against your new database like this: datasette rivers-database.db \ --load-extension=/usr/local/lib/mod_spatialite.dylib If you browse to http://localhost:8001/rivers-database/rivers you will see the new table... but the Geometry column will contain unreadable binary data (SpatiaLite uses a custom format based on WKB ). The easiest way to turn this into semi-readable data is to use the SpatiaLite AsGeoJSON function. Try the following using the SQL query interface at http://localhost:8001/rivers-database : select *, AsGeoJSON(Geometry) from rivers limit 10; This will give you back an additional column of GeoJSON. You can copy and paste GeoJSON from this column into the debugging tool at geojson.io to visualize it on a map. To see a more interesting example, try ordering the records with the longest geometry first.
Since there are 467,000 rows in the table you will first need to increase the SQL time limit imposed by Datasette: datasette rivers-database.db \ --load-extension=/usr/local/lib/mod_spatialite.dylib \ --setting sql_time_limit_ms 10000 Now try the following query: select *, AsGeoJSON(Geometry) from rivers order by length(Geometry) desc limit 10;",441, 577,Making use of a spatial index,"SpatiaLite spatial indexes are R*Trees. They allow you to run efficient bounding box queries using a sub-select, with a similar pattern to that used for Searches using custom SQL . In the above example, the resulting index will be called idx_museums_point_geom . This takes the form of a SQLite virtual table. You can inspect its contents using the following query: select * from idx_museums_point_geom limit 10; Here's a live example: timezones-api.datasette.io/timezones/idx_timezones_Geometry pkid xmin xmax ymin ymax 1 -8.601725578308105 -2.4930307865142822 4.162120819091797 10.74019718170166 2 -3.2607860565185547 1.27329421043396 4.539252281188965 11.174856185913086 3 32.997581481933594 47.98238754272461 3.3974475860595703 14.894054412841797 4 -8.66890811920166 11.997337341308594 18.9681453704834 37.296207427978516 5 36.43336486816406 43.300174713134766 12.354820251464844 18.070993423461914 You can now construct efficient bounding box queries that will make use of the index like this: select * from museums where museums.rowid in ( SELECT pkid FROM idx_museums_point_geom -- left-hand-edge of point > left-hand-edge of bbox (minx) where xmin > :bbox_minx -- right-hand-edge of point < right-hand-edge of bbox (maxx) and xmax < :bbox_maxx -- bottom-edge of point > bottom-edge of bbox (miny) and ymin > :bbox_miny -- top-edge of point < top-edge of bbox (maxy) and ymax < :bbox_maxy ); Spatial indexes can be created against polygon columns as well as point columns, in which case they will represent the minimum bounding rectangle of that polygon. This is useful for accelerating within queries, as seen in the Timezones API example.",441, 576,Spatial indexing latitude/longitude columns,"Here's a recipe for taking a table with existing latitude and longitude columns, adding a SpatiaLite POINT geometry column to that table, populating the new column and then populating a spatial index: import sqlite3 conn = sqlite3.connect(""museums.db"") # Load the SpatiaLite extension: conn.enable_load_extension(True) conn.load_extension(""/usr/local/lib/mod_spatialite.dylib"") # Initialize spatial metadata for this database: conn.execute(""select InitSpatialMetadata(1)"") # Add a geometry column called point_geom to our museums table: conn.execute( ""SELECT AddGeometryColumn('museums', 'point_geom', 4326, 'POINT', 2);"" ) # Now update that geometry column with the lat/lon points conn.execute( """""" UPDATE museums SET point_geom = GeomFromText('POINT('||""longitude""||' '||""latitude""||')',4326); """""" ) # Now add a spatial index to that column conn.execute( 'select CreateSpatialIndex(""museums"", ""point_geom"");' ) # If you don't commit, your changes will not be persisted: conn.commit() conn.close()",441, 575,Installing SpatiaLite on Linux,"SpatiaLite is packaged for most Linux distributions.
apt install spatialite-bin libsqlite3-mod-spatialite Depending on your distribution, you should be able to run Datasette something like this: datasette --load-extension=/usr/lib/x86_64-linux-gnu/mod_spatialite.so If you are unsure of the location of the module, try running locate mod_spatialite and see what comes back.",441, 574,Installing SpatiaLite on OS X,"The easiest way to install SpatiaLite on OS X is to use Homebrew . brew update brew install spatialite-tools This will install the spatialite command-line tool and the mod_spatialite dynamic library. You can now run Datasette like so: datasette --load-extension=spatialite",441, 573,Installation,,441, 572,Warning,"The SpatiaLite extension adds a large number of additional SQL functions , some of which are not safe for untrusted users to execute: they may cause the Datasette server to crash. You should not expose a SpatiaLite-enabled Datasette instance to the public internet without taking extra measures to secure it against potentially harmful SQL queries. The following steps are recommended: Disable arbitrary SQL queries by untrusted users. See Controlling the ability to execute arbitrary SQL for ways to do this. The easiest is to start Datasette with the datasette --setting default_allow_sql off option. Define Canned queries with the SQL queries that use SpatiaLite functions that you want people to be able to execute. The Datasette SpatiaLite tutorial includes detailed instructions for running SpatiaLite safely using these techniques.",441, 571,SpatiaLite,"The SpatiaLite module for SQLite adds features for handling geographic and spatial data. For an example of what you can do with it, see the tutorial Building a location to time zone API with SpatiaLite . To use it with Datasette, you need to install the mod_spatialite dynamic library. This can then be loaded into Datasette using the --load-extension command-line option. Datasette can look for SpatiaLite in common installation locations if you run it like this: datasette --load-extension=spatialite --setting default_allow_sql off If SpatiaLite is in another location, use the full path to the extension instead: datasette --setting default_allow_sql off \ --load-extension=/usr/local/lib/mod_spatialite.dylib",441, 570,FTS versions,"There are three different versions of the SQLite FTS module: FTS3, FTS4 and FTS5. You can tell which versions are supported by your instance of Datasette by checking the /-/versions page. FTS5 is the most advanced module but may not be available in the SQLite version that is bundled with your Python installation. Most importantly, FTS5 is the only version that has the ability to order by search relevance without needing extra code. If you can't be sure that FTS5 will be available, you should use FTS4.",441, 569,Configuring FTS by hand,"We recommend using sqlite-utils , but if you want to hand-roll a SQLite full-text search table you can do so using the following SQL. To enable full-text search for a table called items that works against the name and description columns, you would run this SQL to create a new items_fts FTS virtual table: CREATE VIRTUAL TABLE ""items_fts"" USING FTS4 ( name, description, content=""items"" ); This creates a set of tables to power full-text search against items . The new items_fts table will be detected by Datasette as the fts_table for the items table. Creating the table is not enough: you also need to populate it with a copy of the data that you wish to make searchable.
You can do that using the following SQL: INSERT INTO ""items_fts"" (rowid, name, description) SELECT rowid, name, description FROM items; If your table has columns that are foreign key references to other tables you can include that data in your full-text search index using a join. Imagine the items table has a foreign key column called category_id which refers to a categories table - you could create a full-text search table like this: CREATE VIRTUAL TABLE ""items_fts"" USING FTS4 ( name, description, category_name, content=""items"" ); And then populate it like this: INSERT INTO ""items_fts"" (rowid, name, description, category_name) SELECT items.rowid, items.name, items.description, categories.name FROM items JOIN categories ON items.category_id=categories.id; You can use this technique to populate the full-text search index from any combination of tables and joins that makes sense for your project.",441, 568,Configuring FTS using csvs-to-sqlite,"If your data starts out in CSV files, you can use Datasette's companion tool csvs-to-sqlite to convert that file into a SQLite database and enable full-text search on specific columns. For a file called items.csv where you want full-text search to operate against the name and description columns you would run the following: csvs-to-sqlite items.csv items.db -f name -f description",441, 567,Configuring FTS using sqlite-utils,"sqlite-utils is a CLI utility and Python library for manipulating SQLite databases. You can use it from Python code to configure FTS search, or you can achieve the same goal using the accompanying command-line tool . Here's how to use sqlite-utils to enable full-text search for an items table across the name and description columns: sqlite-utils enable-fts mydatabase.db items name description",441,
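The same configuration can be applied from Python code. A short sketch using the sqlite_utils library against the items table described above:

import sqlite_utils

db = sqlite_utils.Database("mydatabase.db")
# Create an items_fts virtual table indexing the name and description
# columns; create_triggers=True keeps the index updated as rows change
db["items"].enable_fts(
    ["name", "description"], create_triggers=True
)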