duckdb-proxy
duckdb-proxy is a proxy for accessing DuckDB over HTTP. It exposes very simple and powerful endpoints that allow arbitrary execution of SQL statements and returns JSON results.
Because of DuckDB's flexibility, (i.e. it's ability to directly query files on the filesystem), this proxy should not exposed directly to untrusted users.
Usage
Download duckdb-proxy from the project release page.
duckdb-proxy is configured via command line arguments
# can be run without any arguments, will open/create a "db.duckdb" file
$ ./duckdb-proxy
# can be given a path to an existing DuckDB database
# (the DB is created if it doesn't exist, but the directory must already exist)
$ ./duckdb-proxy db/mydatabase.duckdb
# open the database in readonly (DB must exist in this case)
$ ./duckdb-proxy --readonly db/mydatabase.duckdb
You can quickly test the proxy:
$ curl http://localhost:8012/api/1/exec \
-d '{"sql": "select version()"}'
{
"cols": ["version()"],
"types": ["varchar"],
"rows": [
["v0.8.1"]
]
Configuration
--help
Print help message and exit
--port PORT
The port for the HTTP server to listen on. default: 8012
--address ADDRESS
The address for the HTTP server to listen on. default: 127.0.0.1
--readonly
Opens the database in readonly mode (using the DuckDB access_mode=read_only
configuration). When set, consider also setting --with_wrap
.
--with_wrap
When set, duckdb-proxy will wrap the provided SQL in a CTE: with _dproxy as ($SQL) select * from _proxy
. This will help limit the type of statements that duckdb-proxy will allow. When set, consider also setting --readonly
.
--max_limit N
When set, --with-wrap
is automatically enabled, and the wrapping SQL will include a limit N
, ensure that no more than N rows are returned.
--external_access
Allows the database to access external state (using the DuckDB enable_external_access=BOOL
configuration
--pool_size COUNT
The number of connections to the DB duckdb-proxy should keep alive. default: 50
--max_params COUNT
For a given call to /api/1/exec
the maximum number of value that can be passed into params
. default: no limit
--cors_origin ORIGIN
When set, duckdb-proxy will respond to CORS request using the specified origin (as well as any other CORS headers needed for duckdb-proxy to work). default none
--max_request_size BYTES
Will return an HTTP error if the request body is > BYTES. default: 65536
--log_level info | warn | error | fatal | none
The log level to write to use (duckdb-proxy is not overly verbose). default: info
--log_http
Enable logging HTTP requests (1 log per non-404 request)
Supported Types
duckdb-proxy is written in Zig and uses the DuckDB C API. The C API is missing support for binding and reading a few types. These types can still be used in the SQL query itself, but cannot be passed-in as parameters and/or returned as part of a result.
- JSON has to be passed in as a string and will be returned as a string,
- Lists/arrays can be returned, but cannot be passed in,
- No support for structs/maps (TODO: check if this is a limitation of duckdb-proxy or the C API)
API
exec
Executes the specified SQL, returning the result as JSON.
POST /api/1/exec
- Request Parameters
- Sample Request
- Response
- Sample Response
name | type | req | desc |
---|---|---|---|
sql | string | ✓ | The SQL to execute |
params | array | The parameters to bind to the SQL statement. Must have a value for each placeholder in the SQL statement. |
curl "http://localhost:8012/api/1/exec" -d '{
"sql": "select name,value from duckdb_settings() where name like $1",
"params": ["debug%"]
}'
}
name | type | desc |
---|---|---|
cols | string array | List of column names. |
rows | array of arrays | The result as an array of arrays. |
In the case of an insert
, update
or delete
without a returning
clause, a single row with the column Count
will be returned, indicating the number of rows inserted, updated or deleted.
{
"cols": ["name","value"],
"rows": [
["debug_checkpoint_abort","none"],
["debug_force_external","false"],
["debug_force_no_cross_product","false"],
["debug_asof_iejoin","false"],
["debug_window_mode","NULL"]
]
}
As an alternative, POST /api/1/exec/$ID
can also be used, where $ID
is any abitrary value. This can be useful when --log_http is set in order to generate more meaningful access log (since the log's path
attribute will contain the URL, including the $ID
) .
Errors and Codes
duckdb-proxy tries to provide developer-friendly error and validation messages. Every error response has an integer code
field which identifies the error. Every error response also has a err
field which is a description of the error, in English.
For example, a request to an invalid route would return a response with a 404 status code, as well as body with a code
and err
field:
$ curl http://127.0.0.1:8012/hello
{"code": 2, "err": "not found"}
Validation errors follow the same pattern, with each validation type having a distinct code. Validation errors always have a code: 11
and a validation
field that contains a list of errors. Each error contains a field
, code
, err
and optional data
.
For example, the error when a required field is missing looks like:
$ curl "http://localhost:8012/api/1/exec" -d '{}'
{
"code": 11,
"err": "validation error",
"validation": [
{"field": "sql", "code": 1, "err": "is required"}
]
}
The field
value is the full path name. For arrays, the element index is part of the field
:
$ curl "http://localhost:8012/api/1/exec" -d '{
"sql": "select $1::int",
"params": ["hi"]
}'
{
"code": 11,
"err": "validation error",
"validation": [
{"field": "params.0", "code":4, "err": "must be an int"}
]
}
Validation codes 100
indicates that there was a problem preparing or executing the SQL statement. The err
field contains the error provided by DuckDB:
$ curl "http://localhost:8012/api/1/exec" -d '{"sql": "select * from x"}'
{
"code": 11,
"err": "validation error",
"validation": [
{"field": "sql", "code": 100, "err": "Catalog Error: Table with ..."}
]
}