...
By default constructing a new `postgrestutils.Session` will take the settings discussed in [setup](#setup) into account.
Hence there is no need to specify `base_uri` or `token` explicitly unless you are using more than one API or database role in your project.
Additionally `postgrestutils.Session` takes `schema: Optional[str] = None`, `parse_dt: bool = True` and `count: postgrestutils.Count = postgrestutils.Count.NONE` (some of which are explained later on).
These options are session defaults and may be overridden on a per-request basis, e.g.
```python
...
```
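The override semantics can be sketched with a toy stand-in (a hypothetical `DemoSession`, not the real `postgrestutils.Session`): kwargs passed to the session act as defaults, and kwargs passed to an individual request win.

```python
class DemoSession:
    """Toy stand-in illustrating session defaults vs. per-request
    overrides (not the postgrestutils implementation)."""

    def __init__(self, **defaults):
        self.defaults = defaults

    def request_params(self, **overrides):
        # per-request kwargs take precedence over the session defaults
        return {**self.defaults, **overrides}


s = DemoSession(parse_dt=True, count="none")
assert s.request_params()["parse_dt"] is True            # session default
assert s.request_params(parse_dt=False)["parse_dt"] is False  # per-request override
```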
...
Using a `JsonResultSet` in any boolean context will evaluate it.
- Using the `.get()` method on a session.
Getting some lazy object when explicitly requesting a single element doesn't make much sense.
Like django's `Model.objects.get()` this will return the requested element or raise `postgrestutils.ObjectDoesNotExist`/`postgrestutils.MultipleObjectsReturned` if none or multiple objects were found.
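That contract can be illustrated with a small sketch (the exception classes below stand in for the ones `postgrestutils` exports; this is not the library's code):

```python
class ObjectDoesNotExist(Exception):
    """Stand-in for postgrestutils.ObjectDoesNotExist."""


class MultipleObjectsReturned(Exception):
    """Stand-in for postgrestutils.MultipleObjectsReturned."""


def get_single(rows):
    """Sketch of the .get() contract: return exactly one element,
    otherwise raise."""
    if not rows:
        raise ObjectDoesNotExist()
    if len(rows) > 1:
        raise MultipleObjectsReturned()
    return rows[0]
```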
#### Pagination
...
- `repr()`.
Since this just returns a slice of itself the cache won't be populated.
- `len()` when not using the `count=postgrestutils.Count.NONE` kwarg.
Counting strategies other than `postgrestutils.Count.NONE` are not required to fetch all elements in order to determine their length.
[More on counting strategies.](#counting-strategies)
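To make the `len()` case concrete, here is a toy model (a stand-in, not the real `JsonResultSet`): with a counting strategy active, the length can come from the count the API reports, so no rows need to be fetched into the cache.

```python
class CountedResult:
    """Toy stand-in: when a counting strategy is active, len() can use
    the count reported by the API instead of fetching (and caching)
    any rows."""

    def __init__(self, fetch_rows, fetch_count):
        self._fetch_rows = fetch_rows
        self._fetch_count = fetch_count
        self._cache = None

    def __len__(self):
        if self._cache is not None:   # prefer the cache once populated
            return len(self._cache)
        return self._fetch_count()    # cheap count; cache stays empty


res = CountedResult(lambda: [{"id": 1}], lambda: 1)
assert len(res) == 1
assert res._cache is None  # len() did not populate the cache
```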
##### A note on caching and `len()`
...
#### Counting strategies
PostgREST currently offers multiple [counting strategies](http://postgrest.org/en/stable/api.html#exact-count).
`postgrestutils` lets you decide which to use by specifying the `count` kwarg on a session or passing it on a per-request basis to `.get()` and `.filter()`.
While this document attempts to explain counting strategies sufficiently, consulting the linked PostgREST documentation may be insightful at times.
##### Using `count=postgrestutils.Count.NONE`
If you don't need to know the count for your request this is obviously a good counting strategy to choose.
But what happens if you need the count and just call `len()` on your `JsonResultSet` anyway?
This is again similar to what django querysets do.
It will evaluate the `JsonResultSet`, fetching all elements from the API into the cache, and return the length of the cache.
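A toy model of this fallback (again a stand-in, not the real class): with `Count.NONE` there is no cheap count to fall back on, so `len()` forces a full fetch into the cache first.

```python
class NoCountResult:
    """Toy stand-in: len() without a counting strategy must evaluate the
    whole result set, caching every element, and then measure the cache."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = None

    def __len__(self):
        if self._cache is None:  # no count header to fall back on
            self._cache = list(self._fetch())
        return len(self._cache)


rows = NoCountResult(lambda: [{"id": 1}, {"id": 2}])
assert len(rows) == 2            # triggers the full fetch
assert rows._cache is not None   # everything is cached now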
##### Using `count=postgrestutils.Count.EXACT`
You've learned that `count=postgrestutils.Count.NONE` will count your elements just fine, so why would you ever want to use this option?
The reason is quite simple: fetching all elements of a large table can be expensive, and unnecessarily so if you don't even need them.
That's often the case when using pagination.
You want to show a subset of all elements but also display how many pages with more elements there are.
...
`postgrestutils` will explicitly request the count for your request, which will be cheaper for large tables.
Be careful with this for very large tables however, as it can take a very long time as explained in the [PostgREST documentation](http://postgrest.org/en/stable/admin.html#count-header-dos).
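As a sketch of why an exact count matters here, this is the paginator arithmetic (illustrative helper, not part of the `postgrestutils` API): with the total count known, the page count and the slice bounds for any page follow without fetching every row.

```python
import math


def page_info(exact_count: int, page_size: int, page: int) -> dict:
    """Derive page count and slice bounds (1-based page number) from a
    total count, e.g. one obtained via count=Count.EXACT."""
    pages = math.ceil(exact_count / page_size)
    start = (page - 1) * page_size
    stop = min(start + page_size, exact_count)
    return {"pages": pages, "start": start, "stop": stop}


assert page_info(95, 10, 1) == {"pages": 10, "start": 0, "stop": 10}
assert page_info(95, 10, 10) == {"pages": 10, "start": 90, "stop": 95}
```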
##### Using `count=postgrestutils.Count.PLANNED`
Now what?
Your table is very large, `postgrestutils.Count.EXACT` takes too long and `postgrestutils.Count.NONE` is out of the question entirely.
`postgrestutils.Count.PLANNED` to the rescue.
Using this counting strategy you explicitly instruct the client to leverage PostgreSQL statistics collected from `ANALYZE`ing tables.
This will yield fairly accurate results depending on how often new statistics are collected.
##### Using `count=postgrestutils.Count.ESTIMATED`
So `postgrestutils.Count.NONE` and `postgrestutils.Count.EXACT` are feasible for small tables.
For very large tables those either take too long or require too much memory and `postgrestutils.Count.PLANNED` is the only viable alternative.
However `postgrestutils.Count.PLANNED` can potentially lead to deviations even for small tables, where they are quite notable.
If only we could have the best of both worlds...
Enter `postgrestutils.Count.ESTIMATED`.
The idea is quite simple: `postgrestutils.Count.ESTIMATED` uses the `postgrestutils.Count.EXACT` strategy up to a certain threshold and then falls back to the `postgrestutils.Count.PLANNED` strategy.
That threshold is defined by the `max-rows` setting in your PostgREST configuration, which also limits the number of rows fetched per request.
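The threshold behaviour described above can be sketched as follows (an illustrative model of the idea, not the library's or PostgREST's implementation):

```python
def estimated_count(exact_count: int, planner_estimate: int, max_rows: int) -> int:
    """Illustrative model of the Count.ESTIMATED idea: trust the exact
    count as long as it stays at or below the max-rows threshold,
    otherwise fall back to the cheaper but approximate planner estimate."""
    if exact_count <= max_rows:
        return exact_count
    return planner_estimate


# small result set: the exact count is used
assert estimated_count(500, 480, max_rows=1000) == 500
# very large result set: the planner estimate is used instead
assert estimated_count(50000, 48000, max_rows=1000) == 48000
```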