    postgrestutils

    A very basic PostgREST client and utils

    Usage

    Setup

    Django

    • add "postgrestutils" to your INSTALLED_APPS setting
    • add POSTGREST_UTILS_BASE_URI (should default to the most frequently used PostgREST instance in the future) and POSTGREST_UTILS_JWT to your project settings
    from postgrestutils.client import pgrest_client
    
    params = {
        "select": "id,forename",
        "forename": "eq.Jebediah"
    }
    
    # this will send a request to 'POSTGREST_UTILS_BASE_URI/kerbals?select=id,forename&forename=eq.Jebediah'
    res = pgrest_client.filter("kerbals", params=params)

    Other projects

    from postgrestutils.client import pgrest_client
    pgrest_client.configure('your-JWT', base_uri='http://127.0.0.1:3000')
    
    params = {
        "select": "id,forename"
    }
    res = pgrest_client.filter("kerbals", params=params)

    Making requests

    postgrestutils tries to be as intuitive and pythonic as possible while also being efficient. To do so it combines several concepts such as lazy evaluation, caching and counting strategies.

    Understanding these concepts allows you to write the most efficient code for your specific use case. If you're already familiar with how django querysets work this should feel fairly natural to you. There are however a few differences that will be explained in detail below.

    Lazy evaluation

    Akin to django querysets, postgrestutils has a LazyPostgrestJsonResult that is returned from calls to pgrest_client.filter() without making any API calls yet. If you're familiar with django's rules for evaluation this list won't surprise you. Since there are a few subtle differences, however, here is what will cause evaluation of a LazyPostgrestJsonResult:

    • Iteration. A LazyPostgrestJsonResult is iterable and will fetch all elements from the API the first time you iterate over it.
    • Slicing. This will fetch the elements in the specified range.
    • repr(). As a convenience implementation for interactive interpreter sessions this will fetch the first 20 elements.
    • len(). Unsurprisingly this returns the number of elements for the request. Depending on the counting strategy this has different implications such as the cache being populated.
    • list(). This can be useful to force evaluation of a LazyPostgrestJsonResult.
    • bool(). Using a LazyPostgrestJsonResult in any boolean context will evaluate it.
    • Using the .get() method on pgrest_client. Getting some lazy object when explicitly requesting a single element doesn't make much sense. Like django's Model.objects.get() this will return the requested element or raise an ObjectDoesNotExist/MultipleObjectsReturned if none or multiple objects were found. See the sketch after this list.
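
    To make these triggers concrete, here is a short sketch re-using the kerbals table from the setup example (assuming .get() shares filter()'s signature):

    >>> kerbals = pgrest_client.filter('kerbals')  # just a LazyPostgrestJsonResult, no API call yet
    >>> list(kerbals)  # forces evaluation, fetching all elements
    >>> bool(kerbals)  # re-uses the already fetched elements
    >>> pgrest_client.get('kerbals', params={'forename': 'eq.Jebediah'})  # not lazy: returns the element or raises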

    Pagination

    Remember the part about postgrestutils trying to be intuitive and pythonic? Do you know about Python's slicing notation? Great, you already know how pagination works. Just to be sure, here is a snippet of pagination in action:

    >>> business_roles = pgrest_client.filter('business_role')
    >>> business_roles[:3]  # fetches the first 3 business roles
    >>> business_roles[3:6]  # fetches the next 3 business roles

    Caching

    Also like django querysets there is a cache to minimize API calls. Here's a short snippet explaining the gist of it:

    # Bad: Fetches the same data from the API twice
    >>> print([role['id'] for role in pgrest_client.filter('business_role')])
    >>> print([role['name'] for role in pgrest_client.filter('business_role')])
    
    # Good: Uses the cache resulting in only a single API request
    >>> business_roles = pgrest_client.filter('business_role')
    >>> print([role['id'] for role in business_roles])  # fetches all elements into the cache
    >>> print([role['name'] for role in business_roles])  # re-uses the cached elements
    When results are not cached

    There are a few cases where a LazyPostgrestJsonResult will not cache results:

    • Indexing and slicing. If the cache is not yet populated, indexing and slicing - even with the same indexes/ranges - will result in an API call each time.
    # without populated cache
    >>> business_roles = pgrest_client.filter('business_role')
    >>> business_roles[5]  # fetches the 6th element from the API
    >>> business_roles[5]  # fetches the 6th element from the API again
    
    # with populated cache
    >>> business_roles = pgrest_client.filter('business_role')
    >>> list(business_roles)  # fetches all elements from the API
    >>> business_roles[5]  # re-uses the cached elements
    >>> business_roles[5]  # re-uses the cached elements
    • repr(). Since this just returns a slice of itself, the cache won't be populated.
    • len() when not using the count=Count.NONE kwarg. Counting strategies other than Count.NONE are not required to fetch all elements in order to determine their length. More on counting strategies below.
    A note on caching and len()

    Since calling len() can be considerably expensive, its result is also cached. Any subsequent calls will re-use the cached count instead of making further API calls.
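
    For instance, re-using the business_role table from above:

    >>> business_roles = pgrest_client.filter('business_role')
    >>> len(business_roles)  # determines the count, possibly making an API call
    >>> len(business_roles)  # re-uses the cached count, no API call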

    Invalidating the cache

    If you have a LazyPostgrestJsonResult around that you want to re-use but need up-to-date data for, simply call the .refresh_from_pgrest() method on it. That will lazily refresh data from PostgREST by invalidating the cache. Your object will now behave as if you had just created it.
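
    For example:

    >>> business_roles = pgrest_client.filter('business_role')
    >>> list(business_roles)  # fetches all elements into the cache
    >>> business_roles.refresh_from_pgrest()  # invalidates the cache, still no API call
    >>> list(business_roles)  # fetches fresh elements from the API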

    Counting strategies

    PostgREST currently offers two counting strategies, that is counting and not counting. postgrestutils lets you decide which one to use by specifying the count kwarg.

    Using count=Count.NONE

    If you don't need to know the count for your request this is obviously a good counting strategy to choose. But what happens if you need the count and just call len() on your LazyPostgrestJsonResult anyway? This is again similar to what django querysets do: it will evaluate the LazyPostgrestJsonResult, fetching all elements from the API into the cache, and return the length of the cache.
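
    A short sketch (the import path for Count is an assumption, adjust it to wherever your version exposes the enum):

    >>> from postgrestutils import Count  # import path is an assumption
    >>> business_roles = pgrest_client.filter('business_role', count=Count.NONE)
    >>> len(business_roles)  # fetches all elements into the cache and counts them locally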

    Using count=Count.EXACT

    You've learned that count=Count.NONE will count your elements just fine, so why would you ever want to use this option? The reason is quite simple: fetching all elements of a large table can be expensive - and unnecessarily so if you don't even need them. That's often the case when using pagination: you want to show a subset of all elements but also display how many pages of further elements there are. To do so you need the count of all elements and the first few elements depending on your page size. What you don't need, however, is all elements - so why fetch them?

    This counting strategy allows you to get the count without fetching all elements. So what happens when calling len() on your LazyPostgrestJsonResult this time? postgrestutils will explicitly request the count for your request which will be cheaper for large tables.

    Be careful with this for very large tables, however, as obtaining an exact count can take a very long time, as explained in the PostgREST documentation. As also mentioned there, future versions will support estimating the count.
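
    Here is a pagination sketch using this strategy (again, the Count import path is an assumption):

    >>> from postgrestutils import Count  # import path is an assumption
    >>> business_roles = pgrest_client.filter('business_role', count=Count.EXACT)
    >>> business_roles[:10]  # fetches only the first 10 elements
    >>> len(business_roles)  # requests the exact count from PostgREST instead of fetching all elements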

    Filtering

    For the full filtering syntax refer to the PostgREST documentation: http://postgrest.org/en/stable/api.html
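
    As a quick sketch, any PostgREST operator can be passed via params; the like and gte operators below are standard PostgREST syntax:

    params = {
        "select": "id,forename",
        "forename": "like.*bed*",  # PostgREST 'like' operator, '*' is the wildcard
        "id": "gte.42",  # PostgREST 'gte' (greater than or equal) operator
    }
    res = pgrest_client.filter("kerbals", params=params)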

    Django helpers

    custom user_account_fetched signal

    postgrestutils provides a custom signal called user_account_fetched that is sent with the current request and the account of the current user on login. To use this feature (and you really should 99.9% of the time) configure your settings accordingly by specifying the columns you need from the account endpoint:

    # settings.py
    POSTGREST_UTILS_AUTOFETCH = 'person_id,email'  # comma-separated string

    When connecting to the signal you will have to provide a callback function. That function is a great place to fetch additional information your app frequently needs and store it in the session. You can connect to it like so:

    # apps.py
    from django.apps import AppConfig
    from postgrestutils.signals import user_account_fetched
    from your_app.utils import your_callback_func
    
    
    class YourAppConfig(AppConfig):
        name = 'your_app'  # an explicit AppConfig needs your app's dotted path

        def ready(self, *args, **kwargs):
            super().ready(*args, **kwargs)
            user_account_fetched.connect(your_callback_func)

    Your callback function could look something like this:

    # utils.py
    from postgrestutils.client import pgrest_client


    def your_callback_func(sender, **kwargs):
        request = kwargs['request']
        account = kwargs['account']
        # fetch some additional data your project frequently needs using the
        # pgrest_client ('person' and 'membership' are hypothetical endpoints)
        person = pgrest_client.get('person', params={'id': 'eq.{}'.format(account['person_id'])})
        memberships = list(pgrest_client.filter('membership', params={'person_id': 'eq.{}'.format(account['person_id'])}))

        # 'caching' that data in the session
        request.session['account'] = account
        request.session['person'] = person
        request.session['memberships'] = memberships

    For more information on signals refer to the django docs. They are great. Really.