Django

Fix Django “Illegal mix of collations”

I have a JSON file that contains non-ASCII text such as kanji, Tagalog, etc.

{
    "Peru": ["S.A. - Sociedad Anónima", "S.A.A. - Sociedad Anónima Abierta"],

    "Philippines": ["Co. - Company", "Coop. - Cooperative", "Corp. - Corporation", "Ent. - Enterprise", "Inc. - Incorporated", "LLC - Limited Liability Company", "Ltd. - Limited", "Ltd. Co. - Limited Company", "Cía - Compañía", "SA - Sociedad Anónima"],

    "Poland": ["jednoosobowa działalność gospodarcza", "Przedsiębiorstwo Państwowe", "S.A. - spółka akcyjna", "s.c. - spółka cywilna", "S.K.A. - spółka komandytowo-akcyjna", "sp.j. - spółka jawna", "sp.k. - spółka komandytowa", "sp.p. - spółka partnerska", "Sp. z o.o. - spółka z ograniczoną odpowiedzialnością", "Spółdzielnia"],

    "Portugal": ["CRL - Cooperativa de Responsabilidade Limitada", "S.A. - Sociedade Anónima", "S.A. - Sociedade Aberta", "S.F. - Sociedade Fechada", "Lda. - Limitada", "Unipessoal Lda.", "SGPS - Sociedade Gestora de Participações Sociais"],

    "Romania": ["SNC - Societatea în nume colectiv", "SCS - Societatea în comandită simplă", "SCA - Societatea în comandită pe acțiuni", "SA - Societatea pe acțiuni", "SRL - Societatea cu răspundere limitată", "SRL cu proprietar unic - Societatea cu răspundere limitată cu proprietar unic", "S.A. - Societate pe Acţiuni", "S.C.A. - societate în comandită pe acţiuni", "S.C.S. - societate în comandită simplă", "S.N.C. - societate în nume colectiv", "S.R.L. - societate cu răspundere limitată", "PFA - persoana fizica autorizata", "O.N.G. - Organizație Non-Guvernamentală"],

    "Rusia": ["Nekommercheskaya organizatsiya/некоммерческая организация", "GP/ГП, GUP/ГУП - Gosudarstvennoye unitarnoye predpriyatie/Государственное унитарное предприятие", "IP/ИП - Individualny predprinimatel/Индивидуальный предприниматель", "OOO - Obshchestvo s ogranichennoy otvetstvennostyu/Общество с ограниченной ответственностью", "ПAO - Publichnoye aktsionernoye obshchestvo/Публичное акционерное общество", "kooperativ/кооператив", "AO - Aktsionernoye obshchestvo/Акционерное общество", "Prostoye Tovarishestvo - general partnership", "Kommanditnoe Tovarishestvo - limited partnership", "Hozyaystvennoe Partnerstvo - business partnership"],

    "Saudi Arabia": ["شركة ذات مسئولية محدودة - Private Limited Company", "شركة مساهمة - Joint-Stock company", "شركة تضامن - General Partnership Company", "شركة التوصية البسيطة - Limited Partnership", "شركة أجنبية - Foreign Company", "مؤسسة فردية - Individual Establishment"]
}

I'm writing a script to read this data and insert it into my database.
I'm using Django with MariaDB as my database, and I hit this error while inserting:

django.db.utils.OperationalError: (1267, "Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='")

After searching for a few minutes, I found the cause: Python sends Unicode strings, but my table still uses the latin1 character set (collation latin1_swedish_ci), as the error message shows.

To fix this issue, I just needed to convert my table's character set and collation from the database shell:

ALTER TABLE my_table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
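
If several tables need the same conversion, the statement can be generated instead of typed by hand. This is only a minimal sketch: the helper name convert_table_sql is hypothetical, and the defaults simply mirror the statement above. In MariaDB, utf8 stores at most 3 bytes per character, so utf8mb4 may be the better choice if the data can contain 4-byte characters such as emoji.

```python
def convert_table_sql(table, charset="utf8", collation="utf8_general_ci"):
    """Build the ALTER TABLE statement that converts one table.

    Hypothetical helper: the defaults mirror the statement shown above.
    """
    # Backtick-quote the identifier and double any embedded backtick.
    quoted = "`" + table.replace("`", "``") + "`"
    return (
        f"ALTER TABLE {quoted} "
        f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
    )


# The table names here are placeholders, not names from my project.
for name in ["my_table_name", "my_other_table"]:
    print(convert_table_sql(name))
```

Each generated statement can be pasted into the database shell, or run through a database cursor.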

Reference: https://stackoverflow.com/a/7083511/1936697


Handle Base64 Encoded Binary File in DRF

I wrote a simple DRF field class that handles base64 strings for attachments, both images and other files. I'm using DRF 3 in this case.
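
The field class itself is not reproduced in this post, so the following is only a sketch of the decoding step such a field might perform, not the actual Base64CharField code. The function name decode_base64_upload and the optional data-URI prefix handling are assumptions.

```python
import base64
import binascii


def decode_base64_upload(value):
    """Decode a base64 string, with or without a data-URI prefix.

    Returns (mime_type_or_None, raw_bytes); raises ValueError on bad input.
    Hypothetical helper, not the original field's code.
    """
    mime_type = None
    # Accept values like "data:image/png;base64,iVBORw0..." as well as bare base64.
    if value.startswith("data:") and ";base64," in value:
        header, value = value.split(";base64,", 1)
        mime_type = header[len("data:"):] or None
    try:
        raw = base64.b64decode(value, validate=True)
    except binascii.Error as exc:
        raise ValueError("Invalid base64 payload") from exc
    return mime_type, raw


print(decode_base64_upload("data:image/png;base64,aGVsbG8="))
```

In a DRF field, logic like this would typically live in to_internal_value(), with the decoded bytes wrapped in a Django ContentFile before being returned.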

Here is how to use it:

class MyModelSerializer(serializers.ModelSerializer):
    company_logo = Base64CharField(required=False)

    class Meta:
        model = MyModel
        fields = '__all__'

Feel free to use and improve my code. 😀

Add HTTP_REFERER in Django Test Client

While testing the views in my Django project, I got an error caused by a “None” value: my view redirects to the “HTTP_REFERER” that it reads from the HTTP headers.

Apparently, requests made by the Django test client do not include this header by default, so request.META.get('HTTP_REFERER') returns None.
Here is my view:

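from django.shortcuts import redirect
from django.views.generic import TemplateView


class DeleteCategoryView(TemplateView):

    def post(self, request, *args, **kwargs):
        ...
        # HTTP_REFERER is None when the client sends no Referer header.
        return redirect(request.META.get('HTTP_REFERER'))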

And here is how I add “HTTP_REFERER” in my test client, by passing it as an extra keyword argument:

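def test_delete_view(self):
    url = reverse('category:url-adm-delete')
    redirect_url = reverse('category:url-adm-category')
    # `instance` is an object created earlier in the test (setup not shown).
    response = self.client.post(path=url, data={'id': instance.pk}, HTTP_REFERER=redirect_url)
    # The view should redirect back to the referer we passed in.
    self.assertRedirects(response, redirect_url)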

Reference: http://stackoverflow.com/a/11819426/1936697
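
The view could also protect itself against a missing header by falling back to a default URL. This is a minimal sketch, not what my view currently does; the helper name referer_or_default and its default parameter are hypothetical.

```python
def referer_or_default(meta, default="/"):
    """Return the Referer from a request.META-like mapping, or a fallback URL.

    Hypothetical helper: `meta` can be any dict-like object such as request.META.
    """
    referer = meta.get("HTTP_REFERER")
    # An absent or empty header falls back to the default URL.
    return referer if referer else default


print(referer_or_default({}))
print(referer_or_default({"HTTP_REFERER": "/admin/category/"}))
```

In the view, the result would be passed to redirect() in place of request.META.get('HTTP_REFERER').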

Queryset Optimization Case Study 1

Today I revisited some legacy work that I did a long time ago. It is an admin panel that displays a list of products, including the total income we have earned from each item.

Below is my model schema:

from django.db import models
from django.db.models import Sum


class Item(models.Model):
    ...
    brand = models.ForeignKey(Brand)
    owner = models.ForeignKey(User)
    base_price = models.ForeignKey(BasePricing)
    categories = models.ManyToManyField(Category)

    @property
    def total_rental_income(self):
        """Total income for this item; runs one aggregate query per access."""
        # aggregate() returns a dict such as {'total': ...}, so unpack the value.
        return self.booking_set.aggregate(total=Sum('order__fee'))['total']

class Booking(models.Model):
    ...
    item = models.ForeignKey(Item)

class Order(models.Model):
    ...
    booking = models.ForeignKey(Booking)
    fee = models.DecimalField(max_digits=12, decimal_places=2, default=0.00)

Currently I just select the data with a simple queryset and display 50 items per page.
For your information, I'm profiling my queryset with django-querycount and django-debug-toolbar.

items = Item.objects.all()

Here is my query count:

|------|-----------|----------|----------|----------|------------|
| Type | Database  |   Reads  |  Writes  |  Totals  | Duplicates |
|------|-----------|----------|----------|----------|------------|
| RESP |  default  |   314    |    0     |   314    |     43     |
|------|-----------|----------|----------|----------|------------|
Total queries: 314 in 40.3557s

Django Debug Toolbar also captured the details of the duplicate queries.
[Screenshot: duplicate queries captured by Django Debug Toolbar]

Now I realized I had to optimize this queryset. My first attempt was to use “select_related” for all foreign keys and “prefetch_related” for my many-to-many field, so that the related data is selected eagerly. The queryset then loads the related objects up front instead of lazily, one query per access (the default Django ORM behavior).

items = Item.objects.all().select_related(
    'brand', 'owner', 'base_price'
).prefetch_related(
    'categories'
)

Here is the query count result after my first optimization.

|------|-----------|----------|----------|----------|------------|
| Type | Database  |   Reads  |  Writes  |  Totals  | Duplicates |
|------|-----------|----------|----------|----------|------------|
| RESP |  default  |    74    |    0     |    74    |     9      |
|------|-----------|----------|----------|----------|------------|
Total queries: 74 in 18.0481s

According to the query count, the result is much better than before: 314 reads were reduced to 74, and 43 duplicates were reduced to 9.
I did not attach my Django Debug Toolbar capture because the previously captured queries had already disappeared. But one duplicate coming from my model schema above still remains: a query repeated 50 times in my template.
[Screenshot: the query duplicated 50 times in Django Debug Toolbar]

I only now realized that my template calls the custom property on the “Item” model. This property runs its aggregate query every time it is evaluated, once per item on the page, so it is not affected by the optimization above.

Now, here is my second optimization, which fixes the aggregation issue. I removed the property from my “Item” model and used “annotate” in my queryset instead.

items = Item.objects.all().select_related(
    'brand', 'owner', 'base_price'
).prefetch_related(
    'categories'
).annotate(
    rental_income=Sum('booking__order__fee')
)

And I render the annotated income inside my template loop:
{{ item.rental_income|default:0 }}

And here is my query count result.

|------|-----------|----------|----------|----------|------------|
| Type | Database  |   Reads  |  Writes  |  Totals  | Duplicates |
|------|-----------|----------|----------|----------|------------|
| RESP |  default  |    18    |    0     |    18    |     9      |
|------|-----------|----------|----------|----------|------------|
Total queries: 18 in 11.4635s

Now my queryset is way better than before: 74 queries were reduced to 18.
I still have 9 duplicate queries and I'm still working on them. Maybe I'll update this post if I can optimize the whole queryset. 😀
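
To see why the annotate() version needs fewer queries, here is a small standalone sketch using the standard library's sqlite3 module instead of Django. The tables are simplified, hypothetical stand-ins for Item, Booking and Order, and the grouped query is only roughly what annotate(Sum(...)) produces: outer joins plus GROUP BY.

```python
import sqlite3

# Simplified, hypothetical tables standing in for Item, Booking and Order.
SCHEMA = """
CREATE TABLE item (id INTEGER PRIMARY KEY);
CREATE TABLE booking (id INTEGER PRIMARY KEY, item_id INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, booking_id INTEGER, fee INTEGER);
INSERT INTO item (id) VALUES (1), (2), (3);
INSERT INTO booking (id, item_id) VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO orders (id, booking_id, fee) VALUES (1, 1, 10), (2, 2, 15), (3, 3, 7);
"""


def make_db():
    """Create an in-memory database with the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def income_per_access(conn):
    """Property style: one query for the items, then one aggregate per item."""
    queries = 1
    totals = {}
    for (item_id,) in conn.execute("SELECT id FROM item ORDER BY id").fetchall():
        row = conn.execute(
            "SELECT SUM(o.fee) FROM booking b "
            "JOIN orders o ON o.booking_id = b.id WHERE b.item_id = ?",
            (item_id,),
        ).fetchone()
        totals[item_id] = row[0]
        queries += 1
    return totals, queries


def income_annotated(conn):
    """annotate() style: a single grouped query with outer joins."""
    rows = conn.execute(
        "SELECT i.id, SUM(o.fee) FROM item i "
        "LEFT OUTER JOIN booking b ON b.item_id = i.id "
        "LEFT OUTER JOIN orders o ON o.booking_id = b.id "
        "GROUP BY i.id"
    ).fetchall()
    return dict(rows), 1


conn = make_db()
print(income_per_access(conn))
print(income_annotated(conn))
```

Both functions return the same totals, but the first one issues one extra query per item. An item without any orders gets None as its total, which is why the template applies the default filter.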

References:
https://docs.djangoproject.com/en/dev/ref/models/querysets/#select-related
https://docs.djangoproject.com/en/dev/ref/models/querysets/#prefetch-related
https://docs.djangoproject.com/en/dev/topics/db/aggregation/#aggregation

Django: Invalidate Cached Property

Sometimes we create a property on our model as a utility helper. Django provides the “@cached_property” decorator for caching a heavy computation inside such a property.

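from django.db import models
from django.utils.functional import cached_property


class Member(models.Model):
    # field definitions.

    @cached_property
    def score(self):
        # The result is computed once per instance and then reused.
        return # heavy computation/query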

I ran into a problem while testing my model: the property still returned the cached value instead of the latest data. Here is how to invalidate a cached property.

del member.score          # Invalidate cache.
delattr(member, 'score')  # Alternative to invalidate cache.
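
The behavior can be tried outside Django as well. This sketch uses functools.cached_property from the standard library (Python 3.8+) as a stand-in for Django's decorator; both store the computed value in the instance's __dict__. The Member class and the invalidate helper below are illustrative only.

```python
from functools import cached_property  # stand-in for django.utils.functional.cached_property


class Member:
    """Plain class standing in for the model; counts how often score is computed."""

    def __init__(self):
        self.computations = 0

    @cached_property
    def score(self):
        self.computations += 1
        return self.computations * 10


def invalidate(obj, name):
    """Drop a cached value if present; unlike `del`, this never raises."""
    obj.__dict__.pop(name, None)


member = Member()
member.score                  # computed and cached
member.score                  # served from the cache
del member.score              # invalidate; the next access recomputes
member.score
invalidate(member, "score")   # safe even when nothing is cached
invalidate(member, "score")
print(member.computations)
```

Note that `del member.score` raises AttributeError when the property has not been accessed yet, which is the case the hypothetical invalidate helper avoids.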

Reference: https://docs.djangoproject.com/en/dev/ref/utils/#django.utils.functional.cached_property