Database Index Design

In this post we will show how to design a inventory system to manage stock of e-PIN/vouchers efficiently with proper indexing.

One e-PIN typically consists of three properties:

  • serial: a unique identification of e-PIN
  • password: a confidential string of characters/digits
  • expiry date: a period that e-PIN is eligible for redemption

To allow e-PIN from different providers to be stored in the same table, a product field is also needed.

class Item(models.Model):
    product = models.ForeignKey(...)
    serial = models.CharField(max_length=...)
    password = models.CharField(max_length=...)
    expiry = models.DateField()
    class Meta:
        unique_together = ('product', 'serial')

Here we use a joint unique key to avoid duplication of e-PIN under the same provider.

The stock operation will be as simple as creation of records. For delivery, we have to find an efficient way to pick one 'unsold' item from the stock and track it as 'sold'. We lists several possible implementations and show the trade-off.

Solution I - Random Pick

The most trivial approach is picking a record randomly and delete it from the stock. The downside is that there is no simple way to avoid the collision that multiple concurrent pickers will choose on the same record, though it is possible to detect the collision with the lock incurred in the delete operation. If collision occurs, the picker may need to try again(possibly and again) until no collision is detected any more, or reporting an error to caller even if the stock is available.

Solution II - Filtering Pick

This approach picks a record by filtering records on certain conditions like 'earliest expiry first'. Record also can be marked as 'deleted' after delivery. This approach shares the same drawbacks of Solution I. What's more, the filtering may be slow due to table scanning if no proper index is utilized.

Solution III - Incremental Pick

The approach used in AirPay Counter Server uses an additional table to track the next item to be delivered.

class Stock(models.Model):
    product = models.ForeignKey(Product, unique=True)
    sequence = models.IntegerField()
    available = models.IntegerField()

An additional field sequence running from 1,2,3,...to n is also introduced in Item model to identify the order of each item.

class Item(models.Model):
    product = models.ForeignKey(...)
    sequence = models.IntegerField()
    serial = models.CharField(max_length=...)
    password = models.CharField(max_length=...)
    expiry = models.DateField()
    class Meta:
        unique_together = (('product', 'serial'),
                           ('product', 'sequence'))

The delivery process simply increments the sequence to get the next item to be delivered:

with transaction.atomic:
    stock = Stock.objects.filter(product=...)
    selected = stock.filter(available__gt=0)\          
                    .update(sequence=F('sequence')+1,  
                            available=F('available')-1)
    if not selected:
        raise OutofStock
    item = Item.objects.get(product=..., 
                            sequence=stock[0].sequence)

Concurrent pickers will succeed one after another as long as the stock is available. Collision is avoided due to the lock in UPDATE statement inside the transaction body. The query is efficient because at most 1 item is scanned.

At the ending of this post, please consider the following questions:

  • Why filter(available__gt=0) can not be put on the first line?
  • Why F operator is needed? Is there any other alternative solution without using F operator?
  • How many database queries are involved in the above transaction?