Using DynamoDB with Python: Ingesting Data from the Open Price Engine API

26 August 2024

Software Development

DynamoDB is a powerful NoSQL database provided by AWS, well-suited for handling large-scale data with low latency. In this tutorial, we'll walk through how to ingest data from the Open Price Engine API into a DynamoDB table using Python and the boto3 library.

Prerequisites

To follow this tutorial, you'll need:

  • AWS Account: Ensure you have DynamoDB access.
  • Python 3.x: Installed on your system.
  • AWS CLI: Installed and configured.
  • openpricengine API key
  • boto3 Library: Install it via pip:
bash
Copy code
pip install boto3

Setting Up DynamoDB

We'll start by creating a DynamoDB table to store the product pricing data.

Creating the DynamoDB Table

python
Copy code
import boto3

# Initialize a session using Amazon DynamoDB
dynamodb = boto3.resource('dynamodb', region_name='us-west-2')

# Create the DynamoDB table
table = dynamodb.create_table(
    TableName='ProductPrices',
    KeySchema=[
        {
            'AttributeName': 'ProductName',
            'KeyType': 'HASH'  # Partition key
        },
        {
            'AttributeName': 'Date',
            'KeyType': 'RANGE'  # Sort key
        }
    ],
    AttributeDefinitions=[
        {
            'AttributeName': 'ProductName',
            'AttributeType': 'S'
        },
        {
            'AttributeName': 'Date',
            'AttributeType': 'S'
        }
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }
)

# Wait until the table exists
table.meta.client.get_waiter('table_exists').wait(TableName='ProductPrices')

print("Table status:", table.table_status)

This code creates a table named ProductPrices with ProductName as the partition key and Date as the sort key. This schema allows us to store price information for different products over time.

Ingesting Data from the API

Next, we will insert data from the Open Price Engine API into our DynamoDB table.


openpricengine swagger docs

python
Copy code
import boto3

# Data from the Open Price Engine API
data = [
    {
        "Country": "Qatar",
        "Continent": "Asia",
        "Currency": "QAR",
        "Store": "Grand Hyper Mekaines",
        "Category": "Groceries",
        "Product URL": "https://mekaines.grandhyper.com/Apple-Pink-Lady-South-Africa-1Kg",
        "Image URL": "https://mekaines.grandhyper.com/image/cache/catalog/products/IMG2722023/4565-370x370.jpg",
        "Product Name": "Apple Pink Lady South Africa 1Kg (Approx) - تفاح بينك ليدي جنوب أفريقي 1كج (تقريبا)",
        "Price over time": [
            {"Date": "2024-08-20", "Price": 9},
            {"Date": "2024-08-21", "Price": 9},
            {"Date": "2024-08-22", "Price": 9},
            {"Date": "2024-08-23", "Price": 9},
            {"Date": "2024-08-24", "Price": 9},
            {"Date": "2024-08-25", "Price": 9},
            {"Date": "2024-08-26", "Price": 9}
        ]
    },
]

# Initialize DynamoDB resource
dynamodb = boto3.resource('dynamodb', region_name='us-west-2')
table = dynamodb.Table('ProductPrices')

# Insert data into the table
for product in data:
    product_name = product["Product Name"]
    for price_entry in product["Price over time"]:
        table.put_item(
            Item={
                'ProductName': product_name,
                'Date': price_entry['Date'],
                'Price': price_entry['Price'],
                'Country': product['Country'],
                'Continent': product['Continent'],
                'Currency': product['Currency'],
                'Store': product['Store'],
                'Category': product['Category'],
                'ProductURL': product['Product URL'],
                'ImageURL': product['Image URL']
            }
        )

print("Data inserted successfully!")

This script iterates over the list of products and their associated price history, inserting each entry into the DynamoDB table.

Querying Data

You can query the DynamoDB table to retrieve pricing information for a specific product on a particular date:

python
Copy code
response = table.query(
    KeyConditionExpression=boto3.dynamodb.conditions.Key('ProductName').eq('Apple Pink Lady South Africa 1Kg (Approx) - تفاح بينك ليدي جنوب أفريقي 1كج (تقريبا)') &
                           boto3.dynamodb.conditions.Key('Date').eq('2024-08-26')
)

items = response.get('Items', [])
for item in items:
    print(item)

This code queries the ProductPrices table for the product "Apple Pink Lady South Africa 1Kg" on the date "2024-08-26" and prints the result.

Deleting Items

If you need to remove a specific price entry, you can do so with the delete_item method:

python
Copy code
table.delete_item(
    Key={
        'ProductName': 'Apple Pink Lady South Africa 1Kg (Approx) - تفاح بينك ليدي جنوب أفريقي 1كج (تقريبا)',
        'Date': '2024-08-26'
    }
)

print("Item deleted successfully!")

This script deletes the entry for "Apple Pink Lady South Africa 1Kg" on "2024-08-26".

Conclusion

In this article, we've demonstrated how to create a DynamoDB table and populate it with data from the Open Price Engine API using Python. We covered creating the table, inserting data, querying for specific entries, and deleting items. With these tools, you can effectively manage and analyze time-series data in DynamoDB, allowing for scalable and efficient data management.

What's on your mind? Put it in the comments!

Don't hesitate to contact Tony Mudau for more information.

Feel free to share this post!

Share this post :