DynamoDB is a powerful NoSQL database provided by AWS, well-suited for handling large-scale data with low latency. In this tutorial, we'll walk through how to ingest data from the Open Price Engine API into a DynamoDB table using Python and the boto3
library.
Prerequisites
To follow this tutorial, you'll need:
- AWS Account: Ensure you have DynamoDB access.
- Python 3.x: Installed on your system.
- AWS CLI: Installed and configured.
- openpricengine API key
boto3
Library: Install it via pip:
bash
Copy code
pip install boto3
Setting Up DynamoDB
We'll start by creating a DynamoDB table to store the product pricing data.
Creating the DynamoDB Table
python
Copy code
import boto3
# Initialize a session using Amazon DynamoDB
dynamodb = boto3.resource('dynamodb', region_name='us-west-2')
# Create the DynamoDB table
table = dynamodb.create_table(
TableName='ProductPrices',
KeySchema=[
{
'AttributeName': 'ProductName',
'KeyType': 'HASH' # Partition key
},
{
'AttributeName': 'Date',
'KeyType': 'RANGE' # Sort key
}
],
AttributeDefinitions=[
{
'AttributeName': 'ProductName',
'AttributeType': 'S'
},
{
'AttributeName': 'Date',
'AttributeType': 'S'
}
],
ProvisionedThroughput={
'ReadCapacityUnits': 5,
'WriteCapacityUnits': 5
}
)
# Wait until the table exists
table.meta.client.get_waiter('table_exists').wait(TableName='ProductPrices')
print("Table status:", table.table_status)
This code creates a table named ProductPrices
with ProductName
as the partition key and Date
as the sort key. This schema allows us to store price information for different products over time.
Ingesting Data from the API
Next, we will insert data from the Open Price Engine API into our DynamoDB table.
python
Copy code
import boto3
# Data from the Open Price Engine API
data = [
{
"Country": "Qatar",
"Continent": "Asia",
"Currency": "QAR",
"Store": "Grand Hyper Mekaines",
"Category": "Groceries",
"Product URL": "https://mekaines.grandhyper.com/Apple-Pink-Lady-South-Africa-1Kg",
"Image URL": "https://mekaines.grandhyper.com/image/cache/catalog/products/IMG2722023/4565-370x370.jpg",
"Product Name": "Apple Pink Lady South Africa 1Kg (Approx) - تفاح بينك ليدي جنوب أفريقي 1كج (تقريبا)",
"Price over time": [
{"Date": "2024-08-20", "Price": 9},
{"Date": "2024-08-21", "Price": 9},
{"Date": "2024-08-22", "Price": 9},
{"Date": "2024-08-23", "Price": 9},
{"Date": "2024-08-24", "Price": 9},
{"Date": "2024-08-25", "Price": 9},
{"Date": "2024-08-26", "Price": 9}
]
},
]
# Initialize DynamoDB resource
dynamodb = boto3.resource('dynamodb', region_name='us-west-2')
table = dynamodb.Table('ProductPrices')
# Insert data into the table
for product in data:
product_name = product["Product Name"]
for price_entry in product["Price over time"]:
table.put_item(
Item={
'ProductName': product_name,
'Date': price_entry['Date'],
'Price': price_entry['Price'],
'Country': product['Country'],
'Continent': product['Continent'],
'Currency': product['Currency'],
'Store': product['Store'],
'Category': product['Category'],
'ProductURL': product['Product URL'],
'ImageURL': product['Image URL']
}
)
print("Data inserted successfully!")
This script iterates over the list of products and their associated price history, inserting each entry into the DynamoDB table.
Querying Data
You can query the DynamoDB table to retrieve pricing information for a specific product on a particular date:
python
Copy code
response = table.query(
KeyConditionExpression=boto3.dynamodb.conditions.Key('ProductName').eq('Apple Pink Lady South Africa 1Kg (Approx) - تفاح بينك ليدي جنوب أفريقي 1كج (تقريبا)') &
boto3.dynamodb.conditions.Key('Date').eq('2024-08-26')
)
items = response.get('Items', [])
for item in items:
print(item)
This code queries the ProductPrices
table for the product "Apple Pink Lady South Africa 1Kg" on the date "2024-08-26" and prints the result.
Deleting Items
If you need to remove a specific price entry, you can do so with the delete_item
method:
python
Copy code
table.delete_item(
Key={
'ProductName': 'Apple Pink Lady South Africa 1Kg (Approx) - تفاح بينك ليدي جنوب أفريقي 1كج (تقريبا)',
'Date': '2024-08-26'
}
)
print("Item deleted successfully!")
This script deletes the entry for "Apple Pink Lady South Africa 1Kg" on "2024-08-26".
Conclusion
In this article, we've demonstrated how to create a DynamoDB table and populate it with data from the Open Price Engine API using Python. We covered creating the table, inserting data, querying for specific entries, and deleting items. With these tools, you can effectively manage and analyze time-series data in DynamoDB, allowing for scalable and efficient data management.