Technologies used: AWS, Lambda, DynamoDB, Python.
I am not very experienced with DynamoDB/NoSQL, and my use case is the following:
- I need to store messages sent to users. Each user (identified by user_id) can have multiple messages (identified by message_id).
- I need to send each user a notification about all of their messages stored in the table.
- Notifications need to be sent at a specified time based on the user's settings.
- Users can have multiple notification times, and the number is not limited: one user may want to be notified once a day (for example at noon), while another may want to be notified 4 times a day (e.g. 7:15, 11:00, 15:00 and 18:00). Full flexibility is preferred here.
There will be a Lambda function running every couple of minutes to get the messages that I need to notify the users about. The Lambda "knows what time it is" and should fetch only the messages of users who want to be notified at that moment, based on their preferences.
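To make the "knows what time it is" step concrete, here is a minimal sketch of how the Lambda could map the current time to a notification slot. The function name notification_slot, the 5-minute interval and the "HH:MM" string format are my assumptions for illustration; nothing in the design fixes them yet.

```python
from datetime import datetime


def notification_slot(now: datetime, interval_minutes: int = 5) -> str:
    """Round `now` down to the start of its interval and format it as HH:MM.

    Assumption: notification times are stored as zero-padded "HH:MM" strings
    and the Lambda is scheduled every `interval_minutes` minutes.
    """
    floored_minute = now.minute - (now.minute % interval_minutes)
    return f"{now.hour:02d}:{floored_minute:02d}"


# A run that starts at 07:17 belongs to the 07:15 slot.
print(notification_slot(datetime(2024, 1, 1, 7, 17)))  # 07:15
```

With this kind of rounding, a user preference of 7:15 would be matched by whichever run starts between 07:15 and 07:19, so a small delay in the schedule would not cause a missed notification.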
The current DynamoDB table design is the following: user_messages table - Primary Key (Partition Key: user_id, Sort Key: message_id), attributes: message_text, creation_time etc.
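For reference, the current table could be described roughly as below, in the parameter shape that boto3's low-level DynamoDB client accepts for create_table and query. The string ("S") key types, the on-demand billing mode and the helper name messages_query are assumptions made for this sketch.

```python
# Table definition; could be passed as
# boto3.client("dynamodb").create_table(**USER_MESSAGES_TABLE)
USER_MESSAGES_TABLE = {
    "TableName": "user_messages",
    "KeySchema": [
        {"AttributeName": "user_id", "KeyType": "HASH"},      # partition key
        {"AttributeName": "message_id", "KeyType": "RANGE"},  # sort key
    ],
    # Assumption: both keys are strings.
    "AttributeDefinitions": [
        {"AttributeName": "user_id", "AttributeType": "S"},
        {"AttributeName": "message_id", "AttributeType": "S"},
    ],
    # Assumption: on-demand billing, chosen only to keep the sketch short.
    "BillingMode": "PAY_PER_REQUEST",
}


def messages_query(user_id: str) -> dict:
    """Build Query parameters that fetch all messages of one user."""
    return {
        "TableName": USER_MESSAGES_TABLE["TableName"],
        "KeyConditionExpression": "user_id = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
    }
```

Non-key attributes such as message_text and creation_time do not appear in the definition because DynamoDB only requires key attributes to be declared up front.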
My struggle is how to design the DB so as to limit the number of RCUs consumed and the Lambda compute time when extracting those messages. It would be simpler if I allowed each user to have only one notification time: I'd just create a notification time attribute and a new GSI with the notification time as the partition key. But this would limit the user too much.
I am not sure how to approach this when there are multiple notification times per user. I currently have 2 possible scenarios:
1. Limit the number of notification times to N, for example 3 max per user, store the preferences in 3 attributes and create 3 GSIs. In this case the Lambda would query the table 3 times each run. This doesn't look elegant, and I am concerned about the hard limit on the number of notification times.
The table design would look like this in such a case: user_messages table - Primary Key (Partition Key: user_id, Sort Key: message_id), attributes: message_text, creation_time etc., GSI_1 (notification_time_1), GSI_2 (notification_time_2), GSI_3 (notification_time_3).
2. Create a separate table with user preferences, e.g. Partition Key: notification_time, attribute: user_id.
In this case the Lambda would have to get all user_ids for a particular notification time and then query the user_messages table for each of them, which means that if I have 1000 users to notify, I'd need to query the user_messages table 1000 times. This doesn't look good from a performance point of view and will consume a lot of RCUs.
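To compare the two scenarios, here is a rough sketch of what each Lambda run would do. For scenario 1 it builds the three GSI queries. For scenario 2 it estimates the read cost of the per-user fan-out, based on my understanding that a Query is billed on the total size of the items read, rounded up to the next 4 KB, at half price for eventually consistent reads and with a minimum charge even for small results. The function names and the 2000-byte figure are assumptions for illustration, and the estimate leaves out the lookup in the preferences table.

```python
import math

TABLE = "user_messages"


def scenario_1_queries(slot: str, max_times: int = 3) -> list:
    """Scenario 1: one Query request per GSI (GSI_1..GSI_N) for a time slot."""
    return [
        {
            "TableName": TABLE,
            "IndexName": f"GSI_{i}",
            "KeyConditionExpression": "#t = :slot",
            "ExpressionAttributeNames": {"#t": f"notification_time_{i}"},
            "ExpressionAttributeValues": {":slot": {"S": slot}},
        }
        for i in range(1, max_times + 1)
    ]


def query_rcus(total_item_bytes: int, consistent: bool = False) -> float:
    """Estimated RCUs for a single Query reading `total_item_bytes` of items."""
    units = max(1, math.ceil(total_item_bytes / 4096))
    return units if consistent else units / 2


def scenario_2_message_rcus(user_count: int, bytes_per_user: int) -> float:
    """Scenario 2: one Query on user_messages per user to notify.

    Excludes the cost of reading user_ids from the preferences table.
    """
    return user_count * query_rcus(bytes_per_user)


print(len(scenario_1_queries("07:15")))     # 3
print(scenario_2_message_rcus(1000, 2000))  # 500.0
```

So under these assumptions, notifying 1000 users who each have about 2 KB of messages costs roughly 500 RCUs per run in scenario 2, plus 1000 round trips from the Lambda, while scenario 1 always issues 3 queries per run regardless of the number of users.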
I am stuck here, as neither of the above solutions seems optimal to me.
Do you see any other approach I could take here?