Product: Personalised Learning Recommendation System
Author: Itisha Dubey
Purpose: Map learner and admin journeys to underlying technical components, data flows, and system interactions.
| Layer | Component | Technology Choice | Reason |
|---|---|---|---|
| Event Streaming | Learner behaviour ingestion | Apache Kafka | Handles 50M+ events/day, real-time + batch |
| Data Warehouse | Unified storage | Snowflake / BigQuery | Scalable, supports feature engineering pipelines |
| Feature Engineering | Signal transformation | Apache Spark | Distributed processing of skill vectors, engagement scores |
| Vector Database | RAG content embeddings | Pinecone / Weaviate | Low-latency semantic retrieval for Q&A |
| LLM Backend | Agent reasoning | Claude API (swappable) | Powers all three agents, model-agnostic via gateway |
| Recommendation Models | Collaborative + Content-based | Python / PyTorch + MLflow | Model versioning, retraining pipeline management |
| API Layer | Service communication | REST + GraphQL | REST for standard calls, GraphQL for flexible frontend queries |
| Notification Service | Nudges and re-engagement | Firebase (push) + SendGrid (email) | Reliable delivery, open/click tracking |
| Auth & Identity | Unified learner profile | OAuth 2.0 + JWT | Single identity across web, mobile, API |
| Frontend | Learner, Instructor, Admin surfaces | React (Web) + React Native (Mobile) | Shared component library across surfaces |
| Infrastructure | Deployment | AWS / GCP (containerised via Kubernetes) | Independent scaling per service |
| Monitoring | Drift detection + logging | Datadog + custom alerting | Real-time recommendation drift alerts |
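The "model-agnostic via gateway" note in the LLM Backend row can be illustrated with a small interface sketch. This is only one possible shape, not the system's actual gateway: `LLMBackend`, `EchoBackend`, and `LLMGateway` are hypothetical names, and the stand-in backend makes no real vendor API calls.

```python
from typing import Protocol


class LLMBackend(Protocol):
    """Hypothetical interface that every model-provider adapter would satisfy."""

    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in backend for illustration; a real adapter would call a vendor API."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class LLMGateway:
    """Agents talk to the gateway, so the backend can be swapped without touching them."""

    def __init__(self, backend: LLMBackend) -> None:
        self._backend = backend

    def swap(self, backend: LLMBackend) -> None:
        self._backend = backend

    def ask(self, prompt: str) -> str:
        return self._backend.complete(prompt)


gateway = LLMGateway(EchoBackend("primary"))
first = gateway.ask("Suggest a next course")
gateway.swap(EchoBackend("secondary"))
second = gateway.ask("Suggest a next course")
```

Because the three agents would depend only on `ask`, changing the model provider becomes a configuration change rather than a code change in each agent.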
Learner Events → Data Collection Layer
POST /events/learner
{
  learner_id: string,
  event_type: enum [click, watch, skip, complete, search, quiz_score],
  course_id: string,
  timestamp: ISO8601,
  metadata: {
    watch_duration_seconds: int,
    quiz_score: float,
    module_id: string
  }
}
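As a minimal sketch of how the payload above might be checked before it is published to the event stream, the following mirrors the schema fields in a dataclass. The class name `LearnerEvent` and the `validate` method are illustrative assumptions, not part of the specified API.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Allowed values taken from the event_type enum in the schema above.
EVENT_TYPES = {"click", "watch", "skip", "complete", "search", "quiz_score"}


@dataclass
class LearnerEvent:
    learner_id: str
    event_type: str
    course_id: str
    timestamp: str  # ISO 8601, e.g. "2024-05-01T10:15:00+00:00"
    metadata: dict = field(default_factory=dict)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the event is acceptable."""
        problems = []
        if not self.learner_id:
            problems.append("learner_id is required")
        if self.event_type not in EVENT_TYPES:
            problems.append(f"unknown event_type: {self.event_type}")
        try:
            datetime.fromisoformat(self.timestamp)
        except ValueError:
            problems.append("timestamp is not ISO 8601")
        return problems


good = LearnerEvent("l1", "watch", "c1", "2024-05-01T10:15:00+00:00",
                    {"watch_duration_seconds": 120})
bad = LearnerEvent("l1", "hover", "c1", "yesterday")
```

Here `good.validate()` returns an empty list, while `bad.validate()` reports both the unknown event type and the malformed timestamp.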
Non-functional requirement in play: the ingestion path must sustain 50M+ events/day without bottlenecking model retraining.
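To put the 50M+ events/day figure in per-second terms, a quick calculation helps with capacity planning. The peak multiplier below is purely an assumed illustration; the source specifies only the daily volume.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_events_per_second(events_per_day: int) -> float:
    """Convert a daily event volume into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


avg_rate = average_events_per_second(50_000_000)  # roughly 578.7 events/second

# Assumption for illustration only: traffic peaks at 5x the daily average.
ASSUMED_PEAK_FACTOR = 5
peak_rate = avg_rate * ASSUMED_PEAK_FACTOR  # roughly 2,893.5 events/second
```

The average is modest, so the sizing question is mostly about bursts and about keeping retraining reads from competing with ingestion writes.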
Learner Profile
{
  learner_id: string,
  career_goal: string,
  current_role: string,
  skill_vector: float[], // updated after every completion
  engagement_score: float, // rolling 30-day
  dropout_probability: float, // recomputed daily
  learning_style: {
    preferred_format: enum [video, text, mixed],
    affinity_score: float,
    last_updated: ISO8601
  },
  data_tier: enum [default, opted_in], // privacy tier
  retention_expiry: ISO8601
}
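The `retention_expiry` field implies some periodic check for profiles whose retention window has passed. A minimal sketch follows, assuming profiles arrive as dictionaries shaped like the schema above and that timestamps without an offset are UTC; the function names are hypothetical.

```python
from datetime import datetime, timezone


def is_retention_expired(profile: dict, now: datetime) -> bool:
    """True when the profile's retention window has passed. `now` must be timezone-aware."""
    expiry = datetime.fromisoformat(profile["retention_expiry"])
    if expiry.tzinfo is None:
        # Assumption: timestamps stored without an offset are UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)
    return now >= expiry


def expired_learner_ids(profiles: list[dict], now: datetime) -> list[str]:
    """Collect learner_ids whose data is due for deletion or anonymisation."""
    return [p["learner_id"] for p in profiles if is_retention_expired(p, now)]


profiles = [
    {"learner_id": "l1", "retention_expiry": "2025-01-01T00:00:00+00:00"},
    {"learner_id": "l2", "retention_expiry": "2026-01-01T00:00:00+00:00"},
]
due = expired_learner_ids(profiles, datetime(2025, 6, 1, tzinfo=timezone.utc))
```

With the sample data, only `l1` is past its expiry on 1 June 2025.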
Course Metadata
{
  course_id: string,
  title: string,
  skill_tags: string[], // auto-generated by AI Metadata Agent on upload
  difficulty_score: float, // 1-5, AI-generated
  prerequisites: string[], // instructor-flagged + AI-inferred
  format: enum [video, text, live, mixed],
  duration_hours: float,
  quality_score: float, // computed from ratings + completion rate
  embedding_vector: float[], // stored in vector DB for RAG
  rag_indexed: boolean,
  rag_indexed_at: ISO8601
}
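To illustrate what semantic retrieval over `embedding_vector` amounts to, here is a brute-force cosine-similarity ranking in plain Python. It is a conceptual sketch only: in the described stack the vector database would perform this search, and the function names and sample vectors are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_courses(query_vector: list[float], courses: list[dict], k: int = 3) -> list[str]:
    """Rank RAG-indexed courses by similarity to the query embedding."""
    indexed = [c for c in courses if c["rag_indexed"]]
    ranked = sorted(
        indexed,
        key=lambda c: cosine_similarity(query_vector, c["embedding_vector"]),
        reverse=True,
    )
    return [c["course_id"] for c in ranked[:k]]


courses = [
    {"course_id": "c1", "embedding_vector": [1.0, 0.0], "rag_indexed": True},
    {"course_id": "c2", "embedding_vector": [0.0, 1.0], "rag_indexed": False},
    {"course_id": "c3", "embedding_vector": [1.0, 1.0], "rag_indexed": True},
]
nearest = top_k_courses([1.0, 0.1], courses, k=2)
```

Courses with `rag_indexed` set to false are skipped, so `c2` never appears in results regardless of its vector.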
Recommendation Log (every recommendation served is logged — NFR: Monitoring)
{
  log_id: string,
  learner_id: string,
  course_id: string,
  model_version: string,
  signals_used: string[],
  rank_at_serve: int,
  served_at: ISO8601,
  outcome: enum [clicked, enrolled, ignored, completed],
  outcome_recorded_at: ISO8601
}
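One way the log could feed monitoring is an engagement rate per `model_version`, so that a drop after a model rollout becomes visible. The sketch below assumes that every outcome other than `ignored` counts as engagement; that grouping and the function name are illustrative choices, not something the log schema defines.

```python
from collections import defaultdict

# Assumption: any outcome except "ignored" counts as engagement.
ENGAGED_OUTCOMES = {"clicked", "enrolled", "completed"}


def engagement_rate_by_model(logs: list[dict]) -> dict[str, float]:
    """Share of served recommendations that led to engagement, per model version."""
    served = defaultdict(int)
    engaged = defaultdict(int)
    for entry in logs:
        version = entry["model_version"]
        served[version] += 1
        if entry["outcome"] in ENGAGED_OUTCOMES:
            engaged[version] += 1
    return {version: engaged[version] / served[version] for version in served}


logs = [
    {"model_version": "v1", "outcome": "clicked"},
    {"model_version": "v1", "outcome": "ignored"},
    {"model_version": "v2", "outcome": "completed"},
]
rates = engagement_rate_by_model(logs)
```

For the sample entries, `v1` has one engaged outcome out of two served and `v2` has one out of one.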
| Scenario | Fallback Behaviour |
|---|---|
| Recommendation engine returns no results | Fall back to the content-based model only and surface top-rated courses in the learner's declared interest area |
| RAG pipeline exceeds 2s response time | Return a partial answer prefixed with "Based on available course information..." plus a prompt to view the full syllabus |
| LLM API unavailable | Agents switch to rule-based nudges; Learning Path Agent serves static path until LLM recovers |
| Learner skips onboarding | Cold-start defaults to the most popular courses in the learner's stated career category until 10 behavioural events are recorded |
| Dropout probability model not yet trained for new learner | Re-engagement Agent uses time-based rules (no activity for 5 days) as fallback trigger |
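Two of the fallback rules in the table can be expressed as small decision functions. This is a sketch under stated assumptions: the function names and return labels are hypothetical, and the dropout threshold of 0.7 is an invented placeholder, since the source gives only the 10-event and 5-day figures.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

COLD_START_EVENT_THRESHOLD = 10        # from the table: 10 behavioural events
INACTIVITY_TRIGGER = timedelta(days=5)  # from the table: no activity for 5 days
ASSUMED_DROPOUT_THRESHOLD = 0.7        # assumption: not specified in the source


def recommendation_source(skipped_onboarding: bool, behavioural_events: int) -> str:
    """Pick cold-start defaults until enough behaviour has been observed."""
    if skipped_onboarding and behavioural_events < COLD_START_EVENT_THRESHOLD:
        return "popular_in_career_category"
    return "personalised"


def should_trigger_reengagement(
    dropout_probability: Optional[float],
    last_activity: datetime,
    now: datetime,
) -> bool:
    """Use the model score when available, otherwise the time-based rule."""
    if dropout_probability is None:
        return now - last_activity >= INACTIVITY_TRIGGER
    return dropout_probability >= ASSUMED_DROPOUT_THRESHOLD


now = datetime(2025, 6, 10, tzinfo=timezone.utc)
new_learner_idle = should_trigger_reengagement(None, now - timedelta(days=6), now)
new_learner_active = should_trigger_reengagement(None, now - timedelta(days=2), now)
```

Passing `None` for `dropout_probability` stands in for the "model not yet trained" case, which routes the decision to the inactivity rule.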
Learner Journey → Tech Component Mapping