Recipe Discovery Platform: Building an ML-Powered Recipe Discovery Platform with Flask and React
Overview
This project is a second iteration of the previous machine learning-powered recipe recommendation system that helps users discover recipes based on their preferences and cooking constraints. Using K-Means clustering and similarity scoring, the platform provides personalized recommendations while maintaining a modern, responsive user interface built with React and Flask. This provides a much enhanced user experience compared to the original Streamlit dashboard.
Live Link
Architecture
ML-Driven Architecture
- Recommendation Engine: K-Means clustering with similarity scoring for personalized recipe discovery
- Data Pipeline: Async web scraping with ingredient enrichment and feature engineering
- Model Persistence: Joblib serialization with automatic model loading and caching
Application Architecture
- Backend Service: Flask with blueprint-based modular architecture and SQLAlchemy ORM
- Frontend: React 19 with TypeScript, React Router, and Tailwind CSS v4
- Database Layer: Dual SQLite/PostgreSQL support with automatic migration
Technologies Used
- Backend:
- Flask 3.1+ with factory pattern and blueprint organization
- SQLAlchemy for ORM with Flask-Login for authentication
- Scikit-learn for ML models and feature scaling
- Async web scraping with aiohttp and BeautifulSoup
- Frontend:
- React 19 with hooks and concurrent features
- TypeScript for type safety across the application
- Tailwind CSS v4 with Headless UI components
- Vite for fast development and optimized builds
- ML Pipeline:
- K-Means clustering for recipe similarity
- Feature engineering with complexity scoring
- StandardScaler for feature normalization
- DevOps:
- Multi-stage Docker builds with frontend compilation
- Railway deployment with PostgreSQL integration
- Automated testing with Pytest and Vitest
Features
- Smart Recommendations:
- ML-powered recipe clustering based on cooking time, complexity, and ingredients
- Text-based search with relevance scoring
- Parameter-based filtering with real-time results
- User Management:
- Secure authentication with Flask-Login sessions
- Password reset via Resend email service
- Rate limiting for abuse prevention
- Recipe Collections:
- Save and organize favorite recipes
- Personal notes and recently viewed tracking
- Export capabilities for recipe data
- Admin Dashboard:
- System health monitoring and email configuration
- Rate limit inspection and maintenance tools
- Data cleanup and analytics endpoints
Development Process
Motivation and Evolution
This project addressed the challenge of recipe discovery in an overwhelming digital cookbook landscape:
- Started with Food.com dataset analysis and feature exploration
- Evolved from simple clustering to comprehensive recommendation system
- Integrated real-time ingredient scraping for enhanced recipe data
- Built responsive frontend to make ML insights accessible
Architecture Decisions
- K-Means over Collaborative Filtering: Chose clustering for better cold-start performance and interpretability
- Flask Blueprints over Monolithic Routes: Modular organization enabling team development and testing
- React SPA over Server-Side Rendering: Client-side routing for better user experience and API separation
- Async Scraping over Batch Processing: Real-time ingredient enrichment with respectful rate limiting
Workflow
- Dataset analysis and exploratory data analysis with visualization
- Feature engineering pipeline with complexity scoring and rating aggregation
- ML model optimization using elbow method and silhouette analysis
- Async ingredient scraping system with SQLite persistence
- Flask API development with comprehensive error handling
- React frontend with TypeScript and component-based architecture
- Integration testing across ML pipeline and API endpoints
Key Advantages
- Sub-100ms recommendation response times through clustering optimization
- 90%+ test coverage ensuring reliability across ML and web components
- Type-safe frontend-to-backend communication with comprehensive API documentation
- Production-ready deployment with automatic SSL and database persistence
Implementation Details
ML Recommendation Pipeline
The recommendation system uses a sophisticated clustering approach:
class RecipeRecommender:
def __init__(self, recipes_df, n_clusters=6):
self.scaler = StandardScaler()
self.kmeans = KMeans(n_clusters=n_clusters, random_state=42)
self.feature_names = ["minutes", "complexity_score", "ingredient_count"]
self._prepare_data()
def recommend_recipes(self, desired_time, desired_complexity, desired_ingredients):
# Create user preference vector
user_input = pd.DataFrame([[desired_time, desired_complexity, desired_ingredients]],
columns=self.feature_names)
user_input_scaled = self.scaler.transform(user_input)
# Find nearest cluster
cluster = self.kmeans.predict(user_input_scaled)[0]
cluster_recipes = self.data[self.data["cluster"] == cluster].copy()
# Calculate similarity distance
cluster_recipes["similarity_distance"] = cluster_recipes.apply(
lambda x: np.sqrt((x["minutes"] - desired_time) ** 2 +
(x["complexity_score"] - desired_complexity) ** 2 +
(x["ingredient_count"] - desired_ingredients) ** 2), axis=1
)
return cluster_recipes.nsmallest(20, "similarity_distance")Async Ingredient Enrichment
Real-time ingredient scraping enhances recipe data quality:
class AsyncIngredientScraper:
async def scrape_dataset(self, recipes_df, batch_size=50, delay_between_batches=1.0):
async with aiohttp.ClientSession() as session:
for batch_start in range(0, len(recipes_df), batch_size):
batch = recipes_df.iloc[batch_start:batch_start + batch_size]
tasks = [
self.scrape_recipe_ingredients(recipe_id, recipe_name)
for recipe_id, recipe_name in batch[['food_recipe_id', 'name']].values
]
await asyncio.gather(*tasks, return_exceptions=True)
await asyncio.sleep(delay_between_batches)React Component Architecture
Frontend components follow modern React patterns with hooks:
export default function SaveButton({ recipeId, onSaveChange }: SaveButtonProps) {
const [isSaved, setIsSaved] = useState(false);
const [isLoading, setIsLoading] = useState(true);
useEffect(() => {
const checkSaveStatus = async () => {
try {
setIsLoading(true);
const saved = await isRecipeSaved(recipeId);
setIsSaved(saved);
} catch (error) {
console.error('Failed to check save status:', error);
} finally {
setIsLoading(false);
}
};
checkSaveStatus();
}, [recipeId]);
const handleToggle = async (e: React.MouseEvent) => {
e.stopPropagation();
try {
setIsToggling(true);
const newSavedState = await toggleSaveRecipe(recipeId);
setIsSaved(newSavedState);
onSaveChange?.(newSavedState);
} catch (error) {
console.error('Failed to toggle save state:', error);
} finally {
setIsToggling(false);
}
};
}Deployment
The application uses a multi-stage Docker build for optimal production deployment:
# Frontend build stage
FROM node:20-alpine AS frontend-build
WORKDIR /app/frontend
COPY food-recipe-recommender/app/frontend/package*.json ./
RUN npm ci
COPY food-recipe-recommender/app/frontend/ ./
RUN npm run build
# Runtime stage
FROM python:3.11-slim AS runtime
WORKDIR /app
# Install Python dependencies
COPY pyproject.toml uv.lock ./
RUN pip install uv && uv sync
# Copy built frontend into Flask templates/static
COPY --from=frontend-build /app/frontend/dist/index.html /app/food-recipe-recommender/app/templates/
COPY --from=frontend-build /app/frontend/dist/assets /app/food-recipe-recommender/app/static/assets
# Start with Waitress WSGI server
CMD ["uv", "run", "waitress-serve", "--listen=0.0.0.0:8080", "--call", "app:create_app"]Railway deployment provides managed PostgreSQL and automatic SSL with environment-based configuration.
Lessons Learned
Throughout this project, I gained valuable insights:
- ML Model Interpretability: K-Means clustering provided better user understanding compared to black-box approaches
- Async Python Complexity: Managing concurrent web scraping required careful rate limiting and error handling
- React State Management: Local state with context proved sufficient for this application scale
- API Design Balance: Finding the sweet spot between RESTful principles and frontend data requirements
- Testing ML Systems: Integration testing with real data uncovered edge cases missed by unit tests
Future Improvements
Planned enhancements include:
- Enhanced ML Features: Collaborative filtering integration and ingredient substitution suggestions
- Social Features: Recipe sharing, user reviews, and community collections
- Mobile App: React Native version with offline recipe storage
- Advanced Search: Semantic search using embeddings and dietary restriction filtering
- Recipe Generation: AI-powered recipe creation based on available ingredients
- Nutrition Integration: Calorie counting and nutritional analysis with meal planning
Performance Metrics
The system demonstrates strong performance characteristics:
- Recommendation Latency: <100ms average response time for 20 recommendations
- Search Performance: <200ms for text-based queries across 200K+ recipes
- Frontend Bundle: <500KB gzipped with code splitting
- Test Coverage: >90% backend, >85% frontend
- Database Queries: Optimized with proper indexing for sub-50ms response times
- Memory Usage: <150MB including loaded ML model in production