Monorepo Structure
GeoFlow is organized as a monorepo that uses Turbo for build orchestration and Bun as the package manager. The repository contains multiple applications (under apps/) and packages (under packages/).
Service Architecture
GeoFlow runs as a distributed system of several interconnected services, orchestrated via Docker Compose.
Core Services
GeoFlow App (apps/geoflow)
- Technology: React 19, TanStack Router, Vite
- Purpose: Web interface for workflow design, data visualization, and system monitoring
- Port: 3000 (development)
- Key Features:
  - Drag-and-drop workflow builder
  - Real-time execution monitoring
  - Data upload and download
  - User authentication and authorization
Convex Backend (apps/backend)
- Technology: Convex (self-hosted)
- Purpose: Data persistence, real-time subscriptions, and authentication
- Ports: 3210 (backend), 3211 (site proxy), 6791 (dashboard)
- Key Features:
  - Real-time data synchronization
  - User management and authentication
  - Workflow and execution metadata storage
  - File upload coordination
PostgreSQL + PostGIS (postgres)
- Technology: PostgreSQL 15 with PostGIS 3.3
- Purpose: Spatial data storage and querying
- Port: 5432
- Key Features:
  - Geospatial data types and functions
  - Spatial indexing and queries
  - Coordinate system transformations
  - Large dataset handling
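To illustrate how spatial queries and coordinate transformations can combine, here is a minimal sketch of a parameterized radius query. The `features` table, its `geom` column, the storage SRID (EPSG:32633, a metre-based UTM zone) and the function name are all assumptions for illustration, not taken from the GeoFlow schema. `ST_Transform`, `ST_DWithin`, `ST_SetSRID`, `ST_MakePoint` and `ST_AsGeoJSON` are standard PostGIS functions.

```python
from typing import Tuple

# Assumed SRID of the stored geometries (WGS 84 / UTM zone 33N, units in metres).
STORAGE_SRID = 32633


def nearby_features_query(lon: float, lat: float, radius_m: float) -> Tuple[str, tuple]:
    """Build a parameterized query for features within radius_m metres of a WGS 84 point.

    The input point is given in EPSG:4326 and transformed to the storage SRID,
    so the comparison runs against the stored geometry column directly and can
    use a GiST index on that column if one exists.
    """
    sql = (
        "SELECT id, ST_AsGeoJSON(geom) AS geojson "
        "FROM features "  # hypothetical table name
        "WHERE ST_DWithin("
        "geom, "
        f"ST_Transform(ST_SetSRID(ST_MakePoint(%s, %s), 4326), {STORAGE_SRID}), "
        "%s)"
    )
    # Placeholders are filled by the database driver, never by string formatting.
    return sql, (lon, lat, radius_m)
```

The returned pair can be passed to a driver that uses `%s` placeholders, for example `cursor.execute(sql, params)` with psycopg.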
Processing Services
PDAL Worker (packages/worker)
- Technology: Node.js, PDAL, GDAL
- Purpose: High-performance point cloud and geospatial data processing
- Port: 3002
- Key Features:
  - LiDAR data processing
  - Point cloud filtering and transformation
  - Raster processing
  - Format conversion (LAS, LAZ, GeoTIFF, etc.)
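PDAL describes processing as a JSON pipeline of reader, filter and writer stages. As a sketch of what point cloud filtering plus format conversion could look like, the function below builds a pipeline that keeps ground-classified points from a LAS file and writes them as compressed LAZ. The function name and file names are hypothetical, and how the worker actually constructs or runs its pipelines is not documented here.

```python
import json


def build_ground_filter_pipeline(src: str, dst: str) -> str:
    """Return a PDAL pipeline (as JSON text) that keeps ground points and writes LAZ."""
    stages = [
        # Read the input LAS/LAZ file.
        {"type": "readers.las", "filename": src},
        # Keep only points with ASPRS classification 2 (ground).
        {"type": "filters.range", "limits": "Classification[2:2]"},
        # Write the result with LASzip compression.
        {"type": "writers.las", "filename": dst, "compression": "laszip"},
    ]
    return json.dumps({"pipeline": stages}, indent=2)


if __name__ == "__main__":
    print(build_ground_filter_pipeline("input.las", "ground.laz"))
```

Saved to a file, a pipeline like this can be run with the PDAL command-line tool via `pdal pipeline <file>`.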
Motia Workflow Engine (apps/motia)
- Technology: Motia framework, Node.js
- Purpose: Orchestrates complex geospatial processing pipelines
- Port: 4010
- Key Features:
  - Event-driven workflow execution
  - Step-based processing pipelines
  - Error handling and retry logic
  - Parallel processing capabilities
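The retry behaviour mentioned above can be pictured with a generic retry-with-exponential-backoff helper. This is not Motia's API: `run_with_retry`, its parameters and the default values are hypothetical, and the engine's real retry policy may differ.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retry(
    step: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call step() until it succeeds or max_attempts is reached.

    The delay doubles after each failed attempt: base_delay, 2 * base_delay, ...
    The sleep function is injectable so the helper can be tested without waiting.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception as exc:  # a real engine would likely retry only transient errors
            last_error = exc
            if attempt == max_attempts:
                break
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"step failed after {max_attempts} attempts") from last_error
```

Passing the step as a zero-argument callable keeps the helper independent of what the step does, so the same wrapper could guard an HTTP call to a worker or a database write.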
Supporting Services
MCP Server (packages/mcp)
- Technology: Python, FastAPI, GeoPandas, PySAL
- Purpose: AI-powered geospatial analysis functions
- Port: 8000
- Key Features:
  - Spatial statistics and analysis
  - Machine learning model integration
  - Geospatial AI assistants
  - Custom analysis functions
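For quick reference, the ports listed in the service descriptions above can be collected into one registry. The following is a minimal sketch: the port numbers come from this page, while the dictionary keys, the `base_url` helper and the `localhost` default are assumptions for a local development setup.

```python
# Ports as listed in the service descriptions on this page.
# The keys are illustrative labels, not names taken from the GeoFlow codebase.
SERVICE_PORTS = {
    "geoflow-app": 3000,
    "convex-backend": 3210,
    "convex-site-proxy": 3211,
    "convex-dashboard": 6791,
    "postgres": 5432,
    "pdal-worker": 3002,
    "motia": 4010,
    "mcp-server": 8000,
}


def base_url(service: str, host: str = "localhost") -> str:
    """Return an HTTP base URL for a service.

    The default host assumes local development; inside a Docker network the
    host would typically be the Compose service name instead.
    """
    if service == "postgres":
        raise ValueError("postgres uses the PostgreSQL wire protocol, not HTTP")
    return f"http://{host}:{SERVICE_PORTS[service]}"
```

For example, `base_url("motia")` yields `http://localhost:4010`.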
Data Flow
Workflow Execution Flow
- Workflow Design: User creates workflow in GeoFlow App
- Storage: Workflow definition stored in Convex
- Trigger: Workflow execution initiated via API or UI
- Orchestration: Motia Engine coordinates execution steps
- Processing: Individual steps executed by appropriate services
- Data Storage: Results stored in PostgreSQL/PostGIS
- Notification: Real-time updates sent to UI via Convex
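The orchestration, processing, storage and notification stages of this flow can be sketched as a single loop. Everything in the example is hypothetical: the shape of the workflow definition, the handler lookup by service name and the callback signatures are assumptions chosen to show the ordering, not the actual Motia or Convex interfaces.

```python
from typing import Any, Callable, Dict


def execute_workflow(
    definition: Dict[str, Any],
    handlers: Dict[str, Callable[[Any], Any]],
    store_result: Callable[[str, Any], None],
    notify: Callable[[str, str], None],
) -> Any:
    """Run the steps of a workflow in order, feeding each output into the next step.

    handlers maps a service name to the function that processes a step,
    store_result stands in for writing results to PostgreSQL/PostGIS, and
    notify stands in for the real-time status updates sent via Convex.
    """
    data = definition.get("input")
    for step in definition["steps"]:
        notify(step["id"], "running")
        data = handlers[step["service"]](data)  # Processing
        store_result(step["id"], data)          # Data Storage
        notify(step["id"], "completed")         # Notification
    return data
```

Injecting the storage and notification callbacks keeps the loop free of service-specific code, which mirrors the separation between the orchestrator and the services it coordinates.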
Data Storage Strategy
- Metadata: Workflow definitions, execution logs, user data → Convex
- Spatial Data: Geospatial datasets, processed results → PostGIS
- Files: Raw uploads, temporary processing files → Local storage
- Cache: Frequently accessed data → Redis (planned)
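The mapping above amounts to a routing table from data kind to storage backend. A minimal sketch, assuming hypothetical names for the enum, the backend labels and the lookup function:

```python
from enum import Enum
from typing import Dict, Optional


class DataKind(Enum):
    METADATA = "metadata"  # workflow definitions, execution logs, user data
    SPATIAL = "spatial"    # geospatial datasets, processed results
    FILE = "file"          # raw uploads, temporary processing files
    CACHE = "cache"        # frequently accessed data


# Backend labels are illustrative. Redis is listed as planned,
# so no cache backend is assumed to exist yet.
STORE_FOR_KIND: Dict[DataKind, Optional[str]] = {
    DataKind.METADATA: "convex",
    DataKind.SPATIAL: "postgis",
    DataKind.FILE: "local-storage",
    DataKind.CACHE: None,
}


def store_for(kind: DataKind) -> str:
    """Return the backend label for a data kind, or raise if none is available yet."""
    store = STORE_FOR_KIND[kind]
    if store is None:
        raise NotImplementedError(f"no backend available yet for {kind.value}")
    return store
```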
Development vs Production
Development Mode
- Hot reloading enabled for all services
- Volume mounts for live code changes
- Simplified configurations
- Debug logging enabled
- Local database with sample data
Production Mode
- Optimized builds and images
- Environment-specific configurations
- Proper secrets management
- Monitoring and logging
- Backup strategies
Networking and Communication
Services communicate through:
- HTTP APIs: RESTful endpoints for service-to-service communication
- WebSockets: Real-time updates via Convex
- Database Connections: Direct PostgreSQL connections for data access
- File System: Shared volumes for large data transfers
- Docker Networks: Isolated service communication
Scalability Considerations
- Horizontal Scaling: Most services can be scaled horizontally
- Load Balancing: Nginx or similar for API services
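- Database Partitioning: PostgreSQL table partitioning, which also applies to PostGIS tables, for large datasets (partitioning within one database rather than sharding across servers)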
- Caching: Redis integration planned for performance optimization
- Storage: Support for S3-compatible storage for large files
Security Architecture
- Authentication: JWTs via Better Auth
- Authorization: Role-based access control
- Network Security: Service isolation via Docker networks
- Data Encryption: TLS for external communications
- Secrets Management: Environment variables and Docker secrets
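Role-based access control reduces to checking whether any of a user's roles grants the requested permission. The sketch below shows that check in isolation. The role names, permission strings and `is_allowed` helper are hypothetical, since the roles GeoFlow defines are not listed here, and verification of the JWT that carries the roles is out of scope for the example.

```python
from typing import Dict, Iterable, Set

# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "viewer": {"workflow:read"},
    "editor": {"workflow:read", "workflow:write", "workflow:run"},
    "admin": {"workflow:read", "workflow:write", "workflow:run", "user:manage"},
}


def is_allowed(roles: Iterable[str], permission: str) -> bool:
    """Return True if at least one role grants the permission.

    Unknown roles grant nothing, so the check fails closed.
    """
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

For example, `is_allowed(["viewer"], "workflow:run")` is `False`, while `is_allowed(["editor"], "workflow:run")` is `True`.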