DeepSeek released smallpond, a big data processing framework built on top of Ray.
- Targets high-performance data processing
- Provides a high-level dataframe API
- Targets petabyte-level scaling
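To make "high-level dataframe API" a little more concrete, here is a minimal sketch of the pattern such frameworks typically hide: hash-partition rows by a key, then aggregate each partition independently. This is plain Python and is not smallpond's API. The helper names (`hash_partition`, `min_max_by_key`) and the sample rows are hypothetical, chosen only for illustration.

```python
from zlib import crc32


def hash_partition(rows, key, num_partitions):
    """Route each row to a partition using a stable hash of its key column."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        idx = crc32(str(row[key]).encode()) % num_partitions
        partitions[idx].append(row)
    return partitions


def min_max_by_key(partition, key, value):
    """Aggregate one partition on its own, as a single worker would."""
    stats = {}
    for row in partition:
        lo, hi = stats.get(row[key], (row[value], row[value]))
        stats[row[key]] = (min(lo, row[value]), max(hi, row[value]))
    return stats


# Hypothetical sample data, for illustration only.
rows = [
    {"ticker": "AAA", "price": 10.0},
    {"ticker": "BBB", "price": 7.0},
    {"ticker": "AAA", "price": 12.5},
    {"ticker": "CCC", "price": 3.0},
    {"ticker": "BBB", "price": 9.0},
]

partitions = hash_partition(rows, "ticker", 3)

# All rows for a given key land in the same partition, so the
# per-partition results can be merged without conflicts.
result = {}
for part in partitions:
    result.update(min_max_by_key(part, "ticker", "price"))
```

In a distributed framework, each partition would be handled by a separate worker and the data would live in files rather than in memory; the sketch only shows the shape of the computation.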
The challenges around training data prep only grow when you include multimodal data.