Trending

Content tagged with "data-engineering"

Hacker News

Top stories from the Hacker News community • Updated 7 minutes ago

Reddit

Top posts from tech subreddits • Updated 1 minute ago

Hugging Face Trending

Popular models from Hugging Face • Updated 19 minutes ago

No models found

GitHub Trending

Popular repositories from GitHub • Updated 33 minutes ago

yfinance

Download market data from Yahoo! Finance's API

pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

mindsdb

AI analytics engine that can answer questions over large-scale data. The only MCP server you'll ever need.

starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.

ClickHouse

ClickHouse® is a real-time analytics database management system

dolphinscheduler

Apache DolphinScheduler is a modern data orchestration platform for quickly building high-performance workflows with low code.

seatunnel

SeaTunnel is a multimodal, high-performance, distributed, massive data integration tool.

lance

Modern columnar data format for ML and LLMs, implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.