Trending

Content tagged with "data-engineering"

Hacker News

Top stories from the Hacker News community • Updated 4 minutes ago

Reddit

Top posts from tech subreddits • Updated 4 minutes ago

Hugging Face Trending

Popular models from Hugging Face • Updated about 1 hour ago

No models found

GitHub Trending

Popular repositories from GitHub • Updated less than a minute ago

pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

mindsdb

Federated query engine for AI - The only MCP Server you'll ever need

milvus

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

simdjson

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

juicefs

JuiceFS is a distributed POSIX file system built on top of Redis and S3.

presidio

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

superset

Apache Superset is a Data Visualization and Data Exploration Platform

redisson

Redisson - Valkey & Redis Java client. Real-Time Data Platform. Sync/Async/RxJava/Reactive API. Over 50 Valkey and Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Bloom filter, Spring, Tomcat, Scheduler, JCache API, Hibernate, RPC, local cache..

airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows