Trending

Content tagged with "data-engineering"

Hacker News

Top stories from the Hacker News community • Updated 7 minutes ago

Reddit

Top posts from tech subreddits • Updated 16 minutes ago

Hugging Face Trending

Popular models from Hugging Face • Updated 35 minutes ago

No models found

GitHub Trending

Popular repositories from GitHub • Updated about 1 hour ago

pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

mindsdb

Federated query engine for AI - The only MCP Server you'll ever need

milvus

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

simdjson

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

juicefs

JuiceFS is a distributed POSIX file system built on top of Redis and S3.

presidio

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

superset

Apache Superset is a Data Visualization and Data Exploration Platform

redisson

Redisson - Valkey & Redis Java client. Real-Time Data Platform. Sync/Async/RxJava/Reactive API. Over 50 Valkey and Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Bloom filter, Spring, Tomcat, Scheduler, JCache API, Hibernate, RPC, local cache.

airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows