Trending

Content tagged with "data-engineering"

data-engineering

Hacker News

Top stories from the Hacker News community• Updated 6 minutes ago

Reddit

Top posts from tech subreddits• Updated 16 minutes ago

Hugging Face Trending

Popular models from Hugging Face• Updated 34 minutes ago

No models found

Try removing the tag filter or searching for different content.

GitHub Trending

Popular repositories from GitHub• Updated about 1 hour ago

arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics

lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming..

martin

Blazing fast and lightweight PostGIS, MBtiles and PMtiles tile server, tile generation, and mbtiles tooling.

superset

Apache Superset is a Data Visualization and Data Exploration Platform

redisson

Redisson - Valkey and Redis Java client. Real-Time Data Platform. Sync/Async/RxJava/Reactive API. Over 50 Valkey and Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Bloom filter, Spring, Tomcat, Scheduler, JCache API, Hibernate, RPC, local cache..

dora

DORA (Dataflow-Oriented Robotic Architecture) is middleware designed to streamline and simplify the creation of AI-based robotic applications. It offers low latency, composable, and distributed dataflow capabilities. Applications are modeled as directed graphs, also referred to as pipelines.

simdjson

Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

milvus

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

faiss

A library for efficient similarity search and clustering of dense vectors.