Trending

Content tagged with "data-engineering"


InfoQ

Latest articles from InfoQ • Updated 7 minutes ago

Presentation: Lessons Learned From Shipping AI-Powered Healthcare Products

Clara Matos discusses the journey of shipping AI-powered healthcare products at Sword Health. She explains how to implement input/output guardrails for regulated industries and shares a framework for robust evaluations using human and LLM-based ratings. From prompt engineering to RAG and user feedback loops, she shares a data-driven roadmap for building reliable AI care agents at scale. By Clara Matos

Article: Architecture in a Flow of AI-Augmented Change

While AI adoption is surging, most organizations fail to scale past pilots. The solution lies in organizational structure, not just technology. This article details how architects can enable "fast flow" by defining clear domains and guardrails. Learn how to shift from controlling outcomes to curating context, allowing AI to drive continuous, valuable business change. By Jonathan McPhail, Juan Medina, Jake DeCrane, Isuru Wijesundara

1 day ago

QCon AI New York 2025: Moving Mountains: Migrating Legacy Code in Weeks Instead of Years

David Stein, Principal AI Engineer at ServiceTitan, presented “Moving Mountains: Migrating Legacy Code in Weeks Instead of Years” at QCon AI New York 2025. Stein demonstrated that migrations don’t have to be synonymous with “moving mountains” and introduced two concepts: the Principle of Acceleration and the Assembly Line Pattern. By Michael Redlich

Article: NextGen Search - Where AI Meets OpenSearch Through MCP

In this article, authors Srikanth Daggumalli and Arun Lakshmanan discuss next-generation context-aware conversational search using OpenSearch and AI agents powered by Large Language Models (LLMs) and Model Context Protocol (MCP). By Srikanth Daggumalli, Arun Lakshmanan

TornadoVM 2.0 Brings Automatic GPU Acceleration and LLM support to Java

The TornadoVM project recently reached version 2.0, a major milestone for the open-source project that aims to provide a heterogeneous hardware runtime for Java. The project automatically accelerates Java programs on multi-core CPUs, GPUs, and FPGAs. This release is likely to be of particular interest to teams developing LLM solutions on the JVM. By Ben Evans

Presentation: Powering Enterprise AI Applications with Data and Open Source Software

Francisco Javier Arceo explored Feast, the open-source feature store designed to address common data challenges in the AI/ML lifecycle, such as feature redundancy and low-latency serving at scale. By Francisco Javier Arceo

Yelp Publishes Blueprint for Managing S3 Server-Access Logs at Massive Scale

In a detailed engineering post, Yelp shared how it built a scalable and cost-efficient pipeline for processing Amazon S3 server-access logs (SAL) across its infrastructure, overcoming traditional limitations of raw log storage and querying at high volume. By Craig Risi

Magika 1.0: Smarter, Faster File Detection with Rust and AI

Google has just released version 1.0 of Magika, a substantial rewrite of its open-source file type detection system. The new version leverages AI to support a broader range of file types and is built in Rust for maximum speed and security. By Sergio De Simone

Breaking Silos: Netflix Introduces Upper Metamodel to Bring Consistency Across Content Engineering

Netflix has introduced the Upper metamodel within its Unified Data Architecture (UDA) to standardize domain definitions and generate consistent data container representations. UDA links conceptual models to GraphQL, Avro, SQL, and Java artifacts, supporting projections, mappings, and knowledge graph-based discovery across content, advertising, and operational systems. By Leela Kumili

GitHub Trending

Popular repositories from GitHub • Updated 4 minutes ago

cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

superset

Apache Superset is a Data Visualization and Data Exploration Platform

flink-cdc

Flink CDC is a streaming data integration tool

Data-Science-For-Beginners

10 Weeks, 20 Lessons, Data Science for All!

datahub

The Metadata Platform for your Data and AI Stack

simdjson

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

redisson

Redisson - Valkey & Redis Java client. Real-Time Data Platform. Sync/Async/RxJava/Reactive API. Over 50 Valkey and Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Bloom filter, Spring, Tomcat, Scheduler, JCache API, Hibernate, RPC, local cache.

feast

The Open Source Feature Store for AI/ML