Trending
Content tagged with "data-engineering"
Hacker News
Top stories from the Hacker News community• Updated 1 minute ago
InfoQ
Latest articles from InfoQ• Updated 13 minutes ago
Article: Architecture in a Flow of AI-Augmented Change
While AI adoption is surging, most organizations fail to scale past pilots. The solution lies in organizational structure, not just technology. This article details how architects can enable "fast flow" by defining clear domains and guardrails. Learn how to shift from controlling outcomes to curating context, allowing AI to drive continuous, valuable business change. By Jonathan McPhail, Juan Medina, Jake DeCrane, Isuru Wijesundara
QCon AI New York 2025: Moving Mountains: Migrating Legacy Code in Weeks Instead of Years
David Stein, Principal AI Engineer at ServiceTitan, presented “Moving Mountains: Migrating Legacy Code in Weeks Instead of Years” at QCon AI New York 2025. Stein demonstrated how migrations don’t have to be synonymous with “moving mountains” and introduced the concepts of the Principle of Acceleration and the Assembly Line Pattern. By Michael Redlich
Article: NextGen Search - Where AI Meets OpenSearch Through MCP
In this article, authors Srikanth Daggumalli and Arun Lakshmanan discuss next-generation context-aware conversational search using OpenSearch and AI agents powered by Large Language Models (LLMs) and Model Context Protocol (MCP). By Srikanth Daggumalli, Arun Lakshmanan
TornadoVM 2.0 Brings Automatic GPU Acceleration and LLM support to Java
The TornadoVM project recently reached version 2.0, a major milestone for the open-source project that aims to provide a heterogeneous hardware runtime for Java. The project automatically accelerates Java programs on multi-core CPUs, GPUs, and FPGAs. This release is likely to be of particular interest to teams developing LLM solutions on the JVM. By Ben Evans
Presentation: Powering Enterprise AI Applications with Data and Open Source Software
Francisco Javier Arceo explored Feast, the open-source feature store designed to address common data challenges in the AI/ML lifecycle, such as feature redundancy and low-latency serving at scale. By Francisco Javier Arceo
Yelp Publishes Blueprint for Managing S3 Server-Access Logs at Massive Scale
In a detailed engineering post, Yelp shared how it built a scalable and cost-efficient pipeline for processing Amazon S3 server-access logs (SAL) across its infrastructure, overcoming traditional limitations of raw log storage and querying at high volume. By Craig Risi
Magika 1.0: Smarter, Faster File Detection with Rust and AI
Google has just released version 1.0 of Magika, a substantial rewrite of its open-source file type detection system. The new version leverages AI to support a broader range of file types and is built in Rust for maximum speed and security. By Sergio De Simone
Breaking Silos: Netflix Introduces Upper Metamodel to Bring Consistency Across Content Engineering
Netflix has introduced the Upper metamodel within its Unified Data Architecture (UDA) to standardize domain definitions and generate consistent data container representations. UDA links conceptual models to GraphQL, Avro, SQL, and Java artifacts, supporting projections, mappings, and knowledge graph-based discovery across content, advertising, and operational systems. By Leela Kumili
Learnings from Cultivating Machine Learning Engineers as a Team Manager
As an AI team manager, Vivek Gupta stays broadly informed to guide AI experts effectively and drive the team. Engineers need feedback on both technical and interpersonal skills, Gupta mentioned at Dev Summit Boston. He stresses learning time, asking for help, and cross-team collaboration. Mentorship, data handling, and human-in-the-loop validation are key to success for machine learning engineers. By Ben Linders
Reddit
Top posts from tech subreddits• Updated 31 minutes ago
‘Uniquely evil’: Michigan residents fight against huge data center backed by top tycoons
GitHub Trending
Popular repositories from GitHub• Updated 28 minutes ago
databend
Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
pimcore
Core Framework for the Open Core Data & Experience Management Platform (PIM, MDM, CDP, DAM, DXP/CMS & Digital Commerce)
API-s-for-OSINT
List of APIs for gathering information about phone numbers, addresses, domains, etc.
dice
DiceDB is an open-source, fast, reactive, in-memory database optimized for modern hardware.
martin
Blazing-fast, lightweight tile server for PostGIS, MBTiles, and PMTiles, with tile generation and MBTiles tooling.
dora
DORA (Dataflow-Oriented Robotic Architecture) is middleware designed to streamline and simplify the creation of AI-based robotic applications. It offers low-latency, composable, and distributed dataflow capabilities. Applications are modeled as directed graphs, also referred to as pipelines.