Analytics | News, how-tos, features, reviews, and videos
How to talk to machines: 10 secrets of prompt engineering
Prompt engineering is the newest art of convincing machines to do what humans want. Here are 10 things you need to know about writing LLM prompts.
What is Apache Spark? The big data platform that crushed Hadoop
Fast, flexible, and developer-friendly, Apache Spark is the leading platform for large-scale SQL, batch processing, stream processing, and machine learning.
Snowflake’s Data Clean Room promises to ease analysis of PII data
Free Snowflake Native Application appears to eliminate the complexity of self-managing a data clean room for non-technical users, analysts said.
Steampipe dashboards and benchmarks for your data
Use Powerpipe to visualize and validate data in your own Postgres, SQLite, DuckDB, or MySQL database.
The world needs more (and better) open map data
The best map data and the most advanced mapping features have been proprietary. The Overture Maps Foundation aims to change that.
Evaluating databases for sensor data
Sensor data and IoT applications have special requirements that might be better served by a specialized database. Here’s what to consider.
How a new database architecture supports scale and reliability in TiDB
TiDB is a prime example of an intrinsically scalable and reliable distributed SQL database architecture. Here’s how it works.
LinkedIn open sources lakehouse tool OpenHouse
The tool is already in use at LinkedIn with more than 3,500 managed OpenHouse tables in production, serving more than 550 daily active users.
Why SQL still rules
So many programming languages have come and gone, but SQL remains. And it has a bright future still.
Google delivers Gemini LLM support to BigQuery data warehouse
BigQuery’s update will make it easier to prepare data for AI with speech-to-text and document processing.
3 dynamic use cases for Apache Flink and stream processing
We live in a world in motion. Stream processing allows us to record events in the real world so that we can take action or make predictions that will drive better business outcomes.
What is NumPy? Faster array and matrix math in Python
Learn how this popular Python library accelerates math at scale, especially when paired with tools like Cython and Numba.
How to run R in Visual Studio Code
If you’re an R programmer hoping to try GitHub Copilot, you’ll need to use Microsoft’s Visual Studio Code. Here’s how to set up and use VS Code for R.
What is Microsoft Fabric? A big tech stack for big data
Microsoft Fabric is an end-to-end suite of cloud-based tools for data analytics, encompassing data movement, data storage, data engineering, data integration, data science, real-time analytics, and business intelligence.
Why developers should put the database first
Data is the heart of the user experience, so shouldn’t developers start there? SQLite, NoSQL databases, and abstractions like Neurelo make that far easier to do.
IBM to acquire application modernization assets from Advanced
The assets acquired from Advanced will enhance IBM Consulting’s mainframe application and data modernization services, IBM said.
Get started with Anaconda Python
Anaconda provides a handy GUI, a slew of work environments, and tools to simplify the process of using Python for data science.
AI datasets need to get smaller—and better
Ever-larger datasets for AI training pose big challenges for data engineers and big risks for the models themselves.